DVD Transcoding with Linux Metacomputing
DVD is based on a subset of standards ISO/IEC 11172 (MPEG-1) and ISO/IEC 13818 (MPEG-2). A DVD movie is divided into three parts: video objects (VOBs) files with a maximum size of 1GB each, multiplexing video and audio sources.
Three types of MPEG-2 frames exist: I (Intra), P (Predictive) and B (Bidirectionally-predictive). I frames represent full images, while P and B frames encode differences between previous and/or future frames. In principle, it seems obvious that video stream cuts must be located at the beginning of I frames. This is almost right, but not quite. Some parameters, such as frame rate and size, must be taken into account. This information is part of the Sequence Header. For this reason, packets chosen as cut points must have a Sequence Header. Fortunately, there is a Sequence Header before every I frame.
Another important issue is frame reordering due to the existence of P and B frames. After an I frame, B frames may follow that depend on P frames that came prior to the I frame. If the video stream is partitioned at the start of that I frame, it is not possible to maintain video transcoding consistency. The solution consists of assigning the late B frames to the previous chunk. As a consequence, a little extra complexity is added to video preprocessing.
Obviously, it is not interesting to fragment video to the maximum extent, because the size of the chunks would be too small. Typically, about 300KB exist between two consecutive I frames, although this length depends on several parameters, such as bit rate or image size.
We considered two basic load balancing strategies for our project. In the first, called Small-Chunks, the DVD movie is divided into small chunks of a fixed size. Condor assigns a chunk to every available computer. When a computer finishes transcoding one chunk, it requests another one. This process is repeated until there are no more chunks left on the server. In the other strategy, called Master-Worker, load balancing depends on the shares, which are determined by the master processor. Obviously, the other computers involved are the workers. This strategy often is used for high-throughput computations. For this project, chunk size for each particular computer is assigned according to a training stage, as explained in the next section.
It should be understood that we deliberately do not consider the possibility of machine failures or user interference. If those events take place, the performance of the simple Master-Worker implementation in this project would drop. Nevertheless, our two approaches are illustrative because they are extreme cases, pure Master-Worker on one hand and the high granularity of Small-Chunks on the other. Fault/interference-tolerant Master-Worker strategies lie in the middle. Our aim is to evaluate whether the behavior of our application is similar in the two extremes, in terms of processing time and transcoded file size. As the results described in this article suggest, Small-Chunks may be more advantageous due to its simplicity (it does not need a training stage) and because it adapts naturally to Condor's management of machine unavailability.
To provide information to the Master-Worker coordinator, it is necessary to evaluate all computers beforehand. Evaluation is performed in a training stage, which estimates the transcoding rate of each computer in frames per second. The training stage of our prototype consists of transcoding a variety of small video sequences in the target computer set and estimating the average frames per second delivered by each computer. This result then is used to set the sizes of the data chunks, which are proportional to the estimated performance of each computer. Ideally, this approach minimizes DVD transcoding time, because all computers should finish their jobs at the same time.
Our testbed emulated a typical heterogeneous computing environment, including machines at the end of their usage lives. It was composed of five computers (see Table 1), classified in three groups according to their processing capabilities. Two machines were in the first group (gigabyte and kilobyte), a single computer was in the second group (nazgul) and two machines with the worst performance (titan and brio) were in the third group.
Table 1. Test Bed Computers
|gigabyte||Intel Pentium 4||1,700||256 DDR||528,205||1,388|
|kilobyte||Intel Pentium 4||1,700||256 DDR||624,242||1,355|
|titan||Intel Pentium II||350||320||67,987||398|
|brio||Intel Pentium II||350||192||72,281||398|
In addition, all computers were linked to a 100Mbps Ethernet network, and the operating system used in all computers was Red Hat Linux 8.0. All computers shared the same user space, defined by an NIS server, and the same filesystem (NFS server in gigabyte). Finally, we installed Condor v. 6.4.7, and gigabyte was the central manager. Condor was configured to keep all jobs in their respective processors regardless of user activity. Thus, the timings in this section are best-case results, as mentioned above.
The DVD-to-DivX parallel transcoder was implemented with the following libraries:
libmpeg2 0.3.1: DVD MPEG-2 stream demultiplexing and decoding.
liba52 0.7.5-cvs: DVD AC3 audio decoding.
DivX 5.0.1alpha: MPEG-4 video encoding.
lame 3.93.1: MP3 audio encoding.
|Speed Up Your Web Site with Varnish||Jun 19, 2013|
|Non-Linux FOSS: libnotify, OS X Style||Jun 18, 2013|
|Containers—Not Virtual Machines—Are the Future Cloud||Jun 17, 2013|
|Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer||Jun 12, 2013|
|Weechat, Irssi's Little Brother||Jun 11, 2013|
|One Tail Just Isn't Enough||Jun 07, 2013|
- Speed Up Your Web Site with Varnish
- Containers—Not Virtual Machines—Are the Future Cloud
- Linux Systems Administrator
- Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer
- Senior Perl Developer
- Technical Support Rep
- Non-Linux FOSS: libnotify, OS X Style
- UX Designer
- Android's Limits
- Reachli - Amplifying your
37 min 31 sec ago
1 hour 26 min ago
- good point!
1 hour 29 min ago
- Varnish works!
1 hour 38 min ago
- Reply to comment | Linux Journal
2 hours 7 min ago
- Reply to comment | Linux Journal
4 hours 33 min ago
- Reply to comment | Linux Journal
8 hours 33 min ago
- Yeah, user namespaces are
9 hours 49 min ago
- Cari Uang
13 hours 21 min ago
- user namespaces
16 hours 14 min ago
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?