Advanced Video Coding on Linux
It is nice that QuickTime 7 supports H.264-encoded video. Apple itself encodes all of its online movie trailers using H.264. Although this is good, and fosters the adoption of this codec, the QuickTime implementation has some limitations, most notably with B-Frames and Profile support. We need a short detour to explain what this means for our encoding project.
The MPEG standard for H.264 includes a number of profiles, such as Baseline, Main, Extended and High. These profiles delineate different technical capabilities that a decoder may need to possess. As its name suggests, the Baseline profile is the simplest and least demanding, whereas Main, Extended and High require more processing power and the interpretation of more technical features in order to decode properly. QuickTime 7 supports the Baseline profile and parts of the Main profile; however, it chokes on features of the Extended and High profiles.
B-Frames (bidirectionally predicted frames) are one of the frame types used to store compressed digital video. To reconstruct a B-Frame, the decoder references information from other, previously decoded frames, which may come before or after the B-Frame in display order. B-Frames are interleaved amongst other frame types known as I-Frames and P-Frames. It's a technical detail, but the QuickTime 7 H.264 decoder can support only up to two B-Frames in a row, no more, which is why we pass --bframes 2 to x264. This is unfortunate, because using more B-Frames would let us increase quality under some circumstances.
To remain QuickTime-compatible, we need to keep these limitations in mind. However, the quality of our low-bitrate encoding will not suffer much, even with these limitations, and we can enable a few additional options to improve things quite a bit. The first is the subpixel motion estimation (--subme) setting, which controls the precision of the motion estimation calculations used by x264 during the encoding process. By increasing this to 6, the maximum, we gain a lot of visual quality at the cost of some additional encoding time, but it is worth it. We also can configure which partition types x264 analyzes when performing motion estimation (--analyse), which leads to higher-quality encodes. Note that some types of analysis, such as those relying on the 8x8 DCT, are for High profile encodings only and are not supported by QuickTime, so we avoid those settings. We also can disable PSNR calculations (--no-psnr) to buy back a little speed during the encode. PSNR is simply a quality measurement and has no effect on the actual encoding quality.
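The two QuickTime constraints described above can be turned into a quick sanity check before starting a long encode. This is only a sketch: check_qt7_settings is a hypothetical helper, not part of x264, and it encodes nothing beyond what this article states (at most two B-Frames, no 8x8 DCT). It treats the i8x8 partition type as a sign of a High profile encode, on the assumption that x264 only uses i8x8 together with --8x8dct.

```shell
# Hypothetical helper (not part of x264): reject settings that, according to
# this article, QuickTime 7 cannot decode.
# Usage: check_qt7_settings BFRAMES ANALYSE_LIST
check_qt7_settings() {
    bframes=$1
    analyse=$2

    # The article states that QuickTime 7 handles at most two B-Frames.
    if [ "$bframes" -gt 2 ]; then
        echo "bframes=$bframes exceeds the QuickTime 7 limit of 2" >&2
        return 1
    fi

    # Assumption: i8x8 partitions imply the 8x8 DCT, a High profile feature.
    case ",$analyse," in
        *,i8x8,*)
            echo "analyse list '$analyse' implies the 8x8 DCT (High profile)" >&2
            return 1
            ;;
    esac
    return 0
}

# The settings used in this article pass the check.
check_qt7_settings 2 p8x8,b8x8,i4x4,p4x4 && echo "settings look QuickTime-compatible"
```

The check is deliberately strict and string-based; it does not inspect the encoded file, so it is a guard against typos in a script rather than proof of compatibility.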
Putting all this together, we can now output a high-quality, low-bitrate, QuickTime-compatible and standards-compliant video encoding using H.264:
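mkfifo tmp.fifo.yuv
# Decode max.dv to raw I420 frames and write them into the FIFO in the
# background; both stdout and stderr of mencoder are discarded.
mencoder -vf format=i420 -nosound -ovc raw -of rawvideo \
  -ofps 23.976 -o tmp.fifo.yuv max.dv > /dev/null 2>&1 &
# Read the raw frames from the FIFO and encode them with x264.
x264 -o max-video.mp4 --fps 23.976 --bframes 2 \
  --progress --crf 26 --subme 6 \
  --analyse p8x8,b8x8,i4x4,p4x4 --no-psnr tmp.fifo.yuv 720x480
rm tmp.fifo.yuv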
We can make further improvements. Because this video file is destined for the Web, we most likely want to reduce the frame size to something more friendly, possibly crop out unwanted areas, and make other adjustments. For example, to reduce the frame size, run the following commands:
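mkfifo tmp.fifo.yuv
# Same pipeline as before, but mencoder scales each frame to 480x320 first;
# both stdout and stderr of mencoder are discarded.
mencoder -vf scale=480:320,format=i420 -nosound -ovc raw -of rawvideo \
  -ofps 23.976 -o tmp.fifo.yuv max.dv > /dev/null 2>&1 &
# The frame size given to x264 must match the scaled size.
x264 -o max-video.mp4 --fps 23.976 --bframes 2 \
  --progress --crf 26 --subme 6 \
  --analyse p8x8,b8x8,i4x4,p4x4 --no-psnr tmp.fifo.yuv 480x320
rm tmp.fifo.yuv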
Here we instruct mencoder to scale the output to 480x320 and also tell x264 to accept that frame size. This will further reduce the file size, which is appropriate for video on the Web.
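When choosing a different target width, the matching height has to be worked out so that the same value can be given to both mencoder and x264. The sketch below shows one way to do that arithmetic in the shell. scaled_height is a hypothetical helper, and rounding to a multiple of 16 (the H.264 macroblock size) is a common convention assumed here, not something this article requires. It works on the stored frame dimensions only and ignores DV's non-square pixels.

```shell
# Hypothetical helper: height that keeps the source frame's proportions at a
# new width, rounded to the nearest multiple of 16.
# Usage: scaled_height SRC_WIDTH SRC_HEIGHT DST_WIDTH
scaled_height() {
    # Integer form of round(src_h * dst_w / src_w / 16) * 16.
    echo $(( ($2 * $3 + $1 * 8) / ($1 * 16) * 16 ))
}

# 720x480 source scaled to a width of 480, as in the commands above.
scaled_height 720 480 480   # prints 320
```

The result can then be used in both places, for example as scale=480:320 for mencoder and 480x320 for x264.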
Based on the QuickTime format, the .mp4 container format can store many types of media and is also the MPEG standard for storing H.264 video and AAC audio, which is how we will be using it. Use MP4Box, which is part of the gpac project, to combine the audio and video streams we've just created:
MP4Box -add max-video.mp4 -add audiodump.aac \
  -fps 23.976 max-x264.mp4
This produces the final output file max-x264.mp4. You can play back the file with MPlayer, or with Apple's QuickTime player on a non-Linux OS. You also can embed this file into a Web page for playback from a browser by using Apple's instructions for embedding QuickTime movies (see Resources). Free software tools such as the mplayer-plugin can be used to play this file from within Firefox on Linux.
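If the muxing step is part of a script, it helps to confirm that both streams exist before calling MP4Box, because an earlier encode may have failed silently. The following is a minimal sketch under that assumption; mux_av is a hypothetical wrapper, and the MP4Box flags are exactly the ones used in this article.

```shell
# Hypothetical wrapper around the MP4Box call used in this article.
# Usage: mux_av VIDEO AUDIO OUTPUT
mux_av() {
    video=$1
    audio=$2
    out=$3

    # Refuse to run if either input stream is missing or empty.
    for f in "$video" "$audio"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            return 1
        fi
    done

    # Same flags as in the article; only the file names are parameters.
    MP4Box -add "$video" -add "$audio" -fps 23.976 "$out"
}
```

With the files from this article, the call would be mux_av max-video.mp4 audiodump.aac max-x264.mp4.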
By way of comparison, here are the file sizes and bitrates of the original raw DV file max.dv, our H.264-encoded file max-x264.mp4 and a corresponding XviD encoding max-xvid.avi, which was created from the same source video (see Resources):
mencoder max.dv -vf scale=480:320 -ovc xvid \
  -xvidencopts fixed_quant=7:qpel:nopacked -oac mp3lame \
  -ofps 24000/1001 -o max-xvid.avi
Table 1. Table of Results
| File | File Size | Video Bitrate |
And here are accompanying screenshots for each sample.
As you can see, the visual quality of the H.264-encoded file is just as high as the XviD version, arguably higher, but at a lower bitrate and file size. This shows that you can achieve similar results in less space, or much better results in the same space, with H.264 compared to other codecs such as XviD. In addition, the work flow and options for encoding with x264 are similar to XviD, but with greatly improved output. So, if you are used to encoding with XviD, many of the concepts and options should be familiar to you when working with x264.
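To make a similar comparison for your own encodes, a rough average bitrate can be computed from a file's size and its running time. The sketch below uses a hypothetical helper, avg_kbps. The duration must be supplied by you, since this article does not state the length of the clip, and the result is the overall bitrate of the file (video, audio and container overhead together), not the video bitrate alone.

```shell
# Hypothetical helper: overall average bitrate in kilobits per second.
# Usage: avg_kbps SIZE_IN_BYTES DURATION_IN_SECONDS
avg_kbps() {
    # bytes * 8 = bits; divide by seconds, then by 1000 for kbit/s.
    echo $(( $1 * 8 / $2 / 1000 ))
}

# Example with made-up numbers: a 3,000,000-byte file lasting 60 seconds.
avg_kbps 3000000 60   # prints 400
```

On a system with GNU coreutils, the file size can be obtained with stat -c %s max-x264.mp4 and passed as the first argument.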
The more you experiment with x264, the more you will discover the amazing savings in bitrates and file sizes while still maintaining an extremely high visual quality. The world of video encoding is definitely a black art, as there are hundreds of variables and options that can be brought to bear in any particular encoding project. There is no one-size-fits-all method of video encoding. However, the technical superiority of H.264 over XviD or regular MPEG-2-encoded video is too great not to take advantage of it. And, you can start taking advantage of it today, using the tools described above. Because H.264 is an MPEG-standard encoding, used with an MPEG-standard audio codec inside of an MPEG-standard container format, all the work you invest in using these tools to encode your video will be future-proof as well as high-quality. Use the techniques outlined above as a starting point for your own H.264-encoding projects, and you'll discover why H.264 is becoming the next standard for video encoding.
Resources for this article: /article/9197.
Dave Berton is a professional programmer. He can be contacted at email@example.com.