MPEG Compression of 2-D Data Files
September 16th, 2002 by Roman Zaritski
In the course of my research I have to deal with dozens of gigabytes of data produced by various 2-D computer simulations. Typically, such data comes from a rectangular grid and is stored in a file in raw byte format as a sequence of "frames" for consecutive time steps. Each frame represents a 2-D array and can be visualized on a computer screen by showing its values as pixel intensities. The entire data file can then be viewed as a movie. There are two practical problems with movie files in raw data format.
First, I am not aware of any free or open-source players for this format. Commercial players are available for a limited number of platforms, but typically they have too many unnecessary features and are expensive. I wrote my own simple Mesa 3-D-based player for Linux, but it is slow, and sharing my data files with people who work on other platforms is a painful experience.
The second problem is the enormous size of my movie files. I have to compress the files with gzip every time I want to store them or send them to my colleagues over the Internet, and then I have to uncompress them every time I want to play them.
I solved both the portability and compression problems by converting my data files into MPEG video format. It took me a while to figure out how to do this with the free, open-source tools available for Linux, and now I am happy to share this with you.
The first step is to install Mjpegtools, a package that contains an MPEG encoder. Although some versions are available at rpmfind.net, I downloaded the latest source code for mjpegtools-1.6.0 directly from mjpeg.sourceforge.net as a gzipped tarball (about 1MB). Then I did tar -xzvf mjpegtools-1.6.0.tar.gz; cd mjpegtools-1.6.0 on my Red Hat 6.2 Pentium box, but I should have read the prerequisites in the INSTALL file first. I ignored them initially and went on to ./configure and make, which resulted in a successful compilation but led to crashes at runtime. I had to go back, install the assembler package, nasm-0.98-2m, from rpmfind and then re-issue ./configure and make.
Mjpegtools cannot accept my raw data files directly, but it can accept a similar format, a PPM (Portable Pixmap) stream. PPM is a still-image format that consists of a simple header followed by a sequence of raw bytes encoding the red, green and blue components of each pixel in the image. It is one of the simplest color image formats available (see the ppm man page or netpbm.sourceforge.net/doc/ppm.html for details). A PPM stream is obtained when the contents of several PPM image files are placed one after another without any special separators. I wrote a simple C++ program, sim2ppm, that converts my raw movie file into a PPM stream and sends it to the standard output. Sim2ppm adds an appropriate PPM header at the beginning of each frame and, based on a color map of my choice, converts each data byte into three color-component bytes.
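The article does not list the source of sim2ppm, so the following is only a minimal sketch of the conversion it describes, not the author's program. It assumes one byte per grid point, frames stored back to back with no file header, and a plain grayscale color map; the names make_gray_map, frame_to_ppm and raw_to_ppm_stream are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <string>
#include <vector>

// One RGB triple for every possible data byte value (0..255).
using Rgb = std::array<std::uint8_t, 3>;
using ColorMap = std::array<Rgb, 256>;

// Hypothetical color map: grayscale, so data value v becomes (v, v, v).
ColorMap make_gray_map() {
    ColorMap map{};
    for (int v = 0; v < 256; ++v) {
        const std::uint8_t b = static_cast<std::uint8_t>(v);
        map[static_cast<std::size_t>(v)] = {b, b, b};
    }
    return map;
}

// Encodes one frame as a binary PPM image: the text header
// "P6\n<width> <height>\n255\n" followed by three bytes per pixel.
std::string frame_to_ppm(const std::vector<std::uint8_t>& frame,
                         int width, int height, const ColorMap& map) {
    std::string ppm = "P6\n" + std::to_string(width) + " " +
                      std::to_string(height) + "\n255\n";
    ppm.reserve(ppm.size() + frame.size() * 3);
    for (std::uint8_t v : frame) {
        const Rgb& c = map[v];
        ppm.push_back(static_cast<char>(c[0]));  // red
        ppm.push_back(static_cast<char>(c[1]));  // green
        ppm.push_back(static_cast<char>(c[2]));  // blue
    }
    return ppm;
}

// Reads width*height bytes per frame from `in` and writes one PPM image per
// complete frame to `out`, with no separators between images.
// Returns the number of frames written; a trailing partial frame is dropped.
std::size_t raw_to_ppm_stream(std::istream& in, std::ostream& out,
                              int width, int height, const ColorMap& map) {
    if (width <= 0 || height <= 0) return 0;
    std::vector<std::uint8_t> frame(static_cast<std::size_t>(width) *
                                    static_cast<std::size_t>(height));
    std::size_t frames = 0;
    while (in.read(reinterpret_cast<char*>(frame.data()),
                   static_cast<std::streamsize>(frame.size()))) {
        const std::string ppm = frame_to_ppm(frame, width, height, map);
        out.write(ppm.data(), static_cast<std::streamsize>(ppm.size()));
        ++frames;
    }
    return frames;
}
```

A main function built on this sketch would open the data file in binary mode and pass std::cout as the output stream. Because a raw file of this kind carries no header, the grid width and height would have to be supplied separately, for example on the command line.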
Assuming that my original 2-D simulation data file, movie.sim, and my sim2ppm binary are both in the current directory, mjpegtools-1.6.0/, here is how I convert the data file to MPEG format:
./sim2ppm movie.sim | lavtools/ppmtoy4m | mpeg2enc/mpeg2enc -F 2 -q 10 -a 1 -o movie.mpg
The ppmtoy4m filter converts a PPM stream into a YUV stream, which is the input format required by mpeg2enc, the actual MPEG encoder. The command-line options are explained in the mpeg2enc man page that comes with the distribution. This command produces a compressed MPEG file, movie.mpg.
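To give an idea of what converting RGB pixels to YUV involves, here is a small sketch of a per-pixel conversion using the full-range ITU-R BT.601 (JFIF) coefficients. It is an illustration only: the function names are hypothetical, and the exact scaling, value range and chroma subsampling that ppmtoy4m applies may differ.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>

// Rounds to the nearest integer and clamps the result to the 0..255 range.
std::uint8_t to_byte(double x) {
    return static_cast<std::uint8_t>(
        std::max(0L, std::min(255L, std::lround(x))));
}

// Converts one RGB pixel to luma (Y) and two chroma components (Cb, Cr)
// with the full-range BT.601 coefficients used by JFIF.
std::array<std::uint8_t, 3> rgb_to_ycbcr(std::uint8_t r, std::uint8_t g,
                                         std::uint8_t b) {
    const double y  = 0.299 * r + 0.587 * g + 0.114 * b;
    const double cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b;
    const double cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b;
    return {to_byte(y), to_byte(cb), to_byte(cr)};
}
```

For grayscale data, where the three color components of a pixel are equal, all of the information ends up in the Y component and both chroma components sit at the neutral value 128.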
The resulting file can be played with any MPEG player. Depending on the degree of compression and the type of data, I usually get good-quality movies that are more than 200 times smaller than their raw originals; a gigabyte of raw data may shrink to only a few megabytes. Such an MPEG file can be placed on the Web for quick downloads, and it can be played directly from most browsers on most operating systems.
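The size arithmetic can be checked with a short calculation. The sketch below assumes one byte per grid point; the helper names and the 512x512 grid with 4,000 frames are hypothetical values chosen only to give roughly a gigabyte of raw data.

```cpp
#include <cstdint>

// Size of a raw movie in bytes, assuming one byte per grid point.
std::uint64_t raw_movie_bytes(std::uint64_t width, std::uint64_t height,
                              std::uint64_t frames) {
    return width * height * frames;
}

// Estimated compressed size in megabytes (10^6 bytes) for a given
// compression ratio, such as the factor of 200 mentioned above.
double compressed_megabytes(std::uint64_t raw_bytes, double ratio) {
    return static_cast<double>(raw_bytes) / ratio / 1.0e6;
}
```

With these assumed numbers, raw_movie_bytes(512, 512, 4000) gives 1,048,576,000 bytes, and a 200-fold reduction brings that down to a little over 5 megabytes.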
There are two popular MPEG players freely available for Linux: mpeg_play and xanim. They may already be included in your Linux distribution or can be found at rpmfind.net. Xanim can play a variety of video formats, but it skips certain MPEG frames (it shows only I-frames). This is why I use mpeg_play to view my movie files. For example, typing mpeg_play -framerate 5 movie.mpg & will play our movie at a rate of five frames per second.
RPMfind, a large on-line database of Linux packages.
Roman Zaritski is an Assistant Professor of Computer Science at Montclair State University in New Jersey. His interests include numerical modeling and cluster computing.