MPEG Compression of 2-D Data Files

September 16th, 2002 by Roman Zaritski

Roman explains how he solved his portability and compression problems by converting his data files into MPEG video format.

In the course of my research I have to deal with dozens of gigabytes of data produced by various 2-D computer simulations. Typically, such data comes from a rectangular grid and is stored in a file in raw byte format as a sequence of "frames" for consecutive time steps. Each frame represents a 2-D array and can be visualized on a computer screen by showing its values as pixel intensities. The entire data file can then be viewed as a movie. There are two practical problems with movie files in raw data format.

First, I am not aware of any free or open-source players for this format. Commercial players are available for a limited number of platforms, but typically they have too many unnecessary features and are expensive. I wrote my own simple Mesa 3-D-based player for Linux, but it is slow, and sharing my data files with people who work on other platforms is a painful experience.

The second problem is the enormous size of my movie files. I have to compress the files with gzip every time I want to store them or send them to my colleagues over the Internet, and then I have to uncompress them every time I want to play them.

I solved both the portability and compression problems by converting my data files into MPEG video format. It took me a while to figure out how to do this with the free, open-source tools available for Linux, and now I am happy to share this with you.

Encoding with Mjpegtools

The first step is to install Mjpegtools, a package that contains an MPEG encoder. Although some versions are available at rpmfind.net, I downloaded the latest source code for mjpegtools-1.6.0 directly from mjpeg.sourceforge.net as a gzipped tarball (about 1MB). Then I did tar -xzvf mjpegtools-1.6.0.tar.gz; cd mjpegtools-1.6.0 on my Red Hat 6.2 Pentium box, but I should have read the prerequisites in the INSTALL file first. I ignored them initially and went on to ./configure and make, which resulted in a successful compilation but led to crashes at runtime. I had to go back, install the assembler package, nasm-0.98-2m, from rpmfind and then re-issue ./configure and make.

Mjpegtools cannot accept my raw data files directly, but it can accept a similar format, a PPM (Portable Pixel Map) stream. PPM is a still-image format that consists of a simple header followed by a sequence of raw bytes encoding the red, green and blue components of each pixel in the image. It is one of the simplest color image formats available (see the ppm man page or netpbm.sourceforge.net/doc/ppm.html for details). A PPM stream is obtained when the contents of several PPM image files are placed one after another without any special separators. I wrote a simple C++ program, sim2ppm, that converts my raw movie file into a PPM stream and sends it to the standard output. Sim2ppm adds an appropriate PPM header at the beginning of each frame and, based on a color map of my choice, expands each data byte into three color-component bytes.
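The source of sim2ppm is not reproduced in this article, so the following is only a minimal sketch of the same idea, written in Python rather than C++. It assumes that the raw file has no header, that it stores one byte per grid point, and that the frame width and height are known separately. The function names and the grayscale color map are hypothetical.

```python
def grayscale_colormap():
    """Hypothetical color map: data value v becomes the RGB triple (v, v, v)."""
    return [(v, v, v) for v in range(256)]


def frame_to_ppm(frame, width, height, colormap):
    """Return one binary (P6) PPM image built from a single raw frame."""
    if len(frame) != width * height:
        raise ValueError("frame size does not match width * height")
    # PPM header: magic number, dimensions, maximum color value.
    header = b"P6\n%d %d\n255\n" % (width, height)
    pixels = bytearray()
    for value in frame:  # iterating over bytes yields integers 0..255
        pixels.extend(colormap[value])  # one data byte -> three color bytes
    return header + bytes(pixels)


def raw_to_ppm_stream(src, dst, width, height, colormap):
    """Write every complete frame in src to dst as back-to-back PPM images.

    Returns the number of frames written. A trailing partial frame is dropped.
    """
    frame_size = width * height
    count = 0
    while True:
        frame = src.read(frame_size)
        if len(frame) < frame_size:
            break
        dst.write(frame_to_ppm(frame, width, height, colormap))
        count += 1
    return count
```

To use the sketch as a filter, open the data file in binary mode and pass sys.stdout.buffer as dst, so that the resulting PPM stream can be piped into the next tool.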

Assuming that my original 2-D simulation data file movie.sim and my sim2ppm binary are in the current directory mjpegtools-1.6.0/, here is how I convert it to the MPEG format:

   ./sim2ppm movie.sim | lavtools/ppmtoy4m |
   mpeg2enc/mpeg2enc -F 2 -q 10 -a 1 -o movie.mpg

The ppmtoy4m filter converts the PPM stream into a YUV (YUV4MPEG) stream, the input format required by mpeg2enc, the actual MPEG encoder. In brief, -F selects the frame-rate code, -q the quantization (lower values give higher quality and larger files), -a the aspect-ratio code and -o the output file; full explanations are in the mpeg2enc man page that comes with the distribution. This command produces a compressed MPEG file, movie.mpg.
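To give a feel for what the RGB-to-YUV step involves, here is a minimal sketch of a per-pixel color conversion. The article does not say which conversion matrix or chroma subsampling ppmtoy4m applies, so the sketch assumes the standard ITU-R BT.601 studio-range formulas; the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to 8-bit studio-range Y'CbCr (assumed: ITU-R BT.601).

    Luma Y' lands in 16..235 and the chroma components Cb, Cr in 16..240.
    """
    rn, gn, bn = r / 255.0, g / 255.0, b / 255.0  # normalize to 0..1
    y = 16.0 + 65.481 * rn + 128.553 * gn + 24.966 * bn
    cb = 128.0 - 37.797 * rn - 74.203 * gn + 112.000 * bn
    cr = 128.0 + 112.000 * rn - 93.786 * gn - 18.214 * bn
    return round(y), round(cb), round(cr)
```

With a grayscale color map the three RGB components are equal, so both chroma values stay at 128 and only the luma channel carries the data.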

Playing It Back

The resulting file can be played with any MPEG player. Depending on the degree of compression and the type of data, I usually get good-quality movies that are more than 200 times smaller than their raw originals; that is, a gigabyte of raw data may shrink to only a few megabytes. Such an MPEG file can be placed on the Web for quick downloads. It can be played directly from most browsers on most operating systems.
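The size arithmetic can be checked with a few lines of Python. This is only an illustration: the grid dimensions and frame count are hypothetical, one byte per grid point is assumed, and the 200:1 ratio is the typical figure quoted above, not a guarantee.

```python
def raw_movie_bytes(width, height, frames):
    """Size in bytes of a raw movie that stores one byte per grid point."""
    return width * height * frames


def compressed_estimate(raw_bytes, ratio=200):
    """Estimated compressed size in bytes for an assumed compression ratio."""
    return raw_bytes / ratio


# Hypothetical simulation: 512 x 512 grid, 4096 time steps.
raw = raw_movie_bytes(512, 512, 4096)
print("raw size (GiB):", raw / 2**30)
print("estimated MPEG size (MiB):", round(compressed_estimate(raw) / 2**20, 2))
```

For this hypothetical movie the raw file is exactly one gibibyte, and at 200:1 the estimate comes to about five mebibytes.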

There are two popular MPEG players freely available for Linux: mpeg_play and xanim. They may already be included in your Linux distribution or can be found at rpmfind.net. Xanim can play a variety of video formats, but it skips certain MPEG frames (it shows only I frames). This is why I use mpeg_play to view my movie files. For example, typing mpeg_play -framerate 5 movie.mpg & will play our movie at a rate of five frames per second.

Resources

RPMfind, a large on-line database of Linux packages.

Mjpegtools

PPM man page

Mpeg_play

Xanim

Roman Zaritski is an Assistant Professor of Computer Science at Montclair State University in New Jersey. His interests include numerical modeling and cluster computing.
