Image Processing with QccPack and Python
Limited bandwidth and storage space are always a challenge. Data compression is often the best solution. When it comes to image processing, compression techniques are divided into two types: lossless and lossy data compression.
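The difference between the two types can be seen in a few lines of plain Python 3. This is only a sketch: the pixel values are made up, zlib stands in for a lossless coder, and dropping the low four bits of each pixel stands in for a lossy one.

```python
import zlib

# Hypothetical sample: a run of 8-bit grayscale pixel values.
pixels = bytes([12, 13, 13, 14, 200, 201, 203, 203] * 32)

# Lossless: decompressing returns the original bytes exactly.
packed = zlib.compress(pixels)
restored = zlib.decompress(packed)

# Lossy: keep only the top 4 bits of each pixel (coarse quantization).
# The discarded low bits cannot be recovered afterward.
quantized = bytes(p & 0xF0 for p in pixels)

print("lossless round trip identical:", restored == pixels)
print("quantized data identical:", quantized == pixels)
print("sizes:", len(pixels), "->", len(packed))
```

Lossy methods accept this kind of irreversible error in exchange for a smaller representation, which is why the quality of the result has to be measured afterward.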
QccPack, developed by James Fowler, is an open-source collection of library routines and utility programs for quantization and reliable implementation of common compression techniques.
The QccPack libraries have a clean interface, so they can be upgraded without modifying application code. QccPack provides a statically linked library, libQccPack.a, and also supports dynamic linking with libQccPack.so.
Entropy coding, wavelet transforms, wavelet-based sub-band coding, error coding, image processing and implementations of general routines can be done through the library routines available with QccPack. Optional modules are available for the QccPack library that you can add later. QccPackSPIHT is one optional module for the QccPack library that provides an implementation of the Set Partitioning in Hierarchical Trees (SPIHT) algorithm for image compression. The QccPackSPIHT module includes two utility executables, spihtencode and spihtdecode, to perform SPIHT encoding and decoding for grayscale images.
QccPack and QccPackSPIHT are available for download from the QccPack Web page on SourceForge. Red Hat users can find source and binary RPMs at that Web site. Users of other systems will need to compile the source code. QccPack has been compiled successfully on Solaris/SPARC, Irix, HP-UX, Digital UNIX Alpha and Digital RISC/Ultrix.
You can use QccPack to train a VQ codebook on an image and then to code the image with full-search VQ followed by arithmetic coding. Take a 512x512 grayscale Lenna image, for example. The following sample procedure uses the QccPack command-line utilities, so the commands are run from a shell rather than from the Python interpreter; the >>> at the start of each line stands for the prompt.
Step 1: convert from the PGM image file format to the DAT format file by extracting four-dimensional (2x2) vectors of pixels:
>>> imgtodat -ts 4 lenna.pgm.gz lenna.4D.dat.gz
Step 2: train a 256-codeword VQ codebook on the DAT file with GLA (stopping threshold = 0.01):
>>> gla -s 256 -t 0.01 lenna.4D.dat.gz lenna.4D256.cbk
Step 3: vector quantize the DAT file to produce a channel of VQ indices:
>>> vqencode lenna.4D.dat.gz lenna.4D256.cbk lenna.vq.4D256.chn
Step 4: calculate first-order entropy of VQ indices (as bits/pixel):
>>> chnentropy -d 4 lenna.vq.4D256.chn
First-order entropy of channel lenna.vq.4D256.chn is: 1.852505 (bits/symbol)
Step 5: arithmetic-encode channel of VQ indices:
>>> chnarithmeticencode -d 4 lenna.vq.4D256.chn lenna.vq.4D256.chn.ac
Channel lenna.vq.4D256.chn arithmetic coded to: 1.830322 (bits/symbol)
>>> rm lenna.vq.4D256.chn
Step 6: decode arithmetic-coded channel:
>>> chnarithmeticdecode lenna.vq.4D256.chn.ac lenna.vq.4D256.chn
Step 7: inverse VQ channel to produce quantized data:
>>> vqdecode lenna.vq.4D256.chn lenna.4D256.cbk lenna.vq.4D256.dat.gz
Step 8: convert from DAT to PGM format:
>>> dattoimg 512 512 lenna.vq.4D256.dat.gz lenna.vq.4D256.pgm
Step 9: calculate distortion between original and coded images:
>>> imgdist lenna.pgm.gz lenna.vq.4D256.pgm
The distortion between files lenna.pgm.gz and lenna.vq.4D256.pgm is:
22.186606 dB (SNR)
36.719100 dB (PSNR)
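To make the procedure above more concrete, here is a rough plain-Python 3 sketch of the same ideas: block extraction, generalized Lloyd (GLA) training, full-search encoding, first-order entropy and PSNR. It is not QccPack's implementation. All function names are hypothetical, the tiny synthetic gradient stands in for the Lenna image, and the arithmetic-coding steps are left out.

```python
import math
import random
from collections import Counter

def extract_blocks(image, n=2):
    """Split a 2-D list of pixels into flattened n x n blocks (vectors)."""
    h, w = len(image), len(image[0])
    return [[image[r + i][c + j] for i in range(n) for j in range(n)]
            for r in range(0, h, n) for c in range(0, w, n)]

def sq_error(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, codebook):
    """Full-search VQ: index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda k: sq_error(vec, codebook[k]))

def train_gla(vectors, size, threshold=0.01, seed=0):
    """Generalized Lloyd algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    previous = float("inf")
    while True:
        indices = [nearest(v, codebook) for v in vectors]
        distortion = sum(sq_error(v, codebook[k])
                         for v, k in zip(vectors, indices)) / len(vectors)
        # Stop when the relative improvement falls below the threshold.
        if distortion == 0 or (previous - distortion) / distortion < threshold:
            return codebook
        previous = distortion
        for k in range(size):
            members = [v for v, i in zip(vectors, indices) if i == k]
            if members:  # codewords with no members are left unchanged
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]

def entropy(indices):
    """First-order entropy of the index stream, in bits per symbol."""
    total = len(indices)
    return -sum(n / total * math.log2(n / total)
                for n in Counter(indices).values())

def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

def flatten(blocks):
    return [x for block in blocks for x in block]

# Hypothetical 16x16 test image: a smooth gradient, not the real Lenna data.
image = [[8 * r + 8 * c for c in range(16)] for r in range(16)]
vectors = extract_blocks(image)                     # 2x2 blocks -> 4-D vectors
codebook = train_gla(vectors, size=8)               # train an 8-codeword codebook
indices = [nearest(v, codebook) for v in vectors]   # vector quantize
decoded = [codebook[k] for k in indices]            # inverse VQ
print("entropy: %.3f bits/symbol" % entropy(indices))
print("PSNR: %.2f dB" % psnr(flatten(vectors), flatten(decoded)))
```

The entropy figure is a lower bound on what a first-order entropy coder can achieve for the index stream. That is why the arithmetic-coded rate reported in step 5 lands close to the entropy reported in step 4.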
The Python Imaging Library adds image processing capabilities to the Python interpreter. This library provides extensive file format support, an efficient internal representation and fairly powerful image processing capabilities. The core image library is designed for fast access to data stored in a few basic pixel formats. The library contains some basic image processing functionality, including point operations, filtering with a set of built-in convolution kernels and color space conversions. The Python Imaging Library is ideal for image archival and batch processing applications. You can use the library to create thumbnails, convert between file formats and print images. The library also supports image resizing, rotation and arbitrary affine transforms.
The Python Imaging Library uses a plugin model that allows you to add your own decoders to the library, without any changes to the library itself. These plugins have names such as XxxImagePlugin.py, where Xxx is a unique format name (usually an abbreviation).
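The idea behind such a plugin model can be sketched in plain Python 3 as a registry of format checks that grows without changes to the code that uses it. The names below (DECODERS, register, identify) are hypothetical and are not the library's API; the byte signatures are the well-known ones for each format.

```python
# Hypothetical registry: format name -> function that inspects the first bytes.
DECODERS = {}

def register(name, accept):
    """Add a format check without touching the identification code below."""
    DECODERS[name] = accept

def identify(header):
    """Return the first registered format whose check accepts the header."""
    for name, accept in DECODERS.items():
        if accept(header):
            return name
    return None

# Each "plugin" contributes one check based on the file's leading bytes.
register("PNG", lambda h: h.startswith(b"\x89PNG\r\n\x1a\n"))
register("JPEG", lambda h: h.startswith(b"\xff\xd8"))
register("GIF", lambda h: h[:6] in (b"GIF87a", b"GIF89a"))
# Binary PGM ("P5") and PPM ("P6") files share a two-byte magic number style.
register("PPM", lambda h: h[:2] in (b"P5", b"P6"))

print(identify(b"P5\n512 512\n255\n"))
print(identify(b"not an image"))
```

Supporting a new format means one more register call; identify itself never changes.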
Python, xv and the PIL package are essential for Python image processing programming. Run these commands to build PIL in Linux:
python setup.py build_ext -i
python selftest.py
The most important class in the Python Imaging Library is the Image class, defined in the module with the same name. We create instances of this class in several ways: by loading images from files, processing other images or creating images from scratch.
To load an image from a file, use the open function in the Image module:
>>> import Image
>>> im = Image.open("lenna.ppm")
The Python Imaging Library supports a wide variety of image file formats. The library automatically determines the format based on the contents of the file or the extension.
Listing 1. Convert Files to JPEG
import os, sys
import Image

for infile in sys.argv[1:]:
    # splitext returns (root, extension); keep only the root
    outfile = os.path.splitext(infile)[0] + ".jpg"
    if infile != outfile:
        try:
            Image.open(infile).save(outfile)
        except IOError:
            print "cannot convert", infile
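The output name in Listing 1 depends on os.path.splitext, which returns a (root, extension) pair rather than a single string. The following Python 3 sketch isolates that step; jpeg_name is a hypothetical helper and the filenames are made up.

```python
import os

def jpeg_name(infile):
    """Return infile with its extension replaced by .jpg."""
    root, _ext = os.path.splitext(infile)   # e.g. ("lenna", ".ppm")
    return root + ".jpg"

for name in ["lenna.ppm", "scans/page1.tif", "photo.jpg"]:
    out = jpeg_name(name)
    # A file that is already named .jpg would be written onto itself, so skip it.
    print(name, "->", out if out != name else "(skipped)")
```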
The next example (Listing 2) shows how the Image class contains methods to resize and rotate an image.
Listing 2. Simple Geometry Transforms
out = im.resize((128, 128))
out = im.rotate(45)
out = im.transpose(Image.ROTATE_90)
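To show what two of these operations do to the pixel grid, here is a plain Python 3 sketch on nested lists. It is only an illustration under simple assumptions (nearest-neighbor sampling, a counter-clockwise quarter turn); the function names are hypothetical, and the library's own resampling may differ.

```python
def resize_nearest(pixels, new_w, new_h):
    """Nearest-neighbor resize of a 2-D list of pixel values."""
    h, w = len(pixels), len(pixels[0])
    return [[pixels[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]

def rotate_90(pixels):
    """Rotate the grid 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*pixels)][::-1]

grid = [[1, 2],
        [3, 4]]
print(resize_nearest(grid, 4, 4))
print(rotate_90(grid))   # → [[2, 4], [1, 3]]
```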
The Python Imaging Library allows you to convert images between different pixel representations using the convert method. For example, to read a color image and convert it to grayscale (mode L):
im = Image.open("lenna.ppm").convert("L")
The library supports transformations between each supported mode and the L and RGB modes. To convert between other modes, you may have to use an intermediate image.
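For the common color-to-grayscale case, the conversion is a weighted sum of the three channels using the ITU-R 601-2 luma weights. A minimal Python 3 sketch follows; rgb_to_l is a hypothetical helper, and the integer truncation is a simplification, so the library's rounding may differ by one level.

```python
def rgb_to_l(r, g, b):
    """ITU-R 601-2 luma: weighted sum of the red, green and blue channels."""
    return (r * 299 + g * 587 + b * 114) // 1000

print(rgb_to_l(255, 255, 255))  # → 255
print(rgb_to_l(255, 0, 0))      # → 76
print(rgb_to_l(0, 0, 0))        # → 0
```

Green carries the largest weight because the eye is most sensitive to it, so a pure green pixel ends up brighter in mode L than a pure red or blue one.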
The ImageFilter module contains a number of predefined enhancement filters that can be used with the filter method. For example, from the Python prompt, do the following:
>>> import ImageFilter
>>> out = im.filter(ImageFilter.DETAIL)
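Once you have imported the module, you can use any of these filters: BLUR, CONTOUR, DETAIL, EDGE_ENHANCE, EDGE_ENHANCE_MORE, EMBOSS, FIND_EDGES, SMOOTH, SMOOTH_MORE and SHARPEN.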
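As a rough illustration of what a kernel-based enhancement filter does, the following plain Python 3 sketch applies a 3x3 kernel to a small grayscale grid. The function name and both kernels are hypothetical examples, not the library's own tables, and border pixels are simply copied.

```python
def convolve3x3(pixels, kernel, scale=1, offset=0):
    """Apply a 3x3 kernel to the interior of a 2-D list of 8-bit pixels."""
    h, w = len(pixels), len(pixels[0])
    out = [row[:] for row in pixels]           # border pixels are copied as-is
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = sum(kernel[j][i] * pixels[y + j - 1][x + i - 1]
                      for j in range(3) for i in range(3))
            # Normalize, then clamp to the valid 8-bit range.
            out[y][x] = max(0, min(255, acc // scale + offset))
    return out

# Illustrative kernels (both symmetric, so flipping them changes nothing).
BOX_BLUR = ([[1, 1, 1], [1, 1, 1], [1, 1, 1]], 9)
SHARPEN = ([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], 1)

image = [[10, 10, 10, 10],
         [10, 100, 100, 10],
         [10, 100, 100, 10],
         [10, 10, 10, 10]]
kernel, scale = BOX_BLUR
print(convolve3x3(image, kernel, scale))
```

The blur kernel averages each pixel with its neighbors, while the sharpening kernel boosts the center pixel relative to them. The clamp keeps results inside 0 to 255.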
Some decoders allow you to manipulate an image while reading it from a file. This often can be used to speed up decoding when creating thumbnails or printing to a monochrome laser printer. The draft method reconfigures the decoder of an opened but not yet loaded image so that the image matches the given mode and size as closely as possible. See Listing 3 for an example of how to read an image in draft mode.
Listing 3. Reading in Draft Mode
im = Image.open(file)
print "original =", im.mode, im.size
im.draft("L", (100, 100))
print "draft =", im.mode, im.size
This prints something like:
original = RGB (512, 512)
draft = L (128, 128)
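The draft size in the sample output is 128x128 rather than the requested 100x100. One plausible explanation, sketched below in Python 3, is a decoder that can shrink only by fixed factors and picks the largest reduction that still covers the request. The factors 1/2, 1/4 and 1/8 are an assumption modeled on JPEG-style decoders, and draft_size is a hypothetical helper.

```python
def draft_size(size, requested, scales=(8, 4, 2, 1)):
    """Pick the largest reduction whose result still covers the request."""
    width, height = size
    for scale in scales:
        w = (width + scale - 1) // scale    # round up to a whole pixel
        h = (height + scale - 1) // scale
        if w >= requested[0] and h >= requested[1]:
            return (w, h)
    return size   # request is larger than the image: no reduction

print(draft_size((512, 512), (100, 100)))  # → (128, 128)
```

Reducing by 8 would give 64x64, which is smaller than the request, so the sketch settles on a reduction of 4.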
Listing 4 shows how the ImageDraw module provides basic graphics support for Image objects.
Listing 4. Draw a Gray Cross over an Image
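import sys
import Image, ImageDraw

im = Image.open("lenna.pgm")
draw = ImageDraw.Draw(im)
# Draw the two diagonals of the image in mid-gray.
draw.line((0, 0) + im.size, fill=128)
draw.line((0, im.size[1], im.size[0], 0), fill=128)
del draw
# Write the result to standard output as PNG.
im.save(sys.stdout, "PNG")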
The pildriver tool gives you access to most PIL functions from your operating system's command-line interface. When called as a script, the command-line arguments are passed to a PILDriver instance. If there are no command-line arguments, the module runs an interactive interpreter, each line of which is split into space-separated tokens and passed to the execute method. The pildriver tool was contributed by Eric S. Raymond. The following commands are run from the command-line prompt, not from the Python interpreter:
>>> pildriver program
>>> pildriver show crop 0 0 200 300 open test.png
>>> pildriver save rotated.png rotate 30 open test.tiff
The pildriver module provides a single class called PILDriver. An instance of the PILDriver class is essentially a software stack machine (Polish-notation interpreter) for sequencing PIL image transformations. The state of the instance is the interpreter stack. The only method you normally invoke after initialization is the execute method. It takes an argument list of tokens, pushes them onto the instance's stack, and then tries to clear the stack by successive evaluation of PILDriver operators. Any part of the stack not cleared persists and becomes part of the evaluation context for the next call of the execute method. PILDriver doesn't catch any exceptions, on the theory that these contain diagnostic information that should be interpreted by the calling code.
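The stack-machine idea can be sketched in plain Python 3 with a toy class that evaluates arithmetic instead of image operations. ToyDriver and its operators are hypothetical and much simpler than PILDriver: operators are run as soon as they are met, and only integer operands are handled.

```python
class ToyDriver:
    """Toy stack machine; operators are methods named do_<token>."""

    def __init__(self):
        self.stack = []

    def do_add(self):
        self.stack.append(self.stack.pop() + self.stack.pop())

    def do_mul(self):
        self.stack.append(self.stack.pop() * self.stack.pop())

    def execute(self, tokens):
        # Feed tokens right to left, so each operator finds its operands
        # already on the stack (prefix, or Polish, notation).
        for token in reversed(tokens):
            operator = getattr(self, "do_" + token, None)
            if operator is not None:
                operator()
            else:
                self.stack.append(int(token))

driver = ToyDriver()
driver.execute("add 2 mul 3 4".split())
print(driver.stack)   # → [14]
```

Because the stack belongs to the instance, whatever is left on it after one call is still there for the next. A later driver.execute(["mul", "2"]) would multiply the leftover 14 by 2. This also explains why a command such as "show crop 0 0 200 300 open test.png" reads backward: the image is opened first, then cropped, then shown.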
The pilconvert tool converts an image from one format to another. The output format is determined by the target extension, unless explicitly specified with the -c option:
>>> pilconvert lenna.tif lena.png
>>> pilconvert -c JPEG lenna.tif lena.tmp
The SDC Morphology Toolbox for Python is software used for image analysis and signal processing. It is based on the principle of discrete nonlinear filters followed by lattice operations. These filters are called morphological operators. Morphological operators are useful for restoration, segmentation and quantitative analysis of images and signals. The SDC Morphology Toolbox is useful for machine vision, medical imaging, desktop publishing and document processing, as well as for the needs of the food industry and agriculture.
Grayscale images generally work fine with 8 or 16 bits to represent each pixel. Elementary operators on the images are used in a hierarchical manner. There are two types of elementary operators: dilation and erosion. Other operators built on these include distance transform, watershed, reconstruction, labeling and area-opening. The SDC Morphology Toolbox is supported on various platforms, such as Win95/98/NT, Linux and Solaris.
Some common conventions are used in this toolbox. All operators of the SDC Morphology Toolbox start with mm. These return a single data structure, and parameters passed are position- and type-dependent. Most functions in the SDC Morphology Toolbox operate in 3-D.
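To give a feel for the two elementary operators, here is a plain Python 3 sketch of grayscale dilation and erosion with a 3x3 neighborhood, plus an opening built from them. The function names are hypothetical and are not the toolbox's mm-prefixed API; the fixed square neighborhood is an assumption made for brevity.

```python
def _apply(image, pick):
    """Apply pick (min or max) over each pixel's 3x3 neighborhood."""
    h, w = len(image), len(image[0])
    return [[pick(image[j][i]
                  for j in range(max(0, y - 1), min(h, y + 2))
                  for i in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)]
            for y in range(h)]

def dilate(image):
    """Grayscale dilation: each pixel becomes its neighborhood maximum."""
    return _apply(image, max)

def erode(image):
    """Grayscale erosion: each pixel becomes its neighborhood minimum."""
    return _apply(image, min)

def opening(image):
    """Erosion followed by dilation removes small bright specks."""
    return dilate(erode(image))

speck = [[0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 9, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0]]
print(dilate(speck)[1])    # → [0, 9, 9, 9, 0]
print(opening(speck)[2])   # → [0, 0, 0, 0, 0]
```

Higher-level operators are composed from these two in the same way, which is the hierarchical use of elementary operators described above.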
Special thanks to James Fowler for his contribution in QccPack. Thanks also to W. Pearlman of RPI and L. Granda of PrimaComp for their QccPackSPIHT module. And, last but not least, thanks to the Python SIG group for PIL.
Suhas A. Desai works with Tech Mahindra Ltd. He writes on open source and security. In his free time, he volunteers for social causes.