Username/Email:  Password: 
TwitterFacebookFlickrRSS

Linux-Based USPS Mail Sorting System

Can Linux read your handwriting? Embedded Linux takes addresses to barcodes.

The United States Postal Service has over 900 computer systems, each consisting of eight Linux machines, that are used to convert mail piece images into the destination address text. Seven are dual Pentium Pro 200s called slaves, and one is a single Pentium Pro that is the master. The master is a typical PC with a monitor, keyboard, mouse, floppy, hard disk and a CD-ROM drive. It has two Ethernet cards and one image capture board. (See the Sidebar for complete hardware/software information.) The other seven computers, called slaves, run the optical character recognition (OCR) algorithms. They only have a hard disk and an Ethernet card. This recognition system is shown in Figure 1, which also shows the network setup.

Figure 1. Network Setup

The mail sorter moves 12 mail pieces per second past a camera that captures the image. While the image is being recognized, the mail piece travels in a circuitous route and waits for the OCR results before it passes by a barcode printer. The camera has two output cables that send two pixels for each clock pulse, which cycles at 16MHz. One cable delivers a binary image, and the other has a grayscale image of 8 bits per pixel. A custom PCI image capture board resides in the master. This board can take either the grayscale or the binary image. The grayscale image arrives at 32MBps, which is too fast for the computer to do much with, so it isn't used in production. Instead, the binary image is used. The board packages the binary pixels into eight pixels per byte, then groups them into sets of four before putting them into the PC's memory using direct memory access (DMA).

The master computer then compresses the binary image and sends it to one of the seven slaves. Each slave runs two recognition processes, one for each CPU. When the slave process determines the destination address text, it returns it to the master. The master then forwards the results to another computer, which turns the text into a barcode for printing on the mail piece. Once the barcode is on the mail piece, the mail piece can be sorted over and over again without having to re-read the text.

Image Capture Board

The image capture board uses an AMCC 5933 chip for the PCI logic, as shown in Figure 2. To that we added a FIFO and some PLAs. The PLAs pack the binary pixels into bytes, and they order both the grayscale and the binary bytes into longs. This process is diagrammed in Figure 3.

Figure 2. Image Capture Board

Figure 3. Block Diagram of Image Capture Board

At first, the device driver was designed to return the data via a call to read, with the thought that it would be like a normal file read. Fortunately we realized the silliness of trying to make a device driver for a custom image capture board compatible with common UNIX programs. On startup, the device driver allocates a big chunk of contiguous kernel memory. The AMCC is told to ``DMA'' the incoming data into memory. The user-space program requests that memory be mapped into its process space so that it can read from it. If we had returned the data via a call to read, there would be an extra copy of the data from the kernel buffer to the user-space buffer, which would have substantially increased processing overhead.

Whenever an end-of-mail-piece interrupt happens, the driver sticks the size of the last image onto the tail of a list of image start positions and sizes. It then checks to see if it can fit another image into the buffer. If it can't, it will reset the DMA to start at the top of the buffer. There is a physical gap between mail pieces on the scanner, so after the interrupt, there won't be any data for a moment. That moment is easily long enough for the interrupt to point the DMA back to the top. The start position of the new mail piece is put onto the tail of the list, and any sleeping user processes are awakened. The user process asks the driver for the position and size of the oldest mail piece. The driver returns the start and length from the head of the list of images. The user process looks like Listing 1.

Listing 1. Buffer Management for Mail Piece Images

By using mmap, the image data is not copied but is popped into memory by the AMCC DMA. Then the user program compresses it from that very location.

It was difficult for some of the developers to give up normal practices. If this was an image capture board for the general market, the driver would need to be more efficient with memory use, and the API for using it would need to be compatible with other image capture APIs. For a while we tried to achieve these goals, even though this driver wasn't for the general market. For example, we agonized about how to deal with buffer overflow. When the end-of-mail-piece interrupt occurs, the driver needs to calculate whether the oldest image was overwritten. If it was overwritten, it will need to tell the user process. From my description above, the driver will pop the last mail piece start position and size off its internal list and return that to the user process. The user process might take a long time to create the compressed data from that image, which means the DMA could overwrite that image. Therefore, the driver won't have a record of the start and size of that piece, so it won't be able to know for sure if the image was overwritten. You can see there is no code in order to lock the image buffer while the user process is using it. Fortunately, we realized there is no point in adding code and making it bulletproof, when all that was needed was a sufficiently large image buffer and a very conservative overflow flag. If the DMA ever gets within three maximum-sized mail pieces of the oldest image, the flag is raised. This flagging may be premature, but if the user process can't keep up, something is really wrong. This is an embedded system, so it doesn't make any difference if we hog lots of contiguous kernel memory and return errors prematurely. In other words, this error condition cannot happen in production. We must catch any occurrences of it in development and fix it so it won't happen later.

______________________