Memory Access Error Checkers
All C programmers have seen, at least once, the horrible words “Segmentation fault (core dumped)” after a run of their latest creation. Usually, this message is due to errors in memory management. (As all C programmers know, the language does not check bounds or limits when accessing memory.) In this article, I plan to compare three products used to track down this kind of error:
Electric Fence 2.0.5
Checker comes in the usual tgz format (gzipped tar file), with a simple installation procedure. Run the “configure” script, then make all the files. The installation went fine for me; I didn't see any problems. One note: you need gcc 2.8.1 to use the latest version of Checker.
Electric Fence is available in binary and source format and requires kernel 1.1.83 or higher.
Mem-Test is available in the tgz format, and it is very simple to build using the provided Makefile.
Electric Fence (EF) is a library: link it to your program, then run it. EF places an inaccessible memory page after (or before, with the appropriate option) each area allocated by your program, so the program faults immediately when it goes out of bounds. The segmentation fault occurs on the exact line of the offending instruction (not 100 lines later), so by running the program under a debugger, you can get to the root of the problem.
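The guard-page idea can be sketched in a few lines of C. This is only an illustration of the technique described above, assuming a POSIX system with mmap and mprotect; guarded_alloc and overrun_faults are hypothetical names of my own, not functions from Electric Fence, and the real library does considerably more (alignment handling, free, realloc).

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not an Electric Fence function): returns `size`
 * bytes whose end sits directly in front of an inaccessible page.
 * The mapping is never released; this sketch has no matching free. */
static void *guarded_alloc(size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* Whole pages needed for the data, plus one extra guard page. */
    size_t data_pages = (size + page - 1) / page;
    if (data_pages == 0)
        data_pages = 1;
    size_t total = (data_pages + 1) * page;

    unsigned char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Make the last page inaccessible: any read or write there faults. */
    unsigned char *guard = base + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }

    /* Push the buffer right up against the guard page (no padding). */
    return guard - size;
}

/* Performs a one-byte overrun in a child process and returns 1 if the
 * child was killed by SIGSEGV (or SIGBUS, which some systems raise). */
static int overrun_faults(size_t size)
{
    assert(size > 0);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        volatile unsigned char *p = guarded_alloc(size);
        if (p == NULL)
            _exit(2);
        p[size - 1] = 'x';       /* last valid byte: no fault */
        p[size] = 'x';           /* one byte past the end: guard page */
        _exit(0);                /* reached only if nothing faulted */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) &&
           (WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS);
}
```

Calling overrun_faults(10) from a test program should report that the child died on the overrunning write, which is the behavior a debugger would stop on when the program is linked against EF.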
Mem-Test is another library you can simply link to your object; just remember to include the header file mem_test_user.h in your source first. As we will see, this program is a bit different from the other two, and it detects one particular kind of error. When the program runs, it creates a log in which it records every memory allocation and deallocation. A Perl script provided in the package reads this log and shows you the memory leaks present in the code. Since Electric Fence doesn't detect this particular error, it can be used in conjunction with Mem-Test.
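To give an idea of how a leak detector of this kind can work, here is a minimal sketch that records each allocation and removes the record on free; whatever is still recorded at the end is a leak. This is not Mem-Test's code: tracked_malloc, tracked_free, report_leaks and the TRACKED_MALLOC macro are hypothetical names, and the sketch reports leaks in-process rather than writing a log file for a separate script.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_TRACKED 1024         /* arbitrary limit for this sketch */

struct alloc_record {
    void       *ptr;             /* NULL marks an unused slot */
    size_t      size;
    const char *file;
    int         line;
};

static struct alloc_record records[MAX_TRACKED];

/* Allocates memory and remembers where the request came from.
 * If the table is full, the block is simply not tracked. */
static void *tracked_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (records[i].ptr == NULL) {
            records[i].ptr  = p;
            records[i].size = size;
            records[i].file = file;
            records[i].line = line;
            break;
        }
    }
    return p;
}

/* Frees memory and forgets the matching record, if there is one. */
static void tracked_free(void *p)
{
    if (p == NULL)
        return;
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (records[i].ptr == p) {
            records[i].ptr = NULL;
            break;
        }
    }
    free(p);
}

/* Prints every block that was never freed; returns how many there are. */
static int report_leaks(FILE *out)
{
    int leaks = 0;
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (records[i].ptr != NULL) {
            fprintf(out, "leak: %zu byte(s) allocated at %s:%d\n",
                    records[i].size, records[i].file, records[i].line);
            leaks++;
        }
    }
    return leaks;
}

/* A header can redirect calls so the call site is captured automatically. */
#define TRACKED_MALLOC(n) tracked_malloc((n), __FILE__, __LINE__)
```

A program would call TRACKED_MALLOC and tracked_free in place of malloc and free, then call report_leaks(stderr) just before exiting.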
Checker is also a library and exploits the -fcheck-memory-usage option of gcc. Your program is actually built with a different compiler command, checkergcc. It is a stub which calls gcc and compiles the program with its own memory access libraries. Once the program is compiled, you can run it, and Checker will show you a complete report of the errors it found in your sources. Checker uses a bitmap to record the access rights of every memory area the program is using. For example, an area could be write-only (when the variable is not yet initialized), readable and writable, not accessible and so on. In this way, it is able to detect memory access errors.
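The access-rights bitmap can be pictured with a toy model. The sketch below is my own illustration, not Checker's implementation: it keeps two bits per byte for a small 64-byte arena, and every name in it (shadow, mark_allocated, check_read and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum access_right { NO_ACCESS = 0, WRITE_ONLY = 1, READ_WRITE = 2 };

#define ARENA_SIZE 64
/* Two bits of state per arena byte, four arena bytes per shadow byte. */
static unsigned char shadow[ARENA_SIZE / 4];

static enum access_right get_right(size_t off)
{
    return (enum access_right)((shadow[off / 4] >> ((off % 4) * 2)) & 3u);
}

static void set_right(size_t off, enum access_right r)
{
    unsigned shift = (unsigned)(off % 4) * 2;
    unsigned cleared = shadow[off / 4] & ~(3u << shift);
    shadow[off / 4] = (unsigned char)(cleared | ((unsigned)r << shift));
}

/* A fresh block may be written but not yet read. */
static void mark_allocated(size_t off, size_t len)
{
    for (size_t i = 0; i < len; i++)
        set_right(off + i, WRITE_ONLY);
}

/* A freed block may not be touched at all. */
static void mark_freed(size_t off, size_t len)
{
    for (size_t i = 0; i < len; i++)
        set_right(off + i, NO_ACCESS);
}

/* Returns 0 if every byte is readable, -1 otherwise. */
static int check_read(size_t off, size_t len)
{
    if (off > ARENA_SIZE || len > ARENA_SIZE - off)
        return -1;
    for (size_t i = 0; i < len; i++)
        if (get_right(off + i) != READ_WRITE)
            return -1;
    return 0;
}

/* Returns 0 and marks the bytes initialized if every byte is writable,
 * -1 (changing nothing) if any byte is not accessible. */
static int check_write(size_t off, size_t len)
{
    if (off > ARENA_SIZE || len > ARENA_SIZE - off)
        return -1;
    for (size_t i = 0; i < len; i++)
        if (get_right(off + i) == NO_ACCESS)
            return -1;
    for (size_t i = 0; i < len; i++)
        set_right(off + i, READ_WRITE);
    return 0;
}
```

In this model, reading a freshly allocated block fails until it has been written, and a write that runs past the end of the block is rejected because the following bytes are still marked as not accessible.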
The six pieces of C code we will look at are:
postr.c: this code (Listing 1) performs a read (with printf) of an uninitialized memory area. The lack of a string terminator (\0) will force printf to read past the end of the malloc'ed area.
prer.c: this piece of code (Listing 2) contains two errors. The printf is accessing a byte before the allocated area (the pointer was decremented), then the free is done with an address not returned by malloc.
postw.c: in this code (Listing 3), the strcpy is writing 12 bytes (with the \0) in a 10-byte area. Moreover, the printf is reading the uninitialized last two bytes.
prew.c: this code (Listing 4) is writing before the allocated memory. The free and the printf will cause an error as in the previous examples.
uninit.c: this code (Listing 5) makes an assignment to a NULL pointer. This is a common error for programmers new to the C language.
unfree.c: in this example (Listing 6), I missed freeing some allocated memory.
To test Checker, I compiled Listing 1 with this command line:
checkergcc -o postr postr.c
All the gcc command-line options can be used with Checker. The compilation went fine, and when I ran postr, I got this output:
From Checker (pid:00411): (ruh) read uninitialized byte(s) in a block.
When Reading 5 byte(s) at address 0x0805ce1c, inside the heap (sbrk).
0 byte(s) into a block (start: 0x805ce1c, length: 10, mdesc: 0x0).
The block was allocated from:
        pc=0x08054e2b in chkr_malloc at stubs-malloc.c:52
        pc=0x08048812 in main at postr.c:10
        pc=0x08054ee1 in this_main at stubs-main.c:14
        pc=0x0804875a in *unknown* at *unknown*:0
Stack frames are:
        pc=0x08054ebf in chkr_stub_printf at stubs-stdio.c:54
        pc=0x080489f1 in main at postr.c:17
        pc=0x08054ee1 in this_main at stubs-main.c:14
        pc=0x0804875a in *unknown* at *unknown*:0
Checker executed the program and found the problem: an uninitialized read at line 17 (the printf line). This was caused by the lack of a string terminator in the memory area. At first look, this output seems quite messy, but if you read it carefully, you will find a lot of information: which type of error it found, where the memory was allocated (line 10) and where the problem occurs (line 17).
To compile the program with Mem-Test, you must make a slight modification to the postr source. Add the header file (#include "mem_test_user.h"), which wraps the various memory allocation functions so that modified versions are used instead. Compile the program with the command:
gcc -o postr postr.c -lmem_test
I added another library (mem_test) to the compilation command. When you run the postr executable, the new library will create a file named MEM_TEST_FILE in which all memory allocations and deallocations, and therefore any leaks, will be logged. In this particular situation, Mem-Test does not find a problem because it was built to identify only memory leaks.
For Electric Fence, we need to recompile the program, including the reference library:
gcc -g -o postr postr.c -lefence
I added the -g option to include the debugging information in the executable. This is needed because EF will cause a segmentation fault exactly at the buggy line, so you will need to run the program under a debugger to find the line causing the problem. This is the output of the executable:
Electric Fence 2.0.5 Copyright (C) 1987-1995 Bruce Perens.
EF didn't find any problem in the code, so no errors were generated.
EF has four different switches, which can be enabled by setting one of these environment variables: EF_ALIGNMENT, EF_PROTECT_BELOW, EF_PROTECT_FREE or EF_ALLOW_MALLOC_0.
EF_ALIGNMENT sets the alignment for each memory allocation done by malloc (or calloc and realloc). By default, this size is set to sizeof(int), because this is usually the alignment required by the CPU. This could be a problem when you allocate a size that is not a multiple of the word size. Because the address returned by malloc must be word-aligned and the inaccessible page starts on a page boundary, the allocation size is rounded up to a multiple of the word size, which leaves a gap between the end of the memory you asked for and the inaccessible page. An overrun that stays inside this gap goes undetected. You can fix this by setting the environment variable to 0; in this way, you will be able to detect even a single-byte overrun. This will force malloc to return a non-aligned address, but this is not a problem in most cases. In some cases (when you have an odd-size allocation for an object that must be word-aligned), you will get a bus error (SIGBUS). I never saw a SIGBUS error using EF (and I used it in real-life programs); I got this information from the EF documentation.
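The size of that gap is simple arithmetic. The function below is a small model of the description above, not code taken from EF; guard_gap is a hypothetical name, and I treat an alignment of 0 or 1 as "no padding".

```c
#include <assert.h>
#include <stddef.h>

/* Number of unprotected bytes between the end of a `size`-byte request
 * and the inaccessible page, when sizes are rounded up to a multiple of
 * `alignment`. An alignment of 0 or 1 means no rounding at all. */
static size_t guard_gap(size_t size, size_t alignment)
{
    if (alignment <= 1)
        return 0;
    size_t remainder = size % alignment;
    return remainder == 0 ? 0 : alignment - remainder;
}
```

For a 10-byte request with a 4-byte alignment, guard_gap(10, 4) is 2, so an overrun of one or two bytes would land in the padding and go unnoticed; with the alignment set to 0 the gap disappears.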
EF will usually place the inaccessible page after each memory allocation. By setting EF_PROTECT_BELOW to 1, it will place this page before the allocation, so you can check for under-runs.
By default, EF may hand freed memory out again in later allocations. If you think your program is touching freed memory, set EF_PROTECT_FREE to 1. EF will then never reallocate freed memory, and any access to it will be detected.
A malloc call with zero bytes is considered an error. If you need to use such a call, you can tell EF to ignore this error by making EF_ALLOW_MALLOC_0 non-zero.
I set EF_ALIGNMENT to 0 in order to see if the postr error would be detected, but again EF did not see it.