Source Code Scanners for Better Code

They aren't a replacement for manual checks and edits, but tools like Flawfinder, RATS and ITS4 can point you in the right direction.
Using the Tools

These source code scanners are relatively straightforward to use: they take one or more source code files as input and report the constructs they consider risky. Our examples focus on the file openldap-2.0.11/libraries/libldap/print.c, from the OpenLDAP 2.0.11 source distribution, chosen because it highlights some of the subtle differences among these scanners. The section of code that produces hits looks like this (the leading numbers are line numbers in print.c):

35-int ldap_log_printf( LDAP *ld, int loglvl, const char *fmt, ... )
37-     char buf[ 1024 ];
38-     va_list ap;
40-     if ( !ldap_log_check( ld, loglvl )) {
41-             return 0;
42-     }
44-     va_start( ap, fmt );
47-     buf[sizeof(buf) - 1] = '\0';
48-     vsnprintf( buf, sizeof(buf)-1, fmt, ap );
50-     vsprintf( buf, fmt, ap ); /* hope it's not too long */
52-     /* use doprnt() */
53-     chokeme = "choke me! I don't have a doprnt manual handy!";

Now let's run each of the three scanners over this code to demonstrate their differences. First, we will look at the output from the RATS tool:

$ rats print.c
print.c:37: High: fixed size local buffer
Extra care should be taken to ensure that character arrays that are allocated
on the stack are used safely.  They are prime targets for buffer overflow
print.c:50: High: vsprintf
Check to be sure that the non-constant format string passed as argument 2 to
this function call does not come from an untrusted source that could have added
formatting characters that the code is not prepared to handle.
print.c:50: High: vsprintf
Check to be sure that the format string passed as argument 2 to this function
call does not come from an untrusted source that could have added formatting
characters that the code is not prepared to handle.  Additionally, the format
string could contain `%s' without precision that could result in a buffer

RATS reported three high-severity issues that are worth noting. Notice, however, that it reported the vsprintf() call twice; the second report is nearly identical to the first, with an added warning about %s used without a precision.
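To make the buffer warning concrete, here is a minimal sketch of bounded formatting into a fixed-size buffer. The helper name format_bounded() and its return convention are illustrative assumptions, not code from OpenLDAP; the sketch relies only on the C99 behavior of vsnprintf(), which never writes more than the given size and returns the length the full output would have needed.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of OpenLDAP): format into a caller-supplied
 * buffer without ever writing past its end.
 * Returns 0 if the whole message fit, 1 if it was truncated, -1 on error.
 */
int format_bounded(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int needed;

    if (buf == NULL || size == 0 || fmt == NULL)
        return -1;

    va_start(ap, fmt);
    /* C99 vsnprintf() writes at most size-1 characters plus a NUL and
       returns the length the complete output would have had. */
    needed = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (needed < 0) {
        buf[0] = '\0';          /* formatting error: leave an empty string */
        return -1;
    }
    return (size_t)needed >= size ? 1 : 0;
}
```

A caller would pass sizeof(buf) for a stack array such as the 1,024-byte buffer above and treat a return value of 1 as a truncated message. A scanner will likely still flag the non-constant format string here, which is a separate question from the buffer size.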

Now let's try Flawfinder on the same input file:

$ flawfinder print.c
Flawfinder version 0.15, (C) 2001 David A. Wheeler.
Number of dangerous functions in C ruleset: 40
Processing print.c
print.c:48 [4] (format) vsnprintf: if format strings can be influenced by an attacker, 
they can be exploited. Use a constant for the format specification. 
print.c:50 [4] (format) vsprintf: Potential format string problem. Make Format 
string constant. 
There are probably other security vulnerabilities as well; review your code!

Flawfinder reported two distinct problems, but it did not flag the fixed-size declaration of "char buf[ 1024 ]" at line 37, which could become a problem (and does on some platforms).
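Both Flawfinder hits concern the format string rather than the buffer. As a rough illustration of the advice to use a constant format specification, the sketch below passes untrusted text as an argument to a literal "%s" instead of using it as the format itself. The function name log_text_safely() is hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical logging helper. The format string is the literal "%s", so
 * any '%' characters inside `text` are copied as data, never interpreted.
 * Returns whatever snprintf() returns.
 */
int log_text_safely(char *out, size_t size, const char *text)
{
    return snprintf(out, size, "%s", text);
}

/*
 * The pattern the scanners warn about, shown only as a comment:
 *
 *     snprintf(out, size, text);   // `text` itself becomes the format
 *
 * With text = "%x %x %n" that call would read (and, through %n, write)
 * via arguments that were never passed.
 */
```

Note that a wrapper like ldap_log_printf() legitimately takes a format from its caller, so a hit of this kind is a prompt to check where the format comes from, not proof of a bug.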

Lastly, let's use ITS4 on the same input and examine its output:

$ its4 print.c
print.c:48:(Urgent) vsnprintf
Non-constant format strings can often be attacked.
Use a constant format string.
print.c:50:(Urgent) vsprintf
Non-constant format strings can often be attacked.
Use a constant format string.

Again, the results match Flawfinder's: both tools flag the format string handling at lines 48 and 50. Notice that each tool formats its output differently, that the three tools do not all tag the same functions (this limited example doesn't highlight all of the differences) and that none of them handles the conditional inclusion of safe or unsafe functions.
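The point about conditional inclusion can be sketched as follows. The macro name HAVE_VSNPRINTF is an assumption in the style of configure-generated headers (the excerpt above does not show its preprocessor directives), and format_log() and format_log_variant() are hypothetical names. Only one of the two calls is ever compiled, but a scanner that reads the file without running the preprocessor sees both.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Sketch of a build-time choice between a bounded and an unbounded call.
 * HAVE_VSNPRINTF is an assumed, configure-style macro; format_log() is a
 * hypothetical name.
 */
int format_log(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
#ifdef HAVE_VSNPRINTF
    n = vsnprintf(buf, size, fmt, ap);  /* bounded: kept only if the macro is defined */
#else
    (void)size;                         /* size cannot be enforced in this branch */
    n = vsprintf(buf, fmt, ap);         /* unbounded: kept otherwise */
#endif
    va_end(ap);
    return n;
}

/* Reports which branch the preprocessor kept in this build. */
const char *format_log_variant(void)
{
#ifdef HAVE_VSNPRINTF
    return "vsnprintf";
#else
    return "vsprintf";
#endif
}
```

Building once with -DHAVE_VSNPRINTF and once without changes which call ends up in the binary, while the scanner output for the source file stays the same. Deciding which hit matters for a given platform is left to the reader.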

Interpreting the results is not always as easy as it may appear. For example, input sanitization can be tricky, as can safely opening a file to avoid a symlink attack. Even copying strings can be tricky, as subtle issues like NUL termination, length and memory allocation all play a part. Astute programmers are still required (see Resources).
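As an example of how subtle string copying can be, here is a minimal sketch of a copy routine that always NUL-terminates the destination and tells the caller when the source did not fit. The name copy_string() and its return convention are illustrative assumptions.

```c
#include <string.h>

/*
 * Hypothetical helper: copy src into dst (capacity dst_size bytes),
 * always leaving dst NUL-terminated.
 * Returns 0 if src fit completely, 1 if it was truncated,
 * -1 on bad arguments.
 */
int copy_string(char *dst, size_t dst_size, const char *src)
{
    size_t src_len;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    src_len = strlen(src);
    if (src_len >= dst_size) {
        /* Too long: copy what fits and terminate explicitly. strncpy()
           alone would leave dst without a NUL in this case. */
        memcpy(dst, src, dst_size - 1);
        dst[dst_size - 1] = '\0';
        return 1;
    }
    memcpy(dst, src, src_len + 1);      /* includes the terminating NUL */
    return 0;
}
```

The truncation case is the one to watch: strncpy(dst, src, dst_size) copies exactly dst_size bytes when the source is that long or longer and adds no terminator, so later string functions can run past the end of dst.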

The scanners are designed to be run over the source code several times during development, with the major problems fixed or investigated after each pass. When coding, it's easy at first to forget to use a more secure function like strncpy(), and these tools help reinforce better habits. When examining code from outside sources, scanners help highlight areas that may be problematic. In either case, these scanners show you where to focus your attention, and they cover many of the basic, common coding errors that lead to faulty code or, worse, security issues.

All of these tools are quite fast, operating on about 100,000 lines of input in under one second. Simply put, you'll spend more time trying to figure out the meaning of the output than you will running the checker itself.



Need more intelligent tools

Anonymous

The problem with scanner type tools is they provide very little intelligent filtering and flood the user with many false positives; invariably users look at the first 10 results and give up.

Re: Source Code Scanners for Better Code

Anonymous

You have a good overview of the three source code scanners. Are these the commonly used ones? Are there any others?
I had a quick question on source code scanners: can these scanners be used to scan code written for different platforms? (For example, if I run a source code scanner on Linux, can I scan code written to run on Windows or Unix?)


Re: Source Code Scanners for Better Code

jnazario

sorry about the bad grammar in some places, i need to be a bit more careful with that. :) anyhow, hope you enjoy the piece.
