Source Code Scanners for Better Code

They aren't a replacement for manual checks and edits, but tools like Flawfinder, RATS and ITS4 can point you in the right direction.
Using the Tools

Using these source code scanners is relatively straightforward: each takes one or more source code files as input and reports the constructs it considers risky. Our examples focus on the file openldap-2.0.11/libraries/libldap/print.c, from the OpenLDAP 2.0.11 source distribution. This file was chosen because it highlights some of the subtle differences among these scanners. The section of code that produces hits looks like this:

35-int ldap_log_printf( LDAP *ld, int loglvl, const char *fmt, ... )
36-{
37-     char buf[ 1024 ];
38-     va_list ap;
39-
40-     if ( !ldap_log_check( ld, loglvl )) {
41-             return 0;
42-     }
43-
44-     va_start( ap, fmt );
45-
46-#ifdef HAVE_VSNPRINTF
47-     buf[sizeof(buf) - 1] = '\0';
48-     vsnprintf( buf, sizeof(buf)-1, fmt, ap );
49-#elif HAVE_VSPRINTF
50-     vsprintf( buf, fmt, ap ); /* hope it's not too long */
51-#else
52-     /* use doprnt() */
53-     chokeme = "choke me! I don't have a doprnt manual handy!";
54-#endif

Now, let's run this piece of code through each of the three scanners to demonstrate their differences. First, we will look at the output from the RATS tool:

$ rats print.c
print.c:37: High: fixed size local buffer
Extra care should be taken to ensure that character arrays that are allocated
on the stack are used safely.  They are prime targets for buffer overflow
attacks.
print.c:50: High: vsprintf
Check to be sure that the non-constant format string passed as argument 2 to
this function call does not come from an untrusted source that could have added
formatting characters that the code is not prepared to handle.
print.c:50: High: vsprintf
Check to be sure that the format string passed as argument 2 to this function
call does not come from an untrusted source that could have added formatting
characters that the code is not prepared to handle.  Additionally, the format
string could contain `%s' without precision that could result in a buffer
overflow.

RATS reported three high-severity issues worth noting. Notice, however, that it reported the vsprintf() call twice; the second report is nearly identical to the first, with an added warning about `%s' used without a precision.

Now let's try Flawfinder on the same input file:

$ flawfinder print.c
Flawfinder version 0.15, (C) 2001 David A. Wheeler.
Number of dangerous functions in C ruleset: 40
Processing print.c
print.c:48 [4] (format) vsnprintf: if format strings can be influenced by an attacker, 
they can be exploited. Use a constant for the format specification. 
print.c:50 [4] (format) vsprintf: Potential format string problem. Make Format 
string constant. 
There are probably other security vulnerabilities as well; review your code!

Flawfinder found two unique problems worth reporting, but it doesn't note the fixed-size declaration of "char buf[ 1024 ]" at line 37, which could become a problem (and it does on platforms that fall back to the unbounded vsprintf() call at line 50).
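As a minimal sketch of how a fixed-size buffer like the one at line 37 can be filled without overrunning it, assuming a C99-conforming vsnprintf() (which returns the length the complete output would have had), a helper might look like the following. The name format_bounded() and its return convention are hypothetical and do not come from OpenLDAP or from any of the scanners:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * format_bounded: hypothetical helper, for illustration only.
 * Formats into buf (capacity `size` bytes) and reports truncation.
 * Returns 0 if everything fit, 1 if the output was truncated,
 * and -1 on bad arguments or a formatting error.
 * Assumes C99 vsnprintf() semantics.
 */
int format_bounded(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (buf == NULL || size == 0 || fmt == NULL) {
        return -1;
    }

    va_start(ap, fmt);
    /* Pass the full buffer size; vsnprintf() writes at most size - 1
     * characters and then adds the terminating NUL itself. */
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (n < 0) {
        buf[0] = '\0';      /* leave a valid empty string on error */
        return -1;
    }
    if ((size_t)n >= size) {
        return 1;           /* output did not fit and was cut short */
    }
    return 0;
}
```

A caller would write something like format_bounded(buf, sizeof(buf), "%s", text) and treat a nonzero return as a signal to log or reject the message. A scanner would likely still flag the vsnprintf() call inside the helper, since the format is not a constant at that point; deciding whether the callers pass safe formats remains a manual review step.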

Lastly, let's use ITS4 on the same input and examine its output:

$ its4 print.c
print.c:48:(Urgent) vsnprintf
Non-constant format strings can often be attacked.
Use a constant format string.
----------------
print.c:50:(Urgent) vsprintf
Non-constant format strings can often be attacked.
Use a constant format string.
----------------

Again, the results match Flawfinder's: both format-string calls are tagged. Notice that each tool formats its output differently, that the three tools do not tag the same set of functions (this limited example doesn't highlight all of the differences), and that none of them handles the conditional inclusion of safe or unsafe functions; every #ifdef branch is scanned, regardless of which one would actually be compiled.
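The "use a constant format string" advice in the output above can be illustrated with a short sketch. The function name log_message() is hypothetical and is not taken from OpenLDAP; the point is only that untrusted text is passed as an argument to a constant "%s" format rather than as the format itself:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * log_message: hypothetical helper, for illustration only.
 * Copies untrusted text into `out` through a constant format string,
 * so any '%' sequences in the text are treated as plain data.
 * Returns what snprintf() returns: the length the full output
 * would have had, or a negative value on error.
 */
int log_message(char *out, size_t out_size, const char *untrusted)
{
    /* Risky form, shown only as a comment:
     *     snprintf(out, out_size, untrusted);
     * Here the untrusted text would be interpreted as the format. */
    return snprintf(out, out_size, "%s", untrusted);
}
```

With this form, input such as "%x%x%n" is copied through unchanged instead of being interpreted as conversion specifications.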

Interpreting the results is not always as easy as it may appear. For example, input sanitization can be tricky, as can safely opening a file to avoid a symlink attack. Even copying strings can be tricky, as subtle issues like NUL termination, length and memory allocation all play a part. Astute programmers will still be required (see Resources).
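To make the string-copying point concrete: strncpy() does not add a terminating NUL when the source is at least as long as the count it is given, so a bounded copy still needs care. The following is a minimal sketch of one way to handle this; copy_string() is a hypothetical helper, not part of any library mentioned here:

```c
#include <stddef.h>
#include <string.h>

/*
 * copy_string: hypothetical helper, for illustration only.
 * Copies src into dst (capacity dst_size bytes) and always leaves
 * dst NUL-terminated. Returns 0 if the whole string fit, -1 if it
 * was truncated or the arguments were unusable.
 */
int copy_string(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1;
    }

    len = strlen(src);
    if (len >= dst_size) {
        /* Not enough room: copy what fits, then terminate explicitly. */
        memcpy(dst, src, dst_size - 1);
        dst[dst_size - 1] = '\0';
        return -1;
    }

    memcpy(dst, src, len + 1);  /* len + 1 includes the terminating NUL */
    return 0;
}
```

Returning a distinct value on truncation lets the caller decide whether a shortened string is acceptable, which is exactly the kind of judgment a scanner cannot make on the programmer's behalf.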

Typically, the scanners are designed to be run over the source code several times during development, with the major problems fixed or investigated after each run. When coding, it's easy, initially, to forget to use a bounded function such as strncpy(), and these tools help reinforce improved habits. When examining outside sources of code, scanners help highlight areas of code that may be problematic. In either case, these scanners help show you where to focus your attention, and they cover many of the basic, common coding errors that lead to faulty code or, worse, security issues.

All of these tools are quite fast, operating on about 100,000 lines of input in under one second. Simply put, you'll spend more time trying to figure out the meaning of the output than you will running the checker itself.

______________________

Comments

Need more intelligent tools

Anonymous

The problem with scanner type tools is they provide very little intelligent filtering and flood the user with many false positives; invariably users look at the first 10 results and give up.

Re: Source Code Scanners for Better Code

Anonymous

You have a good overview of the three source code scanners. Are these the commonly used ones? Are there any others?
I had a quick question on source code scanners: can these scanners be used to scan code written for different platforms (i.e., if I run a source code scanner on Linux, can I scan a piece of code written to run on Windows or Unix)?

-
Thanks,
Prasad

Re: Source Code Scanners for Better Code

jnazario

sorry about the bad grammar in some places, i need to be a bit more careful with that. :) anyhow, hope you enjoy the piece.
