Source Code Scanners for Better Code
Source code scanners specifically designed to look for security flaws are obviously a help. However, there are a number of limitations users of them will have to keep in mind.
First, they will never replace a good manual audit of the code. There are simply too many variables that have not yet been abstracted into an automated scan.
Secondly, it is vital for code authors to understand the functions and libraries they are using and the nuances inherent in them. There will never be a replacement for understanding the source of errors, as these tools only list some of the possible security holes in the code. For example it can't dig into a library to find unsafe functions buried beneath other functions, unless the tool has been explicitly told that the function is unsafe. A similar constraint exists for types of data, such as int, char or longs. If you are using types you define in your code, the scanner most likely doesn't know how to handle them natively.
Lastly, scanners are limited, so far, in the languages they understand. This definitely limits their utility, wide though it may be for most of us. RATS is, so far, the lead in this arena, understanding five programming languages, while the other two are focused on C and C++ parsing.
Some of these limitations come from poor documentation of functions and their obstacles, such as the printf() family. Other errors come from a lack of standardization of how to perform actions securely, such as opening a file. And still others come from the lack of portable secure replacement functions. In this last case, it's probably best to implement the safe functions in your code and call them as needed on platforms which lack them. For example, several versions of snprintf() exist for platforms which lack it, covered under a variety of licenses.
Having the output of these scanners is only the start of securing the code. It's vital to correctly use more secure replacements, such as strncpy() or the right format strings in scrubbed user supplied data. This requires a good understanding of the functions and the code in which they're used. Off-by-one errors, for example, are easy to make when you forget to count NULL termination in your storage allocation.
One of the major problems evident with these scanners is the lack of any preprocessing, so no macros or definitions are expanded, and no external functions available in source form are examined. Therefore, code such as this:
#define p(x) printf ## x char *string1, *string2; /* user supplied */ /* stuff happens ... */ p((string1)); /* insecure! */ p((string2)); /* again! how horrible! */ p(("%s", string1)); /* finally, its correct ... */
may only produce one error in the definition but not in the use of the macro. However, an insecure call is made twice, which will go unnoticed by the scanner. While in this example a macro is used, the same issue applies to unsafe user-defined functions or wrapper functions. This has been the source of several major security holes found over the years, where internally defined functions, which are insecure, are used throughout the code. This additional layer hides the problems in the code. However, in this case, the tools flag the insecure function first, which can then be followed up to fix.
Preprocessing itself is an area of debate for static analysis. Sometimes, flagged code may not be in use on the platform on which development is being tested, in which case it can be ignored for that platform. However, it should still be noted that it may affect some platforms. In the OpenLDAP code example above, older systems without snprintf() would be affected by a potential buffer overflow. Issues surrounding coding adjustments are best addressed by developing a strong understanding of the language and environment for which you are coding and having some secure programming references handy. Several are listed in Resources that are worth investigating.
Some of these pitfalls are traditional problems inherent to static analysis. The most major of these issues can be overcome by preprocessing the input to show the scanner what the compiler would see. This function, however, is still not available on the code examination tools listed here.
Despite some of the mentioned warnings, source code scanners can help improve the state of your code in development or afterwards. It is important to keep these limitations in mind and not presume that everything has been found. The use of two or even all three of these tools is recommended for development teams and basic security audits. Keep in mind that these are tools help assist you in the auditing process, not automate it.
- Linux Kernel Testing and Debugging
- NSA: Linux Journal is an "extremist forum" and its readers get flagged for extra surveillance
- Tails above the Rest, Part III
- Wanted: Your Embedded Linux Projects
- Numerical Python
- RSS Feeds
- Dolphins in the NSA Dragnet
- Are you an extremist?
- The 101 Uses of OpenSSH: Part I
- Tails above the Rest: the Installation