Kernel Korner: The Linux Test Project

Finding 500 bugs in 50 different kernel versions is the fruit of this thorough Linux testing and code coverage project.

The Linux Test Project (LTP) was developed to improve the Linux kernel by bringing automated testing to kernel design. Prior to the LTP, no formal testing environment was available to Linux developers. Although most developers unit-tested the effects of their own enhancements and patches, systematic integration testing did not exist. The LTP's primary goal is to provide a test suite to the Open Source community that helps to validate the reliability, robustness and stability of the Linux kernel. The suite tests kernel function and regression, with and without stress. The LTP is not a performance benchmark, but benchmarks often are used to drive the kernel during testing.

The LTP began as 100 test programs developed by SGI. Now, through the joint efforts of SGI, IBM, OSDL, Bull, Wipro Technologies and individual Linux developers, the LTP contains over 2,500 test programs, also called test cases, and a number of automation tools. The LTP supports multiple architectures, including x86, IA32/64, PPC32/64, and 32- and 64-bit s/390.

Although other test suites and projects exist, the LTP includes an environment for defining new tests, integrating existing benchmarks and analyzing test results. The Software Testing Automation Framework (STAF/STAX) is an open-source system that allows you to plan, distribute, execute and collect test results from a large pool of multiplatform test hosts. STAF/STAX also provides a powerful GUI-monitoring application that allows you to interact with and monitor the progress of your jobs. Test-coverage visualization tools let you see how much of a test's source code is executed by the kernel.

The IBM Linux Technology Center (LTC) has played a key role in using the LTP to uncover defects in the Linux kernel. Using the LTP, the LTC has tested more than 50 new kernel versions and found more than 500 defects. As covered in Linda Scott's whitepaper (see the on-line Resources), a typical kernel test cycle uses the LTP for focus testing to isolate and validate Linux component and application stability. This includes regression testing on new kernels to ensure they meet the functionality of previous kernels. Integration testing then validates component interaction, driven by macro-benchmark workloads. Finally, reliability and stress testing validate systemic robustness with extended duration tests (96 hours to 30 days).

The remainder of this article describes how to download and run the LTP test suite using the automation tools. We also discuss some LTP tools that can be used to help improve kernel development and testing.

The Test Suite

The tests cover a wide range of kernel functions, including system calls, networking and filesystem functionality. The basic building block of the test suite is a test program that performs a sequence of actions and verifies the outcome. The test results usually are restricted to PASS or FAIL. Together, all the test programs and tools make up the LTP package.

The LTP is a GPL package and is available from SourceForge.net. A stable version of the LTP test suite source, ltp-yyyymmdd.tgz is released monthly. As of this writing, the latest version is ltp-20040405. After downloading the package, extract and install as follows:

tar zxf ltp-20040405.tgz
cd ltp-20040405
make
make install

You need root access to perform that last step and also to run the test suite. The test suite also is available in binary and source RPM format. For those of you who like living on the edge, development snapshots can be downloaded through anonymous CVS (see Resources).

Executing the Test Suite

Once installed, a number of options are available for running the LTP test suite. The most popular method is to use the runalltests.sh script, which executes about 800 of the original tests. The tests not included in runall are destructive, require monitoring or for some other reason cannot be automated. The runall script has a default behavior to run a single iteration of the test suite and produce verbose screen output. This output can be omitted with the quiet option (-q). As a simple introduction, we ignore the screen information for now and use the -l logfile_name and -p options to generate human-readable log results.

The test cases are executed by the test driver called Pan. Pan, included in the LTP package, is a lightweight driver used to run and clean up test programs. The runalltests script calls Pan to execute a set of test cases or a single test case. You can execute a set of test cases by providing runalltests with a -f scenario file. A scenario file is a simple ASCII text file that contains two columns. The first column has the name of a test case, and the second column has the command to be run. Comments start with a pound sign. For example:

# Testcase to test mmap function of the kernel
testcase1       mmap3 -l 100 -n 50

# Testcase to stress the kernel scheduler
testcase2       sched_stress.sh

The test driver uses the exit value of the test case to decide success or failure of a test. If the test case exits with a non-zero value, Pan records this as FAIL. If the test case exits with a value zero, the driver records it as PASS.

The simplest use of the test suite is to run it on your system to ensure that there are no failures:

runalltests.sh -l log -p -o output

For known failures, the LTP package includes an explanation and pointers to places for more information. Below is the partial log file from running ltp-20040506 on a 2.6.3 kernel:

Test Start Time: Mon May 17 14:20:45 2004
-----------------------------------------
Testcase                       Result     Exit Value
--------                       ------     ----------
abort01                        PASS       0
accept01                       PASS       0
access01                       PASS       0
...
rwtest01                       PASS       0
rwtest02                       PASS       0
rwtest03                       FAIL       2
rwtest04                       FAIL       2
rwtest05                       PASS       0
iogen01                        PASS       0
...
-----------------------------------------------
Total Tests: 797
Total Failures: 6
Kernel Version: 2.6.3-gcov
Machine Architecture: i686
Hostname: ltp2

In this partial log, 797 tests were run and six failed. rwtest03 and rwtest04 are I/O tests that failed due to mmap running out of resources. This problem has been resolved. The remaining failures, not shown in the log, are described below:

  • setegid01: verify that setegid does not modify the saved gid or real gid—failed because of a bug in glibc 2.3.2.

  • dio18,dio22: I/O testing—failed because of data comparison mismatch.

  • nanosleep02: verify that nanosleep will suspend and return remaining sleep time after receiving signal—failed due to lack of microsecond clock precision.

Writing test programs is fairly straightforward. The test cases are written in ANSI C and BASH and use the LTP Application Program Interfaces (APIs) provided by the LTP library libltp to report test status. Templates are provided that show you how to develop test cases using libltp. The test cases can use the interface to print results messages, break out of testing sequence and report a test status such as PASS or FAIL. Manual pages for using these APIs are provided in the test suite package and also on the LTP Web site. For more on the esoteric uses of LTP and a tutorial on developing tests that can be included in the LTP, see the Iyer and Larson papers in Resources.

______________________

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix