The sysctl Interface

A look at the sysctl system call that gives you the ability to fine tune kernel parameters.
Using the System Call

Even though the /proc file system is a great resource, it is not always available in the kernel. Since it's not vital to system operation, there are times when you choose to leave it out of the kernel image or simply don't mount it. For example, when building an embedded system, saving 40 to 50KB can be advantageous. Also, if you are concerned about security, you may decide to hide system information by leaving /proc unmounted.

The system call interface to kernel tuning, namely sysctl, is an alternative way to peek into configurable parameters and modify them. One advantage of sysctl is that it's faster, as no fork/exec is involved (i.e., no external programs are spawned) and no directory lookup is performed. However, unless you run an ancient platform, the performance savings are irrelevant.

To use the system call in a C program, the header file sys/sysctl.h must be included; it declares the sysctl function as:

int sysctl (int *name, int nlen, void *oldval,
  size_t *oldlenp, void *newval, size_t newlen);

If your standard library is not up to date, the sysctl function will be neither prototyped in the headers nor defined in the library. I don't know exactly when the library function was first introduced, but I do know libc-5.0 does not have it, while libc-5.3 does. If you have an old library, you must invoke the system call directly, using code such as:

#include <linux/unistd.h>
#include <linux/sysctl.h>
/* now "_sysctl(struct __sysctl_args *args)"
   can be called */
_syscall1(int, _sysctl, struct __sysctl_args *,
        args);

The system call takes a single argument instead of six; the clash between the two prototypes is avoided by prepending an underscore to the name of the system call. Therefore, the system call is _sysctl and takes one argument, while the library function is sysctl and takes six arguments. The sample code introduced in this article uses the library function.
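
To make the six-versus-one difference concrete, here is a minimal sketch of how a six-argument wrapper might bundle its parameters into the single structure handed to _sysctl. The structure is a local stand-in modeled on struct __sysctl_args from linux/sysctl.h, and the names sysctl_args_sketch and pack_sysctl_args are hypothetical. The system call itself is deliberately not invoked here.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-in modeled on struct __sysctl_args (assumption: the real
 * definition lives in linux/sysctl.h and may differ between kernels). */
struct sysctl_args_sketch {
    int *name;               /* path of integers identifying the item */
    int nlen;                /* number of integers in name */
    void *oldval;            /* buffer receiving the current value, or NULL */
    size_t *oldlenp;         /* in: size of oldval; out: bytes written */
    void *newval;            /* replacement data, or NULL */
    size_t newlen;           /* size of newval in bytes */
    unsigned long unused[4]; /* reserved words */
};

/* Hypothetical helper: gather the six library-style arguments into the
 * one structure a _sysctl-style call would receive. */
struct sysctl_args_sketch
pack_sysctl_args(int *name, int nlen, void *oldval, size_t *oldlenp,
                 void *newval, size_t newlen)
{
    struct sysctl_args_sketch args;

    memset(&args, 0, sizeof(args)); /* clear the reserved words */
    args.name = name;
    args.nlen = nlen;
    args.oldval = oldval;
    args.oldlenp = oldlenp;
    args.newval = newval;
    args.newlen = newlen;
    return args;
}
```

A wrapper built this way would then pass the address of the filled structure to _sysctl and return its result.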

The six arguments of the sysctl library function have the following meaning:

  1. name points to an array of integers: each of the integer values identifies a sysctl item, either a directory or a leaf node file. The symbolic names for such values are defined in the file linux/sysctl.h.

  2. nlen states how many integer numbers are listed in the array name. To reach a particular entry you need to specify the path through the subdirectories, so you need to specify the length of this path.

  3. oldval is a pointer to a data buffer where the old value of the sysctl item must be stored. If it is NULL, the system call won't return values to user space.

  4. oldlenp points to a size_t value stating the length of the oldval buffer. The system call changes the value to reflect how much data has been written, which can be less than the buffer length.

  5. newval points to a data buffer hosting replacement data. The kernel will read this buffer to change the sysctl entry being acted upon. If it is NULL, the kernel value is not changed.

  6. newlen is the length of newval. The kernel will read no more than newlen bytes from newval.
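
The interplay of the six arguments can be tried out without touching the kernel. What follows is a user-space imitation written only from the descriptions above; it is not the kernel's implementation. The function sketch_sysctl, the identifiers SK_KERN and SK_PRINTK, the in-memory table and the choice of ENOTDIR for an unknown path are all assumptions made for this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identifiers for this sketch only; the real symbolic
 * names are defined in linux/sysctl.h. */
enum { SK_KERN = 1, SK_PRINTK = 2 };

/* In-memory stand-in for a four-integer sysctl entry. */
static int sk_table[4] = { 6, 4, 1, 7 };

/* Imitation of the six-argument call. Returns 0 on success, or -1 with
 * errno set when the path does not name the one entry known here. */
int sketch_sysctl(int *name, int nlen, void *oldval, size_t *oldlenp,
                  void *newval, size_t newlen)
{
    /* name/nlen: the whole path must be given, directory then leaf. */
    if (nlen != 2 || name[0] != SK_KERN || name[1] != SK_PRINTK) {
        errno = ENOTDIR; /* error code chosen for the sketch */
        return -1;
    }

    /* oldval/oldlenp: copy out at most *oldlenp bytes and report how
     * many were written. A NULL oldval returns nothing. */
    if (oldval != NULL && oldlenp != NULL) {
        size_t n = *oldlenp < sizeof(sk_table) ? *oldlenp : sizeof(sk_table);

        memcpy(oldval, sk_table, n);
        *oldlenp = n;
    }

    /* newval/newlen: read no more than newlen bytes. A NULL newval
     * leaves the entry unchanged. */
    if (newval != NULL && newlen > 0) {
        size_t n = newlen < sizeof(sk_table) ? newlen : sizeof(sk_table);

        memcpy(sk_table, newval, n);
    }
    return 0;
}
```

Calling sketch_sysctl with a 32-byte oldval buffer leaves *oldlenp set to the size of four integers, and a later call with newlen equal to sizeof(int) replaces only the first of the four values.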

Now, let's write some C code to access the four parameters contained in /proc/sys/kernel/printk. The numeric name of the file is KERN_PRINTK, within the directory CTL_KERN/ (both symbols are defined in linux/sysctl.h). The code shown in Listing 1, pkparms.c, is the complete program to access these values.

Changing sysctl values is similar to reading them—just use newval and newlen. A program similar to pkparms.c can be used to change the console log level, the first number in kernel/printk. The program is called setlevel.c, and the code at its core looks like:

int newval[1];
size_t newlen = sizeof(newval);
/* assign newval[0] */
error = sysctl (name, namelen, NULL /* oldval */,
         NULL /* oldlenp */, newval, newlen);

The program overwrites only the first sizeof(int) bytes of the kernel entry, which is exactly what we want.
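
The "assign newval[0]" step is left open in the fragment above. One way to fill it from a command-line argument is sketched below. The helper parse_level is hypothetical and is not taken from setlevel.c. It only checks that the text is a well-formed integer that fits in an int, and it makes no assumption about which levels the kernel accepts.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert an argument such as "8" into the integer
 * to be stored in newval[0]. Returns 0 on success, -1 on failure, and
 * leaves *level untouched when it fails. */
int parse_level(const char *text, int *level)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return -1; /* empty string or trailing junk */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -1; /* does not fit in an int */
    *level = (int)value;
    return 0;
}
```

A program along the lines of setlevel.c could call parse_level(argv[1], &newval[0]) and refuse to invoke sysctl when the helper returns -1.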

Please remember that the printk parameters are not exported to sysctl in version 2.0 of the kernel. The programs won't compile under 2.0 due to the missing KERN_PRINTK symbol; also, if you compile either of them against later versions and then run under 2.0, you'll get an error when invoking sysctl.

The source files for pkparms.c, setlevel.c and hname.c (which will be introduced in a while) are in the 2365.tgz file.

A simple run of the two programs introduced above looks like the following:

# ./pkparms
len is 16 bytes
6       4       1       7
# cat /proc/sys/kernel/printk
6       4       1       7
# ./setlevel 8
# ./pkparms
len is 16 bytes
8       4       1       7

If you run kernel 2.0, don't despair—the files acting on kernel/printk are just samples, and the same code can be used to access any sysctl item available in 2.0 kernels with minimal modifications.

On the same ftp site you'll also find hname.c, a bare-bones hostname command based on sysctl. The source works with the 2.0 kernels and demonstrates how to invoke the system call with no library support, since my Linux-2.0 runs on a libc-5.0-based PC.

