The sysctl Interface

A look at the sysctl system call, which gives you the ability to fine-tune kernel parameters.

The sysctl system call is an interesting feature of the Linux kernel, and a rather unusual one in the Unix world. It exports the ability to fine-tune kernel parameters and is tightly bound to the /proc file system, a simpler, file-based interface that can perform the same tasks as the system call. sysctl appeared in kernel 1.3.57 and has been fully supported ever since. This article explains how to use sysctl with any kernel between 2.0.0 and 2.1.35.

When running Unix kernels, system administrators often need to fine-tune some low-level features according to their specific needs. Usually, system tailoring requires rebuilding the kernel image and rebooting the computer. These tasks are lengthy and take good skills and a little luck to complete successfully. Linux developers diverged from this approach and chose to implement variable parameters in place of hardwired constants; run-time configuration can be performed by using the sysctl system call or, more easily, by exploiting the /proc file system. The internals of sysctl are designed not only to read and modify configuration parameters, but also to support a dynamic set of such variables. In other words, a module writer can insert new entries in the sysctl tree and allow run-time configuration of driver features.

The /proc Interface to System Control

Most Linux users are familiar with the /proc file system. In short, the file system can be considered a gateway to kernel internals: its files are entry points to certain kernel information. Such information is usually exchanged in textual form to ease interactive use, although the exchange can involve binary data when required. The typical example of a binary /proc file is /proc/kcore, a core file that represents the current kernel. Thus, you can execute the command:

gdb /usr/src/linux/vmlinux /proc/kcore

and peek into your running kernel. Naturally, gdb on /proc/kcore gives much better results if vmlinux has been compiled using the -g compiler option.

Most of the /proc files are read-only: writing to them has no effect. This applies, for instance, to /proc/interrupts, /proc/ioports, /proc/net/route and all the other information nodes. The directory /proc/sys, on the other hand, behaves differently; it is the root of a file tree related to system control. Each subdirectory in /proc/sys deals with a kernel subsystem like net/ and vm/, while the kernel/ subdirectory is special as it includes kernel-wide parameters, like the file kernel/hostname.
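
One way to see which entries of the control tree accept new values is to look at their permission bits. The following is a minimal sketch, not part of the original article: list_writable is a hypothetical helper name, and it assumes the owner-write bit of a file under /proc/sys reflects whether the parameter can be changed.

```shell
# list_writable is a hypothetical helper (not a standard tool).
# It prints every regular file below DIR whose mode has the owner-write
# bit set. It inspects mode bits only; it does not try to write anything.
list_writable() {
    dir=${1:-/proc/sys}
    # -perm -200 matches files that have at least the owner-write bit.
    find "$dir" -type f -perm -200 2>/dev/null | sort
}
```

Calling list_writable /proc/sys/kernel would then list the tunable files of the kernel/ subdirectory, while read-only items such as kernel/ostype should not appear.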

Each sysctl file includes numeric or string values—sometimes a single value, sometimes an array of them. For example, if you go to the /proc/sys directory and give the command:

grep . kernel/*

kernel 2.1.32 returns data similar to the following:

kernel/ctrl-alt-del:0
kernel/domainname:systemy.it
kernel/file-max:1024
kernel/file-nr:128
kernel/hostname:morgana
kernel/inode-max:3072
kernel/inode-nr:384     263
kernel/osrelease:2.1.32
kernel/ostype:Linux
kernel/panic:0
kernel/printk:6   4    1   7
kernel/securelevel:0
kernel/version:#9 Mon Apr 7 23:08:18 MET DST 1997
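
The grep command above covers a single directory. As a rough sketch of the same idea applied to a whole subtree, the hypothetical helper below (dump_sysctl_tree is an invented name, not an existing command) walks a directory and prints each readable file in the same name:value form. The root directory is a parameter so that the function can also be tried on an ordinary directory.

```shell
# dump_sysctl_tree is a hypothetical helper, shown only as an illustration.
# It prints "relative/path:value" for every readable file below ROOT.
dump_sysctl_tree() {
    root=${1:-/proc/sys}
    find "$root" -type f 2>/dev/null | sort | while IFS= read -r f; do
        # Some entries may be unreadable for the current user; skip those.
        if value=$(cat "$f" 2>/dev/null); then
            # Strip the leading "ROOT/" so names look like kernel/hostname.
            printf '%s:%s\n' "${f#"$root"/}" "$value"
        fi
    done
}
```

For instance, dump_sysctl_tree /proc/sys/kernel prints one line per kernel-wide parameter, with the path given relative to that directory.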

It's worth stressing that reading /proc items with less doesn't work, because they appear as zero-length files to the stat system call, and less checks the attributes of the file before reading it. The inaccuracy of stat is a feature of /proc, rather than a bug: it saves both human resources (less code to write) and kernel size (less code to carry around). stat information is irrelevant for most uses of these files, as cat, grep and all the other tools work fine. If you really need to use less to look at the contents of a /proc file, you can resort to:

cat
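
The mismatch between what stat reports and what a read returns can be checked directly. This is a small sketch under stated assumptions: compare_sizes is a hypothetical helper name, the size is taken from the fifth column of ls -ln, and the byte count comes from actually reading the file through cat.

```shell
# compare_sizes is a hypothetical helper for illustration only.
# It prints the size reported by the file's attributes next to the
# number of bytes obtained by actually reading the file.
compare_sizes() {
    file=$1
    # Fifth field of "ls -ln" is the size taken from the stat information.
    stat_size=$(ls -ln "$file" | awk '{print $5}')
    # Reading through cat ignores the reported size and counts real bytes.
    read_size=$(cat "$file" | wc -c | tr -d ' ')
    printf 'stat=%s read=%s\n' "$stat_size" "$read_size"
}
```

On an ordinary file the two numbers agree. On an entry such as /proc/sys/kernel/ostype, the text above suggests the first number is zero while the second is not.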

If you want to change system parameters, all you need to do is write the new values to the correct file in /proc/sys. If the file contains an array of values, they will be overwritten in order. Let's look at the kernel/printk file as an example. This file was first introduced in kernel version 2.1.32. The four numbers in /proc/sys/kernel/printk control the “verbosity” level of the printk kernel function. The first number in the array is console_loglevel: kernel messages whose priority value is less than the specified value will be printed to the system console (i.e., the active virtual console, unless you've changed it). This parameter doesn't affect the operation of klogd, which receives all the messages in any case. The following commands show how to change the log level:

# cat kernel/printk
6       4       1       7
# echo 8 > kernel/printk
# cat kernel/printk
8       4       1       7

A level of 8 lets through debug messages (priority 7), which are not printed on the console by default. The example session shown above changes the default behaviour so that every message, including the debug ones, is printed.
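
The "overwritten in order" rule seen in the session above can be modelled in user space. The sketch below is not kernel code: overwrite_in_order is a hypothetical helper that replaces the leading fields of the current value with the new ones and keeps the rest. It assumes that values beyond the length of the array are ignored.

```shell
# overwrite_in_order is a hypothetical user-space model of the behaviour
# described in the text; it does not touch /proc at all.
# $1 = current contents of the file, $2 = values being written.
overwrite_in_order() {
    current=$1
    new=$2
    printf '%s\n%s\n' "$current" "$new" | awk '
        # First line: remember the existing array.
        NR == 1 { n = NF; for (i = 1; i <= NF; i++) v[i] = $i }
        # Second line: overwrite from the first position onward.
        # Assumption: extra values past the end of the array are dropped.
        NR == 2 { for (i = 1; i <= NF && i <= n; i++) v[i] = $i }
        END { for (i = 1; i <= n; i++) printf "%s%s", v[i], (i < n ? " " : "\n") }'
}
```

With the values from the session, overwrite_in_order "6 4 1 7" "8" yields 8 4 1 7, matching what cat kernel/printk shows after the write.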

Similarly, you can change the host name by writing the new value to /proc/sys/kernel/hostname, a useful feature if the hostname command is not available.
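
Writing a value and reading it back can be wrapped in a small function. This is only a sketch: set_sysctl is a hypothetical helper name, and the tree root is passed as an argument so the function can be tried on a scratch directory before being pointed at /proc/sys, where writing normally requires root privileges.

```shell
# set_sysctl is a hypothetical helper, not an existing command.
# $1 = root of the tree (e.g. /proc/sys), $2 = entry name, $3 = new value.
set_sysctl() {
    root=$1
    name=$2
    value=$3
    file=$root/$name
    # Refuse early if the entry is missing or not writable by this user.
    if [ ! -w "$file" ]; then
        echo "cannot write $file" >&2
        return 1
    fi
    # Write the new value, then read it back to confirm what is stored.
    printf '%s\n' "$value" > "$file" && cat "$file"
}
```

As root, set_sysctl /proc/sys kernel/hostname newname would be one way to change the host name; newname is a placeholder value.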
