klogd: The Kernel Logging Dæmon

klogd reads kernel log messages, processes them and routes them to the appropriate files, sockets or users. This month we discuss memory address resolution and how to modify klogd's default behavior using command-line switches.
Command-Line Switches

-c Default console logging level. The kernel writes log messages not only to the kernel message buffer, but also to the system console (usually /dev/console). The default console level is 7, which means that messages with a value lower than 7 (that is, of higher priority) are written to the console. Often you will want to change this once klogd is running, so the console isn't constantly scrolling low-priority messages. The klogd/syslogd combination gives you far more control over your kernel messages than simply dumping them to a screen. You can specify a number n here (e.g., -c 4), and only messages with a value strictly lower than n will go to the console. Note that klogd doesn't route messages to the console itself; it merely provides this interface for changing the kernel's console logging level. Keep in mind that lower values of n mean higher-priority messages.
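The threshold rule above is easy to get backwards, so here is a small shell sketch of it. The level names and numbers come from the kernel's <linux/kernel.h>; the console_dest helper is purely illustrative, not part of klogd:

```shell
# Kernel log levels run 0 (KERN_EMERG) through 7 (KERN_DEBUG);
# a LOWER number means a HIGHER priority.
# console_dest <level> <threshold>: where a message of that level ends up
# when the console logging level is <threshold>.
console_dest() {
  if [ "$1" -lt "$2" ]; then
    echo console            # strictly below the threshold: hits the console
  else
    echo buffer             # at or above it: kernel buffer/syslogd only
  fi
}

# With the default threshold of 7, a KERN_INFO (6) message still reaches
# the console; after "klogd -c 4" it would not:
console_dest 6 7    # → console
console_dest 6 4    # → buffer
```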

-d Debugging mode. This generates lots of output on stderr. Give it a try if you're curious, although I do not recommend running this way for any length of time.

-f Log messages to file. This switch allows you to bypass the syslogd interface and log kernel messages directly to a file (e.g., -f /var/log/kernel.log). You lose all of syslogd's ability to separate messages by facility and priority, to route a message to multiple destinations, and to route to pipes, sockets and users. It has obvious value, however, if for some reason you aren't running syslogd!

-i, -I Signal the currently running klogd. We'll go over these two switches (they are distinct!) in the section on memory address resolution.

-n Do not auto-background. There are three ways you might run a dæmon: by command at the console, by startup script, or directly with the System V init model (/etc/inittab). When you run with init, you don't want the process to “fork and die” (which is how a *nix process puts itself in the background; see chapter 2.6 of W. Richard Stevens' excellent book UNIX Network Programming if none of this makes any sense) as you would in the other two cases. Generally, this need not concern you if klogd is already running on your box.

-o One-shot mode. When started with this option, klogd will read all the messages presently in the kernel log buffer, and then it will exit.

-p Paranoia mode. This changes when klogd loads kernel symbol data. We'll cover this in more detail in the section on memory address resolution.

-s Force system call mode. Normally, klogd checks at startup for the existence of the /proc/kmsg file. If it is present, klogd opens it and reads kernel messages from it. If it is not, klogd polls the kernel for messages through a system call. The /proc/kmsg interface is favored because it has lower overhead, especially when there are no kernel messages (the common case). This switch overrides that preference and forces klogd to use the system call instead.
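The startup check just described can be sketched in a few lines of shell. This is illustrative only, not klogd's actual C code, and pick_source is an invented helper name:

```shell
# Sketch of klogd's source selection: prefer /proc/kmsg if it exists,
# otherwise fall back to polling the kernel via the syslog system call.
pick_source() {
  if [ -e "$1" ]; then
    echo "read $1"
  else
    echo "poll via system call"
  fi
}

pick_source /proc/kmsg          # on most Linux systems: "read /proc/kmsg"
pick_source /nonexistent/kmsg   # forces the fallback path
```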

-k Kernel symbol file. See the section on memory address resolution.

-v Print version and exit. This document is based on klogd 1.3-3.

-x Do not resolve addresses. See the section on memory address resolution for more information.

Memory Address Resolution

(The following discussion presumes Linux running on an x86 processor. I would imagine other processors are similar, but I have not examined the code for them, so I'm not prepared to state that the following holds true for those processors.)

Let's begin by noting that real protection exceptions resulting in kernel logs are very rare events. Most protection faults occur in user-space code. User-space protection faults result in a program termination and core file dump. You can use the core file and your favorite debugger to post-mortem the application. These events hardly bother the Linux kernel, which merrily goes on handling all the other applications in the system.

The faults we are talking about here are processor exceptions that happen in kernel code. These are so rare that I have seen only five since I started using Linux in 1993. Three of them occurred when I was using the “TAMU” Linux release from Texas A&M University. We're talking pre-0.99 Linux. I think that was to be expected. The next occurred when I had a dying hard drive and my swap partition was the defective area. The fifth and last occurred when I had an overheating CPU in my laptop. Since 1994, I haven't seen one for any reason, excepting a hardware failure.

That said, they do happen. Some never-before-used combination of hardware leads to a combination of kernel code never previously run; or perhaps you are a daring soul and you are running a development kernel. Whatever the reason, sometimes good code goes bad. The good news is Linux is an open-source OS. You can fix the bug. Or if not, you can post a bug report that goes directly to the people who can fix the bug. Try that with Windows!

When a protection fault occurs, Linux dumps the processor state, including all the registers and the last several entries of the system stack. The latter is critical for finding the source of the problem. The trouble is that the raw dump consists entirely of memory addresses. Since Linux is an open-source system and many installations are custom compiled, the likelihood that these raw addresses will help anyone at a support desk figure out the problem is small indeed.

Luckily, if you built your kernel in the normal way, a file called System.map is installed with your kernel (probably in /boot). It maps kernel code and data symbols to their addresses. The klogd dæmon reads this file. That takes care of all the “compiled-in” kernel code, but since the 2.0.x kernel series, Linux has supported kernel modules, which are dynamically loaded pieces of kernel code. These can end up at any address, depending on which modules are loaded at a given moment and in what order.
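System.map holds one "address type symbol" line per kernel symbol, sorted by address, and resolving a raw fault address means finding the nearest symbol at or below it. Here is a shell sketch of that lookup against an invented three-line map (the addresses and the resolve helper are made up for illustration; klogd does this internally in C):

```shell
# A tiny, invented excerpt in System.map format: address, type, symbol.
# 'T' marks a global text (code) symbol.
map="c0100000 T _stext
c0105000 T do_page_fault
c0109000 T schedule"

# resolve <hex-address>: print the nearest symbol at or below the address,
# plus the offset into it, e.g. "do_page_fault+0x123".
resolve() {
  addr=$((0x$1))
  printf '%s\n' "$map" | while read -r a type name; do
    if [ $((0x$a)) -le "$addr" ]; then
      printf '%s+0x%x\n' "$name" $((addr - 0x$a))
    fi
  done | tail -n 1    # last qualifying line = nearest preceding symbol
}

resolve c0105123    # → do_page_fault+0x123
```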

At program start, or in response to a signal, klogd will query the kernel for a list of modules and their load addresses. Kernel modules may register individual function or identifier addresses with the kernel when they are loaded. The klogd dæmon will use this information to report addresses in a fault dump. It is important to note that module addresses from klogd can be out of date! If modules are loaded or unloaded after klogd is initialized, then these module/address resolutions will be incorrect. Your distribution may take care of this for you by providing scripted utilities to refresh klogd automatically. If it does not, then some of the switches we skipped over earlier come into play to help you keep the memory map up to date.

The -i switch tells klogd to reload the module symbols; -I tells it to reload the System.map file. The -p switch enables “paranoia” mode, which causes klogd to attempt to reload the module symbols whenever it sees the string “Oops” in the kernel message stream (protection-fault dumps contain this string). I personally consider this kludgy, and I don't use it; also, if a protection fault has occurred, the kernel may be about to halt, or the memory map may be in a corrupt state. It is available if you want it, though. The -k option lets you specify the file that contains the kernel symbol information. See the section on multiple kernels below. The -x switch tells klogd not to read kernel and module symbols at all and simply to dump the protection fault messages untranslated.
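What paranoia mode boils down to is watching the message stream for "Oops" and refreshing module symbols when it appears. The pipeline below simulates that with canned input; the real klogd does this internally in C, not via a shell pipeline, and the sample messages are invented:

```shell
# Simulated kernel message stream: on an "Oops" line, a paranoia-mode
# klogd would re-read module symbols before translating the dump.
printf '%s\n' 'eth0: link up' 'Oops: 0002' 'CPU: 0' |
while read -r line; do
  case $line in
    *Oops*) echo "RELOAD module symbols" ;;
    *)      echo "log: $line" ;;
  esac
done
```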
