The Linux Signals Handling Model

Communication is the key to healthy relationships between threads and the kernel; these are the signals they use to communicate.
Signal Description and Default Action

The disposition of a signal can be changed from its default, and a process can arrange to catch a signal and invoke a signal-handling routine of its own or ignore a signal that may not have a default disposition of Ignore. The only exceptions are SIGKILL and SIGSTOP; their default dispositions cannot be changed. The interfaces for defining and changing signal disposition are the signal and sigset libraries and the sigaction system call. Signals can also be blocked, which means the process has temporarily prevented delivery of a signal. Generation of a signal that has been blocked will result in the signal remaining as pending to the process until it is explicitly unblocked or the disposition is changed to Ignore. The sigprocmask system call will set or get a process's signal mask, the bit array inspected by the kernel to determine if a signal is blocked or not. thr_setsigmask and pthread_sigmask are the equivalent interfaces for setting and retrieving the signal mask at the user-threads level.

I mentioned earlier that a signal may originate from several different places for a variety of different reasons. The first three signals listed in Table 1—SIGHUP, SIGINT and SIGQUIT—are generated by a keyboard entry from the controlling terminal (SIGINT and SIGHUP) or if the control terminal becomes disconnected (SIGHUP—use of the nohup command makes processes “immune” from hangups by setting the disposition of SIGHUP to Ignore).

Other terminal I/O-related signals include SIGSTOP, SIGTTIN, SIGTTOU and SIGTSTP. For the signals originating from a keyboard command, the actual key sequence that generates the signals, usually CTRL-C, is defined within the parameters of the terminal session, typically via stty(1) which results in a SIGINT being sent to a process, and has a default disposition of Exit.

User tasks in Linux, created via explicit calls to either thr_create or pthread_create, all have their own signal masks. Linux threads call clone with CLONE_SIGHAND; this shares all signal handlers between threads via sharing the current->sig pointer. Delivered signals are unique to a thread.

In some operating systems, such as Solaris 7, signals generated as a result of a trap (SIGFPE, SIGILL, etc.) are sent to the thread that caused the trap. Asynchronous signals are delivered to the first thread found not blocking the signal. In Linux, it is almost exactly the same. Synchronous signals happening in the context of a given thread are delivered to that thread.

Asynchronous in-kernel signals (e.g., asynchronous network I/O) is delivered to the thread that generated the asynchronous I/O. Explicit user-generated signals get delivered to the right thread as well. However, if CLONE_PID is used, all places that use the PID to deliver a signal will behave in a “weird” way; the signal gets randomly delivered to the first thread in the pidhash. Linux threads don't use CLONE_PID, so there is no such problem if you are using the pthreads.h thread API.

When a signal is sent to a user task, for example, when a user-space program accesses an illegal page, the following happens:

  • page_fault (entry.S) in the low-level page-fault handler.

  • do_page_fault (fault.c) fetches i386-specific parameters of the fault and does basic validation of the memory range involved.

  • handle_mm_fault (memory.c) is generic MM (memory management) code (i386-independent), which gets called only if the memory range (VMA) exists. The MM reads the page table entry and uses the VMA to find out whether the memory access is legal or not.

Listing 1

The case we are interested in now is when the access was illegal (e.g., a write was attempted to a read-only mapping): handle_mm_fault returns 0 to do_page_fault in this case. As you can see from Listing 1, locking of the MM is very finely grained (and it better be this way); the mm->mmap_sem, per-MM semaphore, is used (which typically varies from process to process).

force_sig(SIGBUS,current) is used to “force” the SIGBUS signal on the faulting task. force_sig delivers the signal even if the process has attempted to ignore SIGBUS.

force_sig fills out the signal event structure and queues it into the process's signal queue (current->sigqueue and current->sigqueue_tail). The signal queue holds an indefinite number of queued signals. The semantics of “classic” signals are that follow-up signals are ignored—this is emulated in the signal code kernel/signal.c. “Generic” (or RT) signals can be queued arbitrarily; there are reasonable limits to the length of the signal queue.

The signal is queued, and current-signal is updated. Now comes the tricky part: the kernel returns to user space. Return to user space happens from do_page_fault=>page_fault (entry.S), then the low-level exit code in entry.S is executed in this order:

page_fault=>(called do_page_fault)=>error_code=>
ret_from_exception=>(checks if return to user space)=>
ret_with_reschedule=>(sees that current->signal is nonzero)
=>calls do_signal

Next, do_signal unqueues the signal to be executed. In this case, it's SIGBUS.

Then handle_signal is called with the “unqueued” signal (which can potentially hold extra event information in case of real-time signals/messages).

Next called is setup_frame, where all user-space registers are saved and the kernel stack frame return address is modified to point to the handler of the installed signal handler. A small sequence of code jumper is put on the user stack (obviously, the code first makes sure the user stack is valid) which will return us to kernel space once the signal handler has finished. (See Listing 2.)

Listing 2

Careful: this area is one of the least-understood pieces of the Linux kernel, and for good reason; it is really tough code to read and follow.

The popl %eax ; movl $,%eax ; int $0x80 x86 assembly sequence calls sys_sigret, which later on will restore the kernel stack frame return address to point to the original (faulting) user address.

What is all this magic good for? Well, first the kernel has to guarantee that signal handlers get called properly and the original state is restored. The kernel also has to deal with binary compatibility issues. Linux guarantees that on the IA-32 (Intel x86) architecture, we can run any iBC86-compliant binary code. Speed is also an issue.

Listing 3

Finally, we return to entry.S again, but current-signal is already cleared, so we do not execute do_signal but jump to restore_all as shown in Listing 3. restore.all executes the “iret” that brings us into user space. Suddenly, we are magically executing the signal handler.

Did you get lost yet? No? Here is some more magic. Once the signal handler finishes (it does an assembly “ret” like all well-behaving functions), it will execute the small jumper function we have set up on the user stack. Again we return to the kernel, but now we execute the sys_sigreturn system call, which lives in arch/i386/kernel/signal.c as well. It essentially executes the following code section:

if (restore_sigcontext(regs, &frame->sc, &eax))
     goto badframe;
return eax;

The above code restores the exact user-register contents into the kernel stack frame (including the return address and flags register) and executes a normal ret_from_syscall, bringing us back to the original faulting code. Hopefully the SIGBUS handler has fixed the problem of why we were faulting.

Now, while reading the above description, you might think this is awfully complex and slow. It actually isn't; lmbench reveals that Linux has the fastest signal-handler installation and execution performance by far of any UNIX running:

moon:~/l> ./lat_sig install
Signal handler installation: 1.688 microseconds
moon:~/l> ./lat_sig catch
Signal handler overhead: 3.186 microseconds

Best of all, it scales linearly on SMP:

moon:~/l> ./lat_sig catch & ./lat_sig catch &
Signal handler overhead: 3.264 microseconds
Signal handler overhead: 3.248 microseconds
moon:~/l> ./lat_sig install & ./lat_sig install &
Signal handler installation: 1.721 microseconds
Signal handler installation: 1.689 microseconds

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

I have One Question about the

Zafar's picture

I have One Question about the Usage of SIGNAL() and SIGACTION() so which function shall I Use. Is the use of traditional SIGNAL() is a disadvantage.

"Asynchronous signals are

Ling's picture

"Asynchronous signals are delivered to the first thread found not blocking the signal." The only solution I found so far is to create a thread to process all signals, and mask the rest of the threads.

This makes the signal pretty useless on Android, as we don't have control of how threads are created, and the system might use other signals.

I have read some suggestion and working solution in the kernel to deliver the signal to the originated thread, instead of random thread.

I wonder if the latest kernel is supporting this, or if anyone has solution for this.

Thanks!

Signal Handling

Anonymous's picture

Is there a pictorial representation of the signal handling process in Linux to help visualize the switch between the user space and kernel space?

What does 'reliable' mean? What happens if a signal is unreliabl

Ming's picture

"Implementation of correct and reliable signals has been in place for many years now ... reliable signals require the use of the newer sigaction interface" - What is the definition of 'reliable signals'? Are there some signals reliable and some not, in the same implementation? Or are there reliable implementations in which all signals are reliable, and unreliable implementations where all signals are unreliable?
And I have a question that is not answered in all the papers and books I have read so far: what happens if a signal arrives when a previous one (of the same type) is being handled in a signal catching function?

RE: What does 'reliable' mean?

Anonymous's picture

I trust reliable is referred to in the sense that some functions makes the entire signaling handling reliable, e.g. 'sigaction' may block signals and set a new signal handler in one atomic swoop, as to eliminate the potential race condition.
As to your last question, if the signal is not blocked and a new instance is generated in the midst of the processing of the last one, then the signal handler must be re-entrant. IMHO, the best method is to block the signal on entering the signal handler. It will be automatically restored on exit. On old UNIX versions, once a signal handler is invoked, the signal handling is reverted to default. If a new signal instance comes along, it will be treated the default way, most often terminating the process.

Table 1 HTML lacks tag

Anonymous's picture

Table 1 HTMl page lacks start tag for the table, making the page unrenderable, even if the article is dated in year 2000.

start tag.

Keith Daniels's picture

Thanks for catching that. I will fix it ASAP

Webmaster - Linux Journal

"I have always wished that my computer would be as easy to use as my telephone.
My wish has come true. I no longer know how to use my telephone."
-- Bjarne Stroustrup

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix