The Linux Signals Handling Model
The disposition of a signal can be changed from its default, and a process can arrange to catch a signal and invoke a signal-handling routine of its own or ignore a signal that may not have a default disposition of Ignore. The only exceptions are SIGKILL and SIGSTOP; their default dispositions cannot be changed. The interfaces for defining and changing signal disposition are the signal and sigset libraries and the sigaction system call. Signals can also be blocked, which means the process has temporarily prevented delivery of a signal. Generation of a signal that has been blocked will result in the signal remaining as pending to the process until it is explicitly unblocked or the disposition is changed to Ignore. The sigprocmask system call will set or get a process's signal mask, the bit array inspected by the kernel to determine if a signal is blocked or not. thr_setsigmask and pthread_sigmask are the equivalent interfaces for setting and retrieving the signal mask at the user-threads level.
I mentioned earlier that a signal may originate from several different places for a variety of different reasons. The first three signals listed in Table 1—SIGHUP, SIGINT and SIGQUIT—are generated by a keyboard entry from the controlling terminal (SIGINT and SIGHUP) or if the control terminal becomes disconnected (SIGHUP—use of the nohup command makes processes “immune” from hangups by setting the disposition of SIGHUP to Ignore).
Other terminal I/O-related signals include SIGSTOP, SIGTTIN, SIGTTOU and SIGTSTP. For the signals originating from a keyboard command, the actual key sequence that generates the signals, usually CTRL-C, is defined within the parameters of the terminal session, typically via stty(1) which results in a SIGINT being sent to a process, and has a default disposition of Exit.
User tasks in Linux, created via explicit calls to either thr_create or pthread_create, all have their own signal masks. Linux threads call clone with CLONE_SIGHAND; this shares all signal handlers between threads via sharing the current->sig pointer. Delivered signals are unique to a thread.
In some operating systems, such as Solaris 7, signals generated as a result of a trap (SIGFPE, SIGILL, etc.) are sent to the thread that caused the trap. Asynchronous signals are delivered to the first thread found not blocking the signal. In Linux, it is almost exactly the same. Synchronous signals happening in the context of a given thread are delivered to that thread.
Asynchronous in-kernel signals (e.g., asynchronous network I/O) is delivered to the thread that generated the asynchronous I/O. Explicit user-generated signals get delivered to the right thread as well. However, if CLONE_PID is used, all places that use the PID to deliver a signal will behave in a “weird” way; the signal gets randomly delivered to the first thread in the pidhash. Linux threads don't use CLONE_PID, so there is no such problem if you are using the pthreads.h thread API.
When a signal is sent to a user task, for example, when a user-space program accesses an illegal page, the following happens:
page_fault (entry.S) in the low-level page-fault handler.
do_page_fault (fault.c) fetches i386-specific parameters of the fault and does basic validation of the memory range involved.
handle_mm_fault (memory.c) is generic MM (memory management) code (i386-independent), which gets called only if the memory range (VMA) exists. The MM reads the page table entry and uses the VMA to find out whether the memory access is legal or not.
The case we are interested in now is when the access was illegal (e.g., a write was attempted to a read-only mapping): handle_mm_fault returns 0 to do_page_fault in this case. As you can see from Listing 1, locking of the MM is very finely grained (and it better be this way); the mm->mmap_sem, per-MM semaphore, is used (which typically varies from process to process).
force_sig(SIGBUS,current) is used to “force” the SIGBUS signal on the faulting task. force_sig delivers the signal even if the process has attempted to ignore SIGBUS.
force_sig fills out the signal event structure and queues it into the process's signal queue (current->sigqueue and current->sigqueue_tail). The signal queue holds an indefinite number of queued signals. The semantics of “classic” signals are that follow-up signals are ignored—this is emulated in the signal code kernel/signal.c. “Generic” (or RT) signals can be queued arbitrarily; there are reasonable limits to the length of the signal queue.
The signal is queued, and current-signal is updated. Now comes the tricky part: the kernel returns to user space. Return to user space happens from do_page_fault=>page_fault (entry.S), then the low-level exit code in entry.S is executed in this order:
page_fault=>(called do_page_fault)=>error_code=> ret_from_exception=>(checks if return to user space)=> ret_with_reschedule=>(sees that current->signal is nonzero) =>calls do_signal
Next, do_signal unqueues the signal to be executed. In this case, it's SIGBUS.
Then handle_signal is called with the “unqueued” signal (which can potentially hold extra event information in case of real-time signals/messages).
Next called is setup_frame, where all user-space registers are saved and the kernel stack frame return address is modified to point to the handler of the installed signal handler. A small sequence of code jumper is put on the user stack (obviously, the code first makes sure the user stack is valid) which will return us to kernel space once the signal handler has finished. (See Listing 2.)
Careful: this area is one of the least-understood pieces of the Linux kernel, and for good reason; it is really tough code to read and follow.
The popl %eax ; movl $,%eax ; int $0x80 x86 assembly sequence calls sys_sigret, which later on will restore the kernel stack frame return address to point to the original (faulting) user address.
What is all this magic good for? Well, first the kernel has to guarantee that signal handlers get called properly and the original state is restored. The kernel also has to deal with binary compatibility issues. Linux guarantees that on the IA-32 (Intel x86) architecture, we can run any iBC86-compliant binary code. Speed is also an issue.
Finally, we return to entry.S again, but current-signal is already cleared, so we do not execute do_signal but jump to restore_all as shown in Listing 3. restore.all executes the “iret” that brings us into user space. Suddenly, we are magically executing the signal handler.
Did you get lost yet? No? Here is some more magic. Once the signal handler finishes (it does an assembly “ret” like all well-behaving functions), it will execute the small jumper function we have set up on the user stack. Again we return to the kernel, but now we execute the sys_sigreturn system call, which lives in arch/i386/kernel/signal.c as well. It essentially executes the following code section:
if (restore_sigcontext(regs, &frame->sc, &eax)) goto badframe; return eax;
The above code restores the exact user-register contents into the kernel stack frame (including the return address and flags register) and executes a normal ret_from_syscall, bringing us back to the original faulting code. Hopefully the SIGBUS handler has fixed the problem of why we were faulting.
Now, while reading the above description, you might think this is awfully complex and slow. It actually isn't; lmbench reveals that Linux has the fastest signal-handler installation and execution performance by far of any UNIX running:
moon:~/l> ./lat_sig install Signal handler installation: 1.688 microseconds moon:~/l> ./lat_sig catch Signal handler overhead: 3.186 microseconds
Best of all, it scales linearly on SMP:
moon:~/l> ./lat_sig catch & ./lat_sig catch & Signal handler overhead: 3.264 microseconds Signal handler overhead: 3.248 microseconds moon:~/l> ./lat_sig install & ./lat_sig install & Signal handler installation: 1.721 microseconds Signal handler installation: 1.689 microseconds
Practical Task Scheduling Deployment
July 20, 2016 12:00 pm CDT
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality allows UNIX system administrators to seem to always have the right tool for the job.
Cron traditionally has been considered another such a tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to consider upgrading your job scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers.Register Now!
- Google's SwiftShader Released
- SUSE LLC's SUSE Manager
- My +1 Sword of Productivity
- Interview with Patrick Volkerding
- Managing Linux Using Puppet
- Murat Yener and Onur Dundar's Expert Android Studio (Wrox)
- Non-Linux FOSS: Caffeine!
- SuperTuxKart 0.9.2 Released
- Tech Tip: Really Simple HTTP Server with Python
- Parsing an RSS News Feed with a Bash Script
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide