Linux System Calls
This article aims to give the reader, either a kernel novice or a seasoned programmer, a better understanding of the dynamics of system calls in Linux. Wherever code sections are mentioned, I refer to the 2.3.52 (soon to be 2.4) series of kernels unless otherwise noted.
The most widespread CPU architecture is the IA32, a.k.a. x86, which is the architecture of the 386, 486, the Pentiums I, Pro, II and III, AMD's competing K6 and Athlon lines, plus CPUs from others such as VIA/Cyrix and Integrated Device Technologies. Because it is the most widespread, it will be taken as the illustrative example here. First, I will cover the mechanisms provided by the IA32 type of CPU for handling system calls, and then show how Linux uses those mechanisms. To review a few broad terms:
A kernel is the operating system software running in protected mode and having access to the hardware's privileged registers. The kernel is not a separate process running on the system. It is the guts of the operating system, which controls the scheduling of processes to achieve multitasking, and provides a set of routines, constantly in memory, to which every user-space process has access.
Some operating systems employ a microkernel architecture, wherein the kernel proper provides only a minimal set of services, while device drivers and other code run outside it, are loaded and executed on demand, and are not necessarily always in memory.
A monolithic architecture is more common among UNIX implementations; it is the design employed by classic designs such as BSD.
The Linux kernel is mostly monolithic: all device drivers are part of the kernel proper. Unlike those of classic BSD, however, a Linux kernel's device drivers can be “loadable”, i.e., they can be loaded into and unloaded from memory through user commands.
Basically, multitasking is accomplished in this way: the kernel switches control between processes rapidly, using the clock interrupt (and other means) to trigger a switch from one process to another. When a hardware device issues an interrupt, the interrupt handler is found within the kernel. When a process takes an action that requires it to wait for results, the kernel steps in and puts the process into an appropriate sleeping or waiting state and schedules another process in its place.
Besides multitasking, the kernel also contains the routines which implement the interface between user programs and hardware devices, virtual memory, file management and many other aspects of the system.
Kernel routines to achieve all of the above can be called from user-space code in a number of ways. One direct method to utilize the kernel is for a process to execute a system call. There are 116 system calls; documentation for these can be found in section 2 of the man pages.
A system call is a request by a running task to the kernel to provide some sort of service on its behalf. In general, the kernel services invoked by system calls form an abstraction layer between hardware and user-space programs, allowing a programmer to implement an operating environment without tailoring his programs to one brand or one specific combination of system hardware components. System calls serve the same generalizing function across programming languages. For example, the read system call reads data from a file descriptor. To the programmer, it looks like any other C function, but the code for read actually resides in the kernel.
The IA32 CPU recognizes two classes of events needing special processor attention: interrupts and exceptions. Both cause a forced context switch to a new procedure or task.
Interrupts can occur at unexpected times during the execution of a program; they are signals from hardware that processor attention is needed. When a hardware device issues an interrupt, the interrupt handler is found within the kernel. Next month, we will discuss interrupts in more detail.
Two sources of interrupts are recognized by the IA32: maskable interrupts, for which vectors are determined by the hardware, and non-maskable interrupts (NMIs).
Exceptions are either processor-detected or issued (thrown) from software. When a procedure or method encounters an abnormal condition (an exception condition) it can't handle, it may throw an exception. Exceptions of either type are caught by handler routines (_exception handlers_) positioned along the thread's procedure or method invocation stack. The handler may be in the calling procedure or method or, if that doesn't include code to handle the exception condition, in its caller, and so on up the stack. If one of the threads of your program throws an exception that isn't caught by any procedure (or method), then that thread will terminate.
An exception tells a calling procedure that an abnormal (though not necessarily rare) condition has occurred, e.g., a method was invoked with an invalid argument. When you throw an exception, you are performing a kind of structured “go to” from the place in your program where the abnormal condition was detected to a place where it can be handled. Exception handlers should be stationed at the program-module level that matches how general a range of errors each can handle, so that as few handlers as possible cover as wide a variety of exceptions as your programs will encounter in the field.
Comments
system call
Hi,
Just wanted to know: if I want to printk the parameters used in a system call, how do I go about it? I am trying to access %eax, %ebx and the other registers used to store the parameters and printk them, but I am not sure how to go about it. I am writing a loadable kernel module to do so. Any idea how to do it?
Thanks in Advance
fork() implementation
Can anyone tell me where the assembly routines implementing system calls can be found?
Very good article overall.
Very good article overall. But I don't know what software exceptions have to do with interrupts and hardware exceptions; I am pretty sure they are totally unrelated. I think it would be better to focus on hardware exceptions, like arithmetic ones, whose handlers are placed in the vector table.
Much of the content in this a
Much of the content in this article was taken directly from "How System Calls Work on Linux/i86" by Michael K. Johnson and Stanley Scalsky which is located at this URL:
http://www.tldp.org/LDP/khg/HyperNews/get/syscall/syscall86.html
Copyright (C) 1993, 1996 Michael K. Johnson, johnsonm@redhat.com.
Copyright (C) 1993 Stanley Scalsky
No mention of credit was given by Moshe Bar to the original authors. At the very least this is plagiarism and a blatant copyright violation.
You must not have read the original article, or did you?
The two articles are not in the least identical, and the article you refer to is not nearly as exhaustive as this one. I would say that Moshe did a good job here, both in taking the information made available by the kernel hackers and in providing even more in-depth information on how it is actually done.
who cares....
who cares....