System Calls

Functions in the Linux kernel can be called by user programs. However, it takes a bit of preparation. In this column, Michael guides you through the process step by step, explaining why as well as what.
Invoking Your System Call

Now you can call your new function from user code, but how? You can't simply declare extern int sys_name(int arg); and link. Instead, you have to #include <unistd.h> and use the appropriate _syscallX() macro, where X is the number of arguments the system call takes. The _syscallX() macros are actually defined in include/asm/unistd.h, which gets included by <unistd.h> automatically.

If your system call is declared as

asmlinkage int sys_name(void);

the _syscall0() invocation is quite easy:

_syscall0(int, name)

(notice the leading underscore). This gets converted by the C preprocessor into

int name(void)
{
        long __res;
        __asm__ volatile ("int $0x80"
                : "=a" (__res)
                : "0" (__NR_name));
        if (__res >= 0)
                return (int) __res;
        errno = -__res;
        return -1;
}

on Linux/i86. Because it uses assembly, it will be different on other architectures. Fortunately, it doesn't really matter. The important point is that it creates a function called name which generates an interrupt (remember the “white lie” about interrupts? System calls are interrupts, too), and that interrupt invokes the system call. The function then returns the result if it is non-negative. If the result is negative (has the high-order bit set), the function sets errno to the corresponding positive error number and returns -1.
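To make that return convention concrete, here is a minimal sketch in plain C of what the last four lines of the expansion do. The helper name decode_syscall_result is hypothetical (it is not part of any kernel or library header), and it takes the raw value that would come back from the kernel as an ordinary argument, so it can be compiled and tried on any architecture without assembly.

```c
#include <errno.h>

/*
 * Hypothetical helper, for illustration only: applies the same
 * convention as the expanded wrapper to a raw kernel return value.
 */
int decode_syscall_result(long res)
{
        if (res >= 0)
                return (int) res;   /* success: pass the value through */
        errno = (int) -res;         /* failure: the kernel returned -errno */
        return -1;
}
```

Passing 42 gives back 42 and leaves errno alone; passing -ENOENT gives back -1 with errno set to ENOENT.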

If your function has two arguments:

asmlinkage int sys_name(int num, struct foo *bar);

you would instead use this:

_syscall2(int, name, int, num, struct foo *, bar)

which would expand to:

int name(int num, struct foo *bar)
{
        long __res;
        __asm__ volatile ("int $0x80"
                : "=a" (__res)
                : "0" (__NR_name),
                  "b" ((long)(num)), "c" ((long)(bar)));
        if (__res >= 0)
                return (int) __res;
        errno = -__res;
        return -1;
}

Notice the unusual way of specifying the arguments to the macro, where the return type and the name of the function are followed by separate arguments for the type and name of each of the system call's arguments. Figuring out how to specify system calls with 1, 3, 4, or 5 arguments is left as an exercise for the reader.
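The type-then-name pairing is easier to see in a macro you can read in full. The sketch below defines two hypothetical macros, MY_SYSCALL0 and MY_SYSCALL2, which take their arguments in the same order as _syscall0 and _syscall2. They are not the kernel's definitions: as an assumption made for portability, they hand the work to the C library's syscall() function instead of inline assembly, which also means the -1/errno conversion is done by the library. That makes them usable on recent systems, where the kernel headers no longer export the _syscallX macros to user programs. The generated functions get a my_ prefix so they don't collide with the library's own getpid() and kill().

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid, SYS_kill */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* syscall() */

/* Hypothetical stand-in for _syscall0: return type, then name. */
#define MY_SYSCALL0(type, name) \
type my_##name(void) \
{ \
        return (type) syscall(SYS_##name); \
}

/* Hypothetical stand-in for _syscall2: return type, name, then a
 * (type, name) pair for each of the two arguments. */
#define MY_SYSCALL2(type, name, type1, arg1, type2, arg2) \
type my_##name(type1 arg1, type2 arg2) \
{ \
        return (type) syscall(SYS_##name, arg1, arg2); \
}

/* Generates: pid_t my_getpid(void) */
MY_SYSCALL0(pid_t, getpid)

/* Generates: int my_kill(pid_t pid, int sig) */
MY_SYSCALL2(int, kill, pid_t, pid, int, sig)
```

After this, my_getpid() returns the same value as getpid(), and my_kill(my_getpid(), 0) returns 0, since signal 0 only checks that the process exists.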

For the curious: there is one other way that system calls may be called on Linux/i86. iBCS2-based programs call system calls with an lcall 7,0 instruction instead of an int $0x80 instruction. The lcall instruction takes slightly longer than the int instruction, which is why int $0x80 is the default system call mechanism on Linux, but both are supported. The lcall instruction isn't exactly an interrupt, although it acts much like one; technically it is a “call gate”. So my “white lie” isn't really a lie after all.

Michael K. Johnson is the Editor of Linux Journal, and pretends to be a Linux guru in his spare time. He can be reached via e-mail as johnsonm@ssc.com.
