Driving Me Nuts - Things You Never Should Do in the Kernel
Now that you understand the reasoning behind forbidding the ability to read a file from a kernel module, you of course can skip the rest of this article. It does not concern you, as you are off busily converting your kernel module to use sysfs.
Still here? Okay, so you still want to know how to read a file from a kernel module, and no amount of persuading can convince you otherwise. You promise never to try to do this in code that will be submitted for inclusion into the main kernel tree and that I never described how to do this, right?
Actually, reading a file is quite simple, once one minor issue is resolved. A number of the kernel system calls are exported for module use; these system calls start with sys_. So, for the read system call, the function sys_read should be used.
The common approach to reading a file is to try code that looks like the following:
fd = sys_open(filename, O_RDONLY, 0);
if (fd >= 0) {
	/* read the file here */
	sys_close(fd);
}
However, when this is tried within a kernel module, the sys_open() call usually returns the error -EFAULT. This causes the author to post the question to a mailing list, which elicits the “don't read a file from the kernel” response described above.
What the author forgot is that the kernel expects the pointer passed to sys_open() to come from user space. It checks that the pointer lies in the user address space before converting it to a kernel pointer that the rest of the kernel can use. When we pass a kernel pointer instead, that check fails and the call returns -EFAULT.
To handle this address space mismatch, use the functions get_fs() and set_fs(). These functions modify the current process address limits to whatever the caller wants. In the case of sys_open(), we want to tell the kernel that pointers from within the kernel address space are safe, so we call:
set_fs(KERNEL_DS);
The only two valid options for the set_fs() function are KERNEL_DS and USER_DS, roughly standing for kernel data segment and user data segment, respectively.
To determine what the current address limits are before modifying them, call the get_fs() function. Then, when the kernel module is done abusing the kernel API, it can restore the proper address limits.
So, with this knowledge, the proper way to write the above code snippet is:
old_fs = get_fs();
set_fs(KERNEL_DS);
fd = sys_open(filename, O_RDONLY, 0);
if (fd >= 0) {
	/* read the file here */
	sys_close(fd);
}
set_fs(old_fs);
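To see why the save-and-restore dance matters, here is a toy user-space model of the address limit check. It is only a sketch: every toy_ name, the made-up limit values and the pretend file descriptor 3 are assumptions invented for illustration, and none of this is how the kernel actually implements get_fs(), set_fs() or sys_open().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy model only: these names mimic the kernel's, but this is NOT kernel code. */
typedef uintptr_t toy_segment_t;

#define TOY_USER_DS   ((toy_segment_t)0x7fffffff)   /* hypothetical top of user space */
#define TOY_KERNEL_DS ((toy_segment_t)UINTPTR_MAX)  /* effectively no limit */

/* The "current process" starts out accepting only user-space addresses. */
static toy_segment_t toy_addr_limit = TOY_USER_DS;

static toy_segment_t toy_get_fs(void)
{
	return toy_addr_limit;
}

static void toy_set_fs(toy_segment_t limit)
{
	/* Only the two limits named in the article are accepted. */
	assert(limit == TOY_USER_DS || limit == TOY_KERNEL_DS);
	toy_addr_limit = limit;
}

/* Stand-in for a system call that validates its pointer argument. */
static int toy_sys_open(uintptr_t filename_addr)
{
	if (filename_addr > toy_addr_limit)
		return -EFAULT;   /* address is outside the allowed range */
	return 3;                 /* pretend file descriptor */
}

/* The pattern from the article: save the limit, widen it, call, restore. */
static int toy_open_from_kernel(uintptr_t filename_addr)
{
	toy_segment_t old_fs = toy_get_fs();
	int fd;

	toy_set_fs(TOY_KERNEL_DS);
	fd = toy_sys_open(filename_addr);
	toy_set_fs(old_fs);
	return fd;
}
```

In this model, calling toy_sys_open() directly with a "kernel" address such as 0xc0000000 fails with -EFAULT, while toy_open_from_kernel() with the same address succeeds and leaves the limit back at TOY_USER_DS afterward.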
An example of an entire module that reads the file /etc/shadow and dumps it out to the kernel system log, proving that this can be a dangerous thing to do, can be seen below:
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/syscalls.h>
#include <linux/fcntl.h>
#include <asm/uaccess.h>

static void read_file(char *filename)
{
	int fd;
	char buf[1];
	mm_segment_t old_fs = get_fs();

	/* Let the system calls accept kernel-space pointers. */
	set_fs(KERNEL_DS);

	fd = sys_open(filename, O_RDONLY, 0);
	if (fd >= 0) {
		printk(KERN_DEBUG);
		/* Copy the file to the kernel log one byte at a time. */
		while (sys_read(fd, buf, 1) == 1)
			printk("%c", buf[0]);
		printk("\n");
		sys_close(fd);
	}

	/* Restore the original address limit. */
	set_fs(old_fs);
}

static int __init init(void)
{
	read_file("/etc/shadow");
	return 0;
}

static void __exit exit(void)
{ }

MODULE_LICENSE("GPL");
module_init(init);
module_exit(exit);
Now, armed with this newfound knowledge of how to abuse the kernel system call API and annoy a kernel programmer at the drop of a hat, you really can push your luck and write to a file from within the kernel. Fire up your favorite editor, and pound out something like the following:
old_fs = get_fs();
set_fs(KERNEL_DS);
fd = sys_open(filename, O_WRONLY|O_CREAT, 0644);
if (fd >= 0) {
	sys_write(fd, data, strlen(data));
	sys_close(fd);
}
set_fs(old_fs);
The code seems to build properly, with no compile time warnings, but when you try to load the module, you get this odd error:
insmod: error inserting 'evil.ko': -1 Unknown symbol in module
This means that a symbol your module is trying to use has not been exported and is not available in the kernel. By looking at the kernel log, you can determine what symbol that is:
evil: Unknown symbol sys_write
So, even though the function sys_write is declared in the syscalls.h header file, it is not exported for use in a kernel module. Actually, on three different platforms this symbol is exported, but who really uses a parisc architecture anyway? To work around this, we need to take advantage of the kernel functions that are available to kernel modules. Reading how sys_write is implemented shows how to get around the missing export. The following kernel module shows how this can be done by not using the sys_write call:
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/syscalls.h>
#include <linux/file.h>
#include <linux/fs.h>
#include <linux/fcntl.h>
#include <asm/uaccess.h>

static void write_file(char *filename, char *data)
{
	struct file *file;
	loff_t pos = 0;
	int fd;
	mm_segment_t old_fs = get_fs();

	/* Let the kernel accept kernel-space pointers. */
	set_fs(KERNEL_DS);

	fd = sys_open(filename, O_WRONLY|O_CREAT, 0644);
	if (fd >= 0) {
		/* Look up the struct file behind the descriptor. */
		file = fget(fd);
		if (file) {
			/* Do what sys_write would do, without calling it. */
			vfs_write(file, data, strlen(data), &pos);
			fput(file);
		}
		sys_close(fd);
	}

	/* Restore the original address limit. */
	set_fs(old_fs);
}

static int __init init(void)
{
	write_file("/tmp/test", "Evil file.\n");
	return 0;
}

static void __exit exit(void)
{ }

MODULE_LICENSE("GPL");
module_init(init);
module_exit(exit);
As you can see, by using the functions fget, fput and vfs_write, we can implement our own sys_write functionality.
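The shape of that composition (look up the file behind a descriptor and take a reference, write at an explicit position, drop the reference) can be modeled in plain user-space C. This is a toy sketch, not kernel code: the toy_ names, the fixed-size in-memory "file" and the four-entry descriptor table are all assumptions made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_FDS  4    /* hypothetical size of the descriptor table */
#define TOY_FILE_CAP 64   /* hypothetical capacity of each in-memory file */

/* Toy stand-in for struct file: contents plus a reference count. */
struct toy_file {
	char data[TOY_FILE_CAP];
	size_t size;
	int refcount;
};

static struct toy_file toy_files[TOY_MAX_FDS];
static struct toy_file *toy_fd_table[TOY_MAX_FDS];

/* Hand out the lowest free descriptor; the table holds one reference. */
static int toy_open(void)
{
	int fd;

	for (fd = 0; fd < TOY_MAX_FDS; fd++) {
		if (!toy_fd_table[fd]) {
			memset(&toy_files[fd], 0, sizeof(toy_files[fd]));
			toy_files[fd].refcount = 1;
			toy_fd_table[fd] = &toy_files[fd];
			return fd;
		}
	}
	return -EMFILE;
}

/* Like fget: map a descriptor to its file and take a reference. */
static struct toy_file *toy_fget(int fd)
{
	if (fd < 0 || fd >= TOY_MAX_FDS || !toy_fd_table[fd])
		return NULL;
	toy_fd_table[fd]->refcount++;
	return toy_fd_table[fd];
}

/* Like fput: drop the reference taken by toy_fget. */
static void toy_fput(struct toy_file *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/* Like vfs_write: write at *pos, advance *pos, return bytes written. */
static long toy_vfs_write(struct toy_file *f, const char *buf,
			  size_t len, long long *pos)
{
	size_t off;

	if (*pos < 0 || *pos >= TOY_FILE_CAP)
		return -EINVAL;
	off = (size_t)*pos;
	if (len > TOY_FILE_CAP - off)
		len = TOY_FILE_CAP - off;   /* short write at capacity */
	memcpy(f->data + off, buf, len);
	if (off + len > f->size)
		f->size = off + len;
	*pos += (long long)len;
	return (long)len;
}

/* A sys_write-like helper built only from the three pieces above. */
static long toy_write(int fd, const char *buf, size_t len, long long *pos)
{
	struct toy_file *f = toy_fget(fd);
	long n;

	if (!f)
		return -EBADF;
	n = toy_vfs_write(f, buf, len, pos);
	toy_fput(f);
	return n;
}
```

The point of the model is the bracketing: every successful toy_fget() is paired with a toy_fput(), so the reference count after toy_write() returns is the same as it was before the call.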
Comments
I think there is a typo on the write code...
I think there's a typo in the final write code. The first call after checking the file descriptor should be vfs_write and not the sys_write call.
fd = sys_open(filename, O_WRONLY|O_CREAT, 0644);
if (fd >= 0) {
vfs_write(fd, data, strlen(data));
about usb drivers
I am writing a USB device driver and I want some help.
The OS I am using is SUSE 10.2. If I write the complete USB driver described in your documents, what extra things do I need to do apart from inserting the module into the kernel (insmod)? Once I insert my USB driver module, how can I stop the kernel from using the previous USB driver so that it uses the module I inserted?
Similarly, I have also written a mouse driver for a PS/2 mouse, but I faced a similar problem. How do we tell the kernel not to use the previous mouse driver and to use the one I inserted?
Please help me. If I could find a way to do this, I would be able to run both drivers.
Yeah, the kernel has
Yeah, the kernel has internal API to open files: filp_open(), filp_close(), vfs_read(), ...
Consult for example with sound/sound_firmware.c.
In proprietary spec I have
In proprietary spec I have
int ioctl(int fd, int req=DMX_SET_SOURCE, int *frontend_fd);
where frontend_fd is a pointer to the file descriptor of another driver opened previously.
The logic here - the user is responsible for which driver to connect for internal communication
Anyway I need sys_ioctl to call one driver from another. I can not parse user file descriptor to determine which driver I need.
Don't use sys_*()
If you do read/write files from the kernel, then please do it without too much hacks, meaning you use filp_open, vfs_read and vfs_write instead of sys_open, sys_read and sys_write, respectively. Thanks!
Hey you said that you can
Hey you said that you can use filp_open , vfs_read and vfs_write. I tried that but I could not write and read into the file.
static void write_file(char *filename, char *data)
{
struct file *file;
file = filp_open(filename, O_WRONLY|O_CREAT, 0777);
ssize_t wc;
wc = vfs_write(file, data, strlen(data),0);
if(wc < strlen(data))
{
printk(KERN_INFO "Problem in Writing Data\n");
}
filp_close(file,0);
}
This is my code.
Please let me know where I am wrong.
Thanks,
Translation
Wow... Can someone translate this into English for me?
How very clever
Greg: thanks for this summary. And how clever of you to obscure your advice concerning file writes from kernel space by including the sys_write() call in your sample, misdirecting readers into ignoring the vfs_write() and thinking that they would still need that syscall.
err
Your final example still uses sys_write and the snippet above that has a syntax error in it.
Kernel
God, I hate re-writing kernels. I used to use suse linux. Everytime I had to get my wireless card to work, I would have to re-write the kernel and that shit takes forever.
Rides and Whips
Clearly, you know a lot
Clearly, you know a lot about linux. You don't re-write a kernel. You recompile it with drivers or code for your device. If you were rewriting a kernel that shit would take forever. Recompiling a kernel takes me only a few minutes on my machine, granted, it's faster than most. Keep anti-linux comments on topic and try to back them up with fact. The simple fact is, you don't need to make any thing up to criticize any operating system, there is plenty wrong with any given one to be criticized.
You had to re-write the
You had to re-write the kernel everything you wanted your wifi card to work?
vince
everytime*
everytime*
Not using sys_write
"how this can be done by not using the sys_write call"
But you've used it right there!
Stacked Drivers Concept
I know one case when this is the cleanest practical solution.
The Linux kernel is supposed to be built on a "stacked drivers" concept. In reality, the kernel supports it only in a very limited number of specially designed core drivers. Try to write a translation driver for a device that already has a generic driver. A good example is a USB-to-serial chip talking to custom hardware. Anyone trying to create a driver for that custom hardware faces a huge uphill battle with the kernel, unless you can just open the device file created by the generic driver and talk to it through the VFS. However, vfs_read/vfs_write are user-space restricted. (Unacceptable alternatives: user-space daemon, full-blown rewrite of the USB-to-serial driver, etc.)
Using technique in this article proves to be the only way for given situation.
You've failed to explain why
You've failed to explain why you can't solve the problem in userspace in your example.
What's wrong with a libmycustomhardware.so?
Problem with sys_write and sys_open
hi,
information given in this article is very useful.
i have tried above codes on kernel 2.6.9, but getting error with sys_open and sys_write.
it gives error in system log file:
fileops: Unknown symbol sys_open
fileops: Unknown symbol sys_write
Please suggest me how can i overcome this problem.
I welcome your suggestions.
Thank you
dont use sys_read or
dont use sys_read or sys_write or sys_open or sys_open calls.. instead u can use vfs_read, vfs_write, filp_open, filp_close calls..
don't do that!
don't do that!
(sorry)
I think using sys_open and
I think using sys_open and sys_read is more general than vfs_open vfs_read. In my case, I need to create to module to read a file from SD card (fat), vfs_read always give me many errors.
In thread
Could this function be called by Thread ?
Used this to interface a stack with the DVBS drivers of Linux
Used this to interface a stack with the DVBS drivers of Linux.
Good Job!!
Hope to see you in person some day!!!
i guess filp_open should work
i guess filp_open should work ...
Did u look into copy_from_user() and copy_to_user()
These are useful for getting data back and forth between kernel and user space.
netlink sockets are the very
netlink sockets are the very efficient way to communicate between user, kernel modules and vice-versa.. Its very easy to use also..