Kernel Korner - Network Programming in the Kernel
Let's examine the code for the module first. In the code snippets in the article, we omit error-checking and other irrelevant details for clarity. The complete code is available from the LJ FTP site (see Resources):
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
/* For socket etc */
#include <linux/net.h>
#include <net/sock.h>
#include <linux/tcp.h>
#include <linux/in.h>
#include <asm/uaccess.h>
#include <linux/file.h>
#include <linux/socket.h>
#include <linux/smp_lock.h>
#include <linux/slab.h>
...
int ftp_init(void)
{
printk(KERN_INFO FTP_STRING
"Starting ftp client module\n");
sys_call_table[SYSCALL_NUM] = my_sys_call;
return 0;
}
void ftp_exit(void)
{
printk(KERN_INFO FTP_STRING
"Cleaning up ftp client module, bye !\n");
sys_call_table[SYSCALL_NUM] = sys_ni_syscall;
}
...
The program begins with the customary include directives. Notable among the header files are linux/kernel.h, which provides printk() and log levels such as KERN_INFO; linux/slab.h, which contains the definitions for kmalloc(); and linux/smp_lock.h, which defines kernel-locking routines. System calls are handled in the kernel by functions that have the same names as their user-space counterparts but are prefixed with sys_. For example, the sys_socket function in the kernel handles the task of the socket() system call. In this module, we use system call number 223 for our new system call: loading the module stores my_sys_call in that slot of sys_call_table. This method is not foolproof and will not work on SMP machines. Upon unloading the module, we unregister our system call by putting sys_ni_syscall back in the slot.
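The register/unregister step can be pictured with a small user-space model. This is only a sketch, not kernel code: the table, its size NR_SLOTS, and the names ni_syscall, my_call, register_call, unregister_call and dispatch are hypothetical stand-ins for sys_call_table, sys_ni_syscall, my_sys_call, ftp_init() and ftp_exit().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_SLOTS    256   /* size of the toy table (assumption) */
#define SYSCALL_NUM 223   /* slot number used in the article */

typedef long (*syscall_fn)(void *arg);

/* Stand-in for sys_ni_syscall: the "not implemented" handler. */
static long ni_syscall(void *arg) { (void)arg; return -ENOSYS; }

/* Stand-in for my_sys_call; a real handler would do the FTP transfer. */
static long my_call(void *arg) { (void)arg; return 0; }

static syscall_fn table[NR_SLOTS];

/* Fill every slot with the "not implemented" handler. */
static void table_init(void)
{
    for (size_t i = 0; i < NR_SLOTS; i++)
        table[i] = ni_syscall;
}

/* Models ftp_init(): claim the slot. */
static void register_call(void)   { table[SYSCALL_NUM] = my_call; }

/* Models ftp_exit(): hand the slot back. */
static void unregister_call(void) { table[SYSCALL_NUM] = ni_syscall; }

/* Models the kernel's dispatch: index the table and call the handler. */
static long dispatch(int nr, void *arg)
{
    assert(nr >= 0 && nr < NR_SLOTS);
    return table[nr](arg);
}
```

Before register_call() runs, dispatching number 223 reports -ENOSYS; afterward it reaches my_call, and unregister_call() restores the original behavior.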
The workhorse of the module is the new system call that performs an FTP read. The system call takes a structure as a parameter. The structure is self-explanatory and is given below:
struct params {
/* Destination IP address */
unsigned char destip[4];
/* Source IP address */
unsigned char srcip[4];
/* Source file - file to be downloaded from
the server */
char src[64];
/* Destination file - local file where the
downloaded file is copied */
char dst[64];
char user[16]; /* Username */
char pass[64]; /* Password */
};
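A user-space caller has to fill this structure before invoking the system call. The following is a minimal sketch of how that might be done; the helpers parse_ipv4, set_field and fill_params are hypothetical and are not part of the article's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct params {
    unsigned char destip[4];   /* Destination IP address */
    unsigned char srcip[4];    /* Source IP address */
    char src[64];              /* File to download from the server */
    char dst[64];              /* Local file to write */
    char user[16];             /* Username */
    char pass[64];             /* Password */
};

/* Parse "a.b.c.d" into four octets; returns 0 on success, -1 on bad input. */
static int parse_ipv4(const char *text, unsigned char out[4])
{
    unsigned int a, b, c, d;
    char extra;

    assert(text != NULL && out != NULL);
    /* Exactly four numbers must match; a trailing character is an error. */
    if (sscanf(text, "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return -1;
    out[0] = (unsigned char)a;
    out[1] = (unsigned char)b;
    out[2] = (unsigned char)c;
    out[3] = (unsigned char)d;
    return 0;
}

/* Copy a string into a fixed-size field; reject it if it would not fit. */
static int set_field(char *field, size_t size, const char *value)
{
    size_t len = strlen(value);

    if (len >= size)
        return -1;
    memcpy(field, value, len + 1);   /* include the terminating NUL */
    return 0;
}

/* Fill every member of *pm; returns 0 on success, -1 on invalid input. */
static int fill_params(struct params *pm, const char *server,
                       const char *local, const char *remote_file,
                       const char *local_file, const char *user,
                       const char *pass)
{
    memset(pm, 0, sizeof(*pm));
    if (parse_ipv4(server, pm->destip) || parse_ipv4(local, pm->srcip))
        return -1;
    if (set_field(pm->src, sizeof(pm->src), remote_file) ||
        set_field(pm->dst, sizeof(pm->dst), local_file) ||
        set_field(pm->user, sizeof(pm->user), user) ||
        set_field(pm->pass, sizeof(pm->pass), pass))
        return -1;
    return 0;
}
```

Rejecting over-long strings up front matters because the kernel side formats pmk.user and pmk.pass with sprintf() and relies on them being NUL-terminated.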
The system call is given below. We explain the relevant details in the next few paragraphs:
asmlinkage int my_sys_call
(struct params __user *pm)
{
struct sockaddr_in servaddr, claddr;
struct socket *control= NULL;
struct socket *data = NULL;
struct socket *new_sock = NULL;
int r = -1;
char *response = kmalloc(SNDBUF, GFP_KERNEL);
char *reply = kmalloc(RCVBUF, GFP_KERNEL);
struct params pmk;
if(unlikely(!access_ok(VERIFY_READ,
pm, sizeof(*pm))))
return -EFAULT;
if(copy_from_user(&pmk, pm,
sizeof(struct params)))
return -EFAULT;
if(current->uid != 0)
return r;
r = sock_create(PF_INET, SOCK_STREAM,
IPPROTO_TCP, &control);
memset(&servaddr,0, sizeof(servaddr));
servaddr.sin_family = AF_INET;
servaddr.sin_port = htons(PORT);
servaddr.sin_addr.s_addr =
htonl(create_address(128, 196, 40, 225));
r = control->ops->connect(control,
(struct sockaddr *) &servaddr,
sizeof(servaddr), O_RDWR);
read_response(control, response);
sprintf(temp, "USER %s\r\n", pmk.user);
send_reply(control, temp);
read_response(control, response);
sprintf(temp, "PASS %s\r\n", pmk.pass);
send_reply(control, temp);
read_response(control, response);
We start out by declaring pointers to a few socket structures. kmalloc() is the kernel equivalent of malloc() and is used to allocate memory for our character arrays. The arrays response and reply hold the responses received from the server and the replies sent to it.
The first step is to copy the parameters from user space into kernel space. This is customarily done with access_ok followed by copy_from_user (or copy_to_user for the opposite direction). access_ok checks whether the user-space pointer is valid to be dereferenced; the VERIFY_READ argument says that we intend to read from it. copy_from_user then copies the structure into the kernel variable pmk. For reading simple variables like char and int, use __get_user.
Now that we have the user-specified parameters, the next step is to create a control socket and establish a connection with the FTP server. sock_create() creates the socket; its arguments are similar to those we pass to the user-level socket() system call. The struct sockaddr_in variable servaddr is then filled in with all the necessary information: address family, destination port and IP address of the server. Each socket structure has a member, ops, that is a pointer to a structure of type struct proto_ops. This structure contains a list of function pointers to all the operations that can be performed on a socket. We use the connect() function of this structure to establish a connection to the server. Our functions read_response() and send_reply() transfer data between the client and server (these functions are explained later):
r = sock_create(PF_INET, SOCK_STREAM,
IPPROTO_TCP, &data);
memset(&claddr,0, sizeof(claddr));
claddr.sin_family = AF_INET;
claddr.sin_port = htons(EPH_PORT);
claddr.sin_addr.s_addr= htonl(
create_address(srcip));
r = data->ops->bind(data,
(struct sockaddr *)&claddr,
sizeof (claddr));
r = data->ops->listen(data, 1);
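The helper create_address() used in these listings is not a kernel function, and its definition is not shown in the snippets. The following user-space sketch assumes it packs four octets, most significant first, into a host-order 32-bit value, so that htonl() then yields the network-order address. Both this definition and the helper fill_addr are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Assumed behavior: pack a.b.c.d into a host-order 32-bit value. */
static uint32_t create_address(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Fill a sockaddr_in the same way the listings fill servaddr and claddr. */
static void fill_addr(struct sockaddr_in *sa, const uint8_t ip[4],
                      uint16_t port)
{
    assert(sa != NULL && ip != NULL);
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);              /* port in network order */
    sa->sin_addr.s_addr =
        htonl(create_address(ip[0], ip[1], ip[2], ip[3]));
}
```

With this definition, the four octets end up in memory in the order they were written, which is what the socket layer expects in sin_addr.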
Now, a data socket is created to transfer data between the client and server. We fill in another struct sockaddr_in variable claddr with information about the client—protocol family, local unprivileged port that our client would bind to and, of course, the IP address. Next, the socket is bound to the ephemeral port EPH_PORT. The function listen() lets the kernel know that this socket can accept incoming connections:
a = (char *)&claddr.sin_addr;
p = (char *)&claddr.sin_port;
send_reply(control, reply);
read_response(control, response);
strcpy(reply, "RETR ");
strcat(reply, src);
strcat(reply, "\r\n");
send_reply(control, reply);
read_response(control, response);
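The snippet takes byte pointers into the address and port because the FTP PORT command carries them as six comma-separated decimal numbers: four for the IP address and two for the port, high byte first. A minimal user-space sketch of building that command follows; the function name format_port_cmd is hypothetical and is not taken from the article's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Write "PORT h1,h2,h3,h4,p1,p2\r\n" for the given address into buf.
 * Returns the command length, or -1 if buf is too small.
 */
static int format_port_cmd(char *buf, size_t size,
                           const struct sockaddr_in *sa)
{
    unsigned char a[4], p[2];
    int n;

    assert(buf != NULL && sa != NULL);
    /* Both fields are stored in network byte order, so the bytes can be
     * read out in memory order, as the a and p pointers do above. */
    memcpy(a, &sa->sin_addr.s_addr, sizeof(a));
    memcpy(p, &sa->sin_port, sizeof(p));

    n = snprintf(buf, size, "PORT %d,%d,%d,%d,%d,%d\r\n",
                 a[0], a[1], a[2], a[3], p[0], p[1]);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

For example, address 192.168.1.2 with port 5000 gives "PORT 192,168,1,2,19,136", because 5000 = 19 * 256 + 136.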
As explained previously, a PORT command is issued to the FTP server to let it know the port for data transfer. This command is sent over the control socket and not over the data socket:
new_sock = sock_alloc();
new_sock->type = data->type;
new_sock->ops = data->ops;
r = data->ops->accept(data, new_sock, 0);
new_sock->ops->getname(new_sock,
(struct sockaddr *)address, &len, 2);
Now, the client is ready to accept data from the server. We create a new socket and assign it the same type and ops as our data socket. The accept() function pulls the first pending connection off the listen queue and attaches it to new_sock, which then has the same connection properties as data. The new socket handles all data transfer between the client and server. The getname() function gets the address at the other end of the socket; the call at the end of the above segment is useful only for printing information about the server:
if((total_written = write_to_file(pmk.dst,
new_sock, response)) < 0)
goto err3;
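The body of write_to_file() is not shown in the article. As a rough user-space analogue only (the kernel version has to use in-kernel file operations instead of these system calls), the loop below drains a connected socket into a file and returns the number of bytes written. The name copy_socket_to_file and the 4096-byte buffer are assumptions.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Read from sock until the peer closes the connection, writing
 * everything to path. Returns total bytes written, or -1 on error.
 */
static long copy_socket_to_file(int sock, const char *path)
{
    char buf[4096];
    long total = 0;
    ssize_t n;
    int fd;

    assert(path != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    while ((n = recv(sock, buf, sizeof(buf), 0)) > 0) {
        ssize_t off = 0;

        /* write() may accept fewer bytes than asked; loop until done. */
        while (off < n) {
            ssize_t w = write(fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(fd);
                return -1;
            }
            off += w;
        }
        total += n;
    }
    close(fd);
    return n < 0 ? -1 : total;
}
```

In FTP's default stream mode, the server signals the end of the file by closing the data connection, which is why the loop stops when recv() returns 0.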
The function write_to_file deals with opening a file in the kernel and writing the data read from the socket into that file. Writing to sockets works like this:
void send_reply(struct socket *sock, char *str)
{
send_sync_buf(sock, str, strlen(str),
MSG_DONTWAIT);
}
int send_sync_buf
(struct socket *sock, const char *buf,
const size_t length, unsigned long flags)
{
struct msghdr msg;
struct iovec iov;
int len, written = 0, left = length;
mm_segment_t oldmm;
msg.msg_name = 0;
msg.msg_namelen = 0;
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
msg.msg_control = NULL;
msg.msg_controllen = 0;
msg.msg_flags = flags;
oldmm = get_fs(); set_fs(KERNEL_DS);
repeat_send:
msg.msg_iov->iov_len = left;
msg.msg_iov->iov_base = (char *) buf +
written;
len = sock_sendmsg(sock, &msg, left);
...
return written ? written : len;
}
The send_reply() function calls send_sync_buf(), which does the real job of sending the message by calling sock_sendmsg(). The function sock_sendmsg() takes a pointer to struct socket, the message to be sent and the message length. The message is represented by the structure msghdr. One of the important members of this structure is msg_iov, which points to an I/O vector (struct iovec). The I/O vector has two members, iov_base and iov_len:
struct iovec
{
/* Should point to message buffer */
void *iov_base;
/* Message length */
__kernel_size_t iov_len;
};
These members are filled with appropriate values, and sock_sendmsg() is called to send the message.
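The same msghdr/iovec pattern exists in user space with sendmsg(), which makes the retry loop easier to experiment with. The sketch below is a user-space analogue, not the kernel function: send_all is a hypothetical name, and the part of send_sync_buf() elided in the article may differ in detail.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Send length bytes from buf, retrying after partial sends.
 * Like send_sync_buf(), returns the bytes written if any were sent,
 * otherwise the result of the failed send call.
 */
static ssize_t send_all(int sock, const char *buf, size_t length, int flags)
{
    struct msghdr msg;
    struct iovec iov;
    size_t written = 0;

    assert(buf != NULL);
    while (written < length) {
        ssize_t len;

        memset(&msg, 0, sizeof(msg));          /* no name, no control data */
        iov.iov_base = (char *)buf + written;  /* skip what was already sent */
        iov.iov_len = length - written;
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        len = sendmsg(sock, &msg, flags);
        if (len <= 0)
            return written ? (ssize_t)written : len;
        written += (size_t)len;
    }
    return (ssize_t)written;
}
```

User space needs no equivalent of set_fs(), because the buffer already lives in the caller's own address space.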
The macro set_fs is used to set the FS register to point to the kernel data segment. This allows sock_sendmsg() to find the data in the kernel data segment instead of the user-space data segment. The macro get_fs saves the old value of FS. After a call to sock_sendmsg(), the saved value of FS is restored.
Reading from the socket works similarly:
int read_response(struct socket *sock, char *str)
{
...
len = sock_recvmsg(sock, &msg,
max_size, 0);
...
return len;
}
The read_response() function is similar to send_reply(). After filling the msghdr structure appropriately, it uses sock_recvmsg() to read data from a socket and returns the number of bytes read.
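For comparison, here is a user-space analogue of read_response() built on recvmsg(), plus a small helper that extracts the three-digit code that begins every FTP reply. This is a sketch under stated assumptions: the max_size parameter and the helper reply_code are not part of the article's code, and the elided kernel version may be structured differently.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Receive up to max_size - 1 bytes into str and NUL-terminate them.
 * Returns the number of bytes read, or -1 on error. */
static ssize_t read_response(int sock, char *str, size_t max_size)
{
    struct msghdr msg;
    struct iovec iov;
    ssize_t len;

    assert(str != NULL && max_size > 0);
    memset(&msg, 0, sizeof(msg));
    iov.iov_base = str;
    iov.iov_len = max_size - 1;     /* leave room for the terminator */
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    len = recvmsg(sock, &msg, 0);
    if (len >= 0)
        str[len] = '\0';
    return len;
}

/* Return the three-digit FTP reply code at the start of str, or -1. */
static int reply_code(const char *str)
{
    if (!isdigit((unsigned char)str[0]) ||
        !isdigit((unsigned char)str[1]) ||
        !isdigit((unsigned char)str[2]))
        return -1;
    return (str[0] - '0') * 100 + (str[1] - '0') * 10 + (str[2] - '0');
}
```

Checking the reply code after each command is one of the error checks that the article's snippets leave out for brevity.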
Comments
Fail to create dynamic system call
Hi everyone,
Has anyone tried this code themselves? I failed to insert that system call dynamically (but succeeded to compile and insmod the module).
I am wondering if anyone knows what's going on there.
PS: I am using Linux 2.6.29 in Ubuntu.
Thanks in advance,
-Kunsheng
kernel raw sockets
Hello,
I am new to Linux kernel development, so if you find any mistakes, please
forgive them and correct me.
I wanted to send raw packets through ethernet, from kernel level.
So i use PF_PACKET family. & SOCK_RAW.And i used sock_create()
function to create socket.
But I found that when i create socket with
sock_create(PF_PACKET,SOCK_RAW,.....) the program always fails in
bind. (when i do sock->ops->bind(.....))
why is it so ? but when I use PF_INET & SOCK_PACKET to create socket.
bind happens successfully.
Can any one help me to come out of this issue?? OR direct me to create
raw packets and send from kernel??
thanks in advance
-Anuroop
create_address doesn't exist
The function create_address doesn't exist in my kernel header (2.6.17). Where should I get the definition?
Thanks,
Derek
It's true. create_address is not in the kernel headers.
Has anyone found how to get it?
Please reply
insmod 'ing the module
In your example you don't actually use insmod after building the module, does that mean it's not necessary? If not, then how does the userland program see the system call? If so, then do you know why insmod'ing it would freeze my system? Cause it does. I did fiddle with the code a bit, mostly stripped it down to just connect, send a message, and close.
Thanks
Nathan
Problem was redefining the system call. Seems linux doesn't appreciate it none too much and freezes. I've read it's not really done anymore anyway.
Nathan