Manipulating the Networking Environment Using RTNETLINK
NETLINK is a facility in the Linux operating system for user-space applications to communicate with the kernel. NETLINK is an extension of the standard socket implementation. Using NETLINK, an application can send/receive information to/from different kernel features, such as networking, to check current status and control them.
In this article, I describe how a programmer can use the networking environment manipulation capability of NETLINK known as RTNETLINK. I discuss some areas of use of RTNETLINK, the relevant socket operations, the functionality, how RTNETLINK messages are formed and finally, provide a set of sample code that uses RTNETLINK. RTNETLINK for the IP version 4 environment is referred to as NETLINK_ROUTE, and for the IP version 6 environment, it is referred to as NETLINK_ROUTE6. The explanations given here are applicable for both IP versions 4 and 6.
Developers of network layer protocol handlers can use RTNETLINK to modify and monitor different components of networking, such as the routing table and network interfaces. There are many existing and upcoming protocol standards at the Internet Engineering Task Force (IETF) that can be implemented in user space. These implementations will require manipulating the routing and knowing what is being modified by other processes. Some of these protocol categories are as follows:
Dynamic routing protocols: protocols of this category, including the Routing Information Protocol (RIP), Open Shortest Path First (OSPF) and Exterior Gateway Protocol (EGP), actively manage the routing environment of a host while communicating with other equally capable hosts or routers in the network or Internet.
Mobility protocols: hosts that are mobile and connect to different networks at different times use protocols such as Mobile IP (MIP), Session Initiation Protocol (SIP) and Network Mobility (NEMO) to manage routing to maintain connectivity and continuity of communications.
Ad hoc networking protocols: hosts that are mobile and located in places where there is no networking infrastructure, such as routers and WLAN access points, require peer-to-peer communications with differently configured hosts. Mobile computers of rescue workers in an earthquake-struck area or other such emergencies can use ad hoc networking protocols. These protocols, such as the Ad hoc On-demand Distance Vector (AODV) and Optimized Link State Routing (OLSR), require managing the routing to find and communicate with other hosts using neighboring hosts as routers and gateways.
It helps reduce the complexity of the kernel code if you implement these protocols in user space. Further, it simplifies the development and testing of these protocols because of the availability of many user-space development tools. Problems, such as kernel crashes, that are likely with kernel-based code when testing or when used by end users will not occur in a user-space protocol handler.
The socket implementation of Linux allows two end points to communicate. The socket API provides a standard set of functions and data structures. With RTNETLINK, the two end points in communication are user space and kernel space. The following sequence of socket calls has to be made when manipulating the networking environment through RTNETLINK:
Open socket.
Bind socket to local address (using process ID).
Send message to the other end point.
Receive message from the other end point.
Close socket.
The socket() function opens an unattached end point to communicate with the kernel. The function prototype of this call is as follows:
int socket(int domain, int type, int protocol);
The domain refers to the type of socket being used. For RTNETLINK, we use AF_NETLINK (PF_NETLINK). type refers to the type of protocol used when communicating. This can be raw (SOCK_RAW) or datagram (SOCK_DGRAM). The choice does not matter for RTNETLINK sockets, and either can be used. protocol refers to the exact NETLINK capability that we use; in our case, it is NETLINK_ROUTE. If the socket was opened successfully, this function returns a non-negative integer called the socket descriptor. This descriptor is used in all subsequent RTNETLINK calls until the socket is closed. On failure, -1 is returned, and the system error variable errno, declared in errno.h, is set to the appropriate error code.
The following is an example of a call to open an RTNETLINK socket:
int fd;
...
fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
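A slightly fuller sketch of the same call, with the error check described above. The helper name open_rtnetlink_socket is made up for illustration and is not part of any library.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper: opens an RTNETLINK socket.
 * Returns the socket descriptor, or -1 with errno set by socket().
 */
int open_rtnetlink_socket(void)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0) {
        /* Report why the kernel refused, then pass the failure on. */
        fprintf(stderr, "socket(AF_NETLINK): %s\n", strerror(errno));
        return -1;
    }
    return fd;
}
```

The caller is expected to close() the descriptor once all RTNETLINK communication is finished.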
Once the socket is opened, it has to be bound to a local address. The user application can use a unique 32-bit ID to identify the local address. The function prototype of bind is as follows:
int bind(int fd, struct sockaddr *my_addr,
socklen_t addrlen);
To bind, the caller must provide a local address using the sockaddr_nl structure. This structure in the linux/netlink.h #include file has the following format:
struct sockaddr_nl
{
    sa_family_t    nl_family;  // AF_NETLINK
    unsigned short nl_pad;     // zero
    __u32          nl_pid;     // process pid
    __u32          nl_groups;  // multicast grps mask
};
The nl_pid must contain a unique ID, which can be created using the return of the getpid() function. This function returns the process ID of the current user process that opened the RTNETLINK socket. But, if our process consists of multiple threads with each thread opening different RTNETLINK sockets, a modified process ID can be used.
Once this structure is filled, the binding can be done. The bind function returns zero if the operation succeeded. On failure, -1 is returned, and errno is set. The following is an example of calling bind:
struct sockaddr_nl la;
...
bzero(&la, sizeof(la));
la.nl_family = AF_NETLINK;
la.nl_pad = 0;
la.nl_pid = getpid();
la.nl_groups = 0;
rtn = bind(fd, (struct sockaddr*) &la, sizeof(la));
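As an alternative to building a modified process ID by hand, the netlink(7) manual page documents that an application may leave nl_pid at zero before calling bind(), in which case the kernel assigns a unique ID. The sketch below assumes that behavior and reads the assigned ID back with getsockname(). The helper name bind_rtnetlink_auto is hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Hypothetical helper: binds fd with nl_pid = 0 so that the kernel picks
 * a unique ID, then stores that ID in *assigned.
 * Returns 0 on success, -1 on failure (errno set by bind/getsockname).
 */
int bind_rtnetlink_auto(int fd, uint32_t groups, uint32_t *assigned)
{
    struct sockaddr_nl la;
    socklen_t len = sizeof(la);

    memset(&la, 0, sizeof(la));
    la.nl_family = AF_NETLINK;
    la.nl_pid = 0;            /* zero: let the kernel choose the ID */
    la.nl_groups = groups;    /* multicast group mask, 0 for none */

    if (bind(fd, (struct sockaddr *)&la, sizeof(la)) < 0)
        return -1;

    /* Read back the address the kernel actually bound. */
    if (getsockname(fd, (struct sockaddr *)&la, &len) < 0)
        return -1;

    *assigned = la.nl_pid;
    return 0;
}
```

This can be convenient when several threads each open their own RTNETLINK socket, because no thread has to invent an ID of its own.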
If the operation you require is multicast-based, you must set nl_groups to join the multicast group associated with the required RTNETLINK operation. For example, if you want to be notified of changes made to the routing table by other processes, you must set nl_groups to the bitwise OR (|) of RTMGRP_IPV4_ROUTE and RTMGRP_NOTIFY.
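The multicast case can be sketched as follows. The helper name fill_route_monitor_addr and its pid parameter are illustrative assumptions; the group constants come from linux/rtnetlink.h.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper: fills a local address that subscribes the socket
 * to IPv4 routing table change notifications when passed to bind().
 */
void fill_route_monitor_addr(struct sockaddr_nl *la, unsigned int pid)
{
    memset(la, 0, sizeof(*la));
    la->nl_family = AF_NETLINK;
    la->nl_pid = pid;
    /* Join both groups by OR-ing their masks together. */
    la->nl_groups = RTMGRP_IPV4_ROUTE | RTMGRP_NOTIFY;
}
```

After bind() succeeds with this address, route changes made by other processes arrive on the socket without any request being sent first.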
Sending routing RTNETLINK messages to the kernel is done through the use of the standard sendmsg() function of the socket interface. The following is the prototype of this function:
ssize_t sendmsg(int fd, const struct msghdr *msg,
int flags);
msg is a pointer to a msghdr structure. The following is the format of this structure:
struct msghdr
{
    void         *msg_name;       //Address to send to
    socklen_t     msg_namelen;    //Length of address data
    struct iovec *msg_iov;        //Vector of data to send
    size_t        msg_iovlen;     //Number of iovec entries
    void         *msg_control;    //Ancillary data
    size_t        msg_controllen; //Ancillary data buf len
    int           msg_flags;      //Flags on received msg
};
The msg_name is a pointer to a variable of the type struct sockaddr_nl. This is the destination address of the sendmsg() function. Because this message is directed to the kernel, all variables of sockaddr_nl will be initialized to zero, except the nl_family member variable. The field msg_namelen should contain the size of a struct sockaddr_nl.
msg_iov should contain a pointer to a struct iovec, which is filled with the RTNETLINK message relevant to the request being made. The caller is allowed to place multiple RTNETLINK requests, if required. msg_iovlen holds the number of struct iovec structures that were placed in msg_iov. The rest of the variables are initialized to zero.
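To make the relationship between these structures concrete, here is a minimal sketch that prepares a request for a dump of the IPv4 main routing table. The type names route_dump_req and route_dump_msg and the function build_route_dump are assumptions made for this example; only the kernel header types and constants are real.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request layout: netlink header, then the route header. */
struct route_dump_req {
    struct nlmsghdr nl;
    struct rtmsg    rt;
};

/* Hypothetical bundle that keeps everything msghdr points to alive. */
struct route_dump_msg {
    struct route_dump_req req;
    struct sockaddr_nl    kernel;
    struct iovec          iov;
    struct msghdr         msg;
};

/* Fills *m in place. It holds pointers into itself, so do not copy it. */
void build_route_dump(struct route_dump_msg *m, unsigned int seq)
{
    memset(m, 0, sizeof(*m));

    m->req.nl.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
    m->req.nl.nlmsg_type = RTM_GETROUTE;
    m->req.nl.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    m->req.nl.nlmsg_seq = seq;
    m->req.rt.rtm_family = AF_INET;
    m->req.rt.rtm_table = RT_TABLE_MAIN;

    /* Destination is the kernel: only nl_family is non-zero. */
    m->kernel.nl_family = AF_NETLINK;

    m->iov.iov_base = &m->req;
    m->iov.iov_len = m->req.nl.nlmsg_len;

    m->msg.msg_name = &m->kernel;
    m->msg.msg_namelen = sizeof(m->kernel);
    m->msg.msg_iov = &m->iov;
    m->msg.msg_iovlen = 1;      /* one iovec entry */
}
```

With an open and bound socket fd, the request would then be sent with sendmsg(fd, &m.msg, 0).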
To receive RTNETLINK messages, the recv() function is used. Here is the prototype of this function:
ssize_t recv(int fd, void *buf, size_t len,
int flags);
The second and third arguments are a pointer to a buffer in which to place the bytes read and the length of this buffer, respectively. For RTNETLINK, the buffer will contain a set of RTNETLINK messages that have to be read one after the other using a set of macros provided in the netlink.h and rtnetlink.h #include files. flags is a set of flags to indicate how the receive should be performed. For RTNETLINK, this simply can be initialized to zero.
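Reading the messages one after the other can be sketched with the NLMSG_OK and NLMSG_NEXT macros from linux/netlink.h. The function count_messages is a made-up example that only counts messages of one type; a real handler would decode each message at the marked spot.

```c
#include <stddef.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper: walks the len bytes returned by recv() in buf and
 * counts the messages whose nlmsg_type equals type. Stops at the end of
 * the buffer or at an NLMSG_DONE / NLMSG_ERROR message.
 */
int count_messages(void *buf, int len, unsigned short type)
{
    struct nlmsghdr *nlp = (struct nlmsghdr *)buf;
    int count = 0;

    /* NLMSG_NEXT advances nlp and reduces len by the aligned length. */
    for (; NLMSG_OK(nlp, len); nlp = NLMSG_NEXT(nlp, len)) {
        if (nlp->nlmsg_type == NLMSG_DONE ||
            nlp->nlmsg_type == NLMSG_ERROR)
            break;
        if (nlp->nlmsg_type == type)
            count++;    /* a real handler would decode NLMSG_DATA(nlp) here */
    }
    return count;
}
```

The buffer passed in should be suitably aligned for struct nlmsghdr; a large reply may also span several recv() calls, which this sketch does not handle.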
Once the socket communications are complete, the socket has to be closed using the close() function. Here's the prototype of this function:
int close(int fd);
Comments
wireless link
Is there any way to monitor a wireless link going up and down using rtnetlink? If so, which parameters need to change?
Getting default gateway
Is it possible to get the default gateway without having to create a route? I am trying to get the default gateway's IP without having to add to or delete from the routing table.
Please let me know, and thank you for this article. It was great!
double event found on netlink socket
Hi all,
I am new to netlink socket programming. I am developing a simple application that informs me whenever an interface is brought up or down using ifup/ifdown/ifconfig, or when the cable is unplugged. The problem is that I get two packets for every up/down or cable-unplug event.
I am not able to solve the problem and don't understand why this happens.
I am using "nl_groups = RTNLGRP_LINK" only.
Thanks.
Route add does not work.
Using this tutorial, I created a function to add a route. I believe I am populating all the necessary elements of the data structures. However, the route is added incorrectly. For any kind of route, the function only adds a 0.0.0.0 route with mask 255.255.255.255 and gateway 0.0.0.0. It does point to the correct interface that I specify in RTA_OIF.
Here is the function.
unsigned int rtm_add_v4 (unsigned int prefix,
                         unsigned char len, u_char tbl_index,
                         unsigned int oif,
                         u_char proto,
                         u_char rt_type,
                         struct rtnexthop *rtnh)
{
    struct sockaddr_nl ra;
    struct msghdr msg;
    struct iovec iov;
    char buf[8192];
    int rtn;
    struct nlmsghdr *nlm;
    int nlml;
    struct rtmsg *rt;
    int rtl;
    struct rtattr *rta;
    rtsock_req_t rreq;

    assert(rtm_initialized);
    bzero(&rreq, sizeof(rreq));
    rtl = sizeof(struct rtmsg);
    rta = (struct rtattr *)rreq.buf;
    rta->rta_type = RTA_DST;
    rta->rta_len = sizeof(struct rtattr) + 4;
    printf("Copying prefix 0x%08x\n", prefix);
    bcopy(&prefix,(char *)rta+rta->rta_len,4);
    rtl += rta->rta_len;
    rta = (struct rtattr *)(((char *)rta) + rta->rta_len);
    rta->rta_type = RTA_OIF;
    rta->rta_len = sizeof(struct rtattr) + 4;
    printf("Copying OIF: %d\n", oif);
    bcopy(&oif, (char *)rta+sizeof(struct rtattr), 4);
    rtl += rta->rta_len;

    /* Setup the NETLINK Header */
    rreq.nl.nlmsg_len = NLMSG_LENGTH(rtl);
    rreq.nl.nlmsg_flags = NLM_F_REQUEST|NLM_F_CREATE;
    rreq.nl.nlmsg_type = RTM_NEWROUTE;

    /* Setup operation header */
    rreq.rt.rtm_family = AF_INET;
    rreq.rt.rtm_dst_len = len;
    rreq.rt.rtm_table = tbl_index;
    rreq.rt.rtm_protocol = proto;
    rreq.rt.rtm_scope = RT_SCOPE_UNIVERSE;
    rreq.rt.rtm_type = rt_type;

    bzero(&ra, sizeof(ra));
    ra.nl_family = AF_NETLINK;
    bzero(&msg, sizeof(msg));
    msg.msg_name = (void *)&ra;
    msg.msg_namelen = sizeof(ra);
    iov.iov_base = (void *)&rreq.nl;
    iov.iov_len = rreq.nl.nlmsg_len;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    rtn = sendmsg(rtsock,&msg, 0);
    if (rtn < 0)
    {
        printf("%s :", __FUNCTION__);
        perror("sendmsg");
        printf("\n");
        return 0;
    } else {
        return 1;
    }
    return 0;
}
Here is the routing table before route add
[root@iLinux-Nilesh route]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.2.2.0        0.0.0.0         255.255.255.0   U     0      0        0 eth1
172.19.57.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.60.0    0.0.0.0         255.255.255.0   U     0      0        0 eth3
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         172.19.57.1     0.0.0.0         UG    0      0        0 eth0
Here is the sample run of the test program that uses this function.
Enter the route to be added: 172.21.1.1
Addr: 0xac150101
Enter prefix len: 32
len: 32
Enter the oif: 2
Copying prefix 0xac150101
Copying OIF: 2
ROUTE ADDED SUCCESSFULLY
[root@iLinux-Nilesh route]#
Routing table after route add program.
[root@iLinux-Nilesh route]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         0.0.0.0         255.255.255.255 UH    0      0        0 eth0 <<<<<
10.2.2.0        0.0.0.0         255.255.255.0   U     0      0        0 eth1
172.19.57.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
192.168.60.0    0.0.0.0         255.255.255.0   U     0      0        0 eth3
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         172.19.57.1     0.0.0.0         UG    0      0        0 eth0
[root@iLinux-Nilesh route]#
Any idea what is wrong? Function in test program is invoked as below.
rc = rtm_add_v4(addr.s_addr,len,RT_TABLE_MAIN,oif,RTPROT_STATIC,RTN_UNICAST,NULL);
Never mind..found the
Never mind..found the problem.
Hello Asanga... I am
Hello Asanga...
I am a newbie to Linux. As far as I could find by googling, yours is the only sample material. Based on my understanding of your illustration, I wrote the module below to get the destination address and the gateway address when I run "route add -host 192.168.2.45 gw 202.34.2.1".
I get the gateway address as 192.168.2.45 from the module, but I expect the gateway to be 202.34.2.1.
Please help me in this regard.
Thanks in advance...
/* Read message from kernel */
recv(sock_fd,nlh,size, 0);
printf(" Received message payload: %s\n",
       NLMSG_DATA(nlh));
rtp = (struct rtmsg*)NLMSG_DATA(nlh);
rtap =(struct rtattr*) RTM_RTA(rtp);
rtl = RTM_PAYLOAD(nlh);
for(;RTA_OK(rtap,rtl);rtap = RTA_NEXT(rtap,rtl))
{
    switch(rtap->rta_type)
    {
    case RTA_GATEWAY:
        inet_ntop(AF_INET,RTA_DATA(rtap),gws,24);
        printf("\n gateway address is %s",gws);
        break;
    case RTA_DST:
        inet_ntop(AF_INET,RTA_DATA(rtap),dsts,24);
        printf("\n destination address is %s",dsts);
        break;
    case RTA_SRC:
        printf("\n received source address");
        break;
    default:
        break;
    }
}
Getting the IPV6 Address of a Device via rtnetlink
Hi, I currently have the problem that I want to get the IPv6 address of a device via rtnetlink. Your article was already very helpful, but I still cannot find out which fields of struct ifaddrmsg I have to fill in when I pass it with a request so that I get the IP that I am looking for. I set ifa_family to AF_INET6 and ifa_index to the device that I am looking at. Nevertheless, when parsing the "answer" buffer, so to speak, I get NLMSG_ERROR and nothing else. There is also the problem that the program never does more than one iteration in while(1), but it also never leaves it. Well, that is a different problem, I guess. Still, I would like to know why you put the second break condition in there. Is it not always true? You didn't even set nl_groups in your program?
Sorry for my bad English but it has been a frustrating day full of debugging.
Greetings from Switzerland,
S.
Re. Getting the IPV6 Address of a Device via rtnetlink
Hello,
Here is the request init part of a sample RTNETLINK program that shows the IPv6 address info of interfaces.
bzero(&local, sizeof(local));
local.nl_family = AF_NETLINK;
local.nl_pid = getpid();
if(bind(fd, (struct sockaddr*) &local, sizeof(local)) < 0) {
    printf("Error in sock bind\n");
    exit(1);
}
bzero(&peer, sizeof(peer));
peer.nl_family = AF_NETLINK;
bzero(&msg_info, sizeof(msg_info));
msg_info.msg_name = (void *) &peer;
msg_info.msg_namelen = sizeof(peer);
bzero(&netlink_req, sizeof(netlink_req));
netlink_req.nlmsg_info.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifaddrmsg));
netlink_req.nlmsg_info.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
netlink_req.nlmsg_info.nlmsg_type = RTM_GETADDR;
netlink_req.nlmsg_info.nlmsg_pid = getpid();
netlink_req.ifaddrmsg_info.ifa_family = AF_INET6;
iov_info.iov_base = (void *) &netlink_req.nlmsg_info;
iov_info.iov_len = netlink_req.nlmsg_info.nlmsg_len;
msg_info.msg_iov = &iov_info;
msg_info.msg_iovlen = 1;
rtn = sendmsg(fd, &msg_info, 0);
If you require the whole code, look on my home page.
Re. second break; Usually the end of a returned message is indicated by a NLMSG_DONE. But for monitoring of routing table changes, this will not work. Since the example code in this article was common, that second break is also part of the loop.
Kind regards,
Asanga
Getting route updates
Hi..
This document was really helpful. Thanks a lot.
I have a question regarding receiving route updates from the kernel.
I have a process that waits for any routing table changes. It gets updates whenever a route is added or deleted. It receives around 52 bytes of data.
When I add a new route entry I get 52 bytes, but it fails to enter the
"for(;NLMSG_OK(nlp, nll);nlp=NLMSG_NEXT(nlp, nll))" loop of read_reply() as given in this document. Moreover, when I try to print "nlp->nlmsg_type", it is always RTM_NEWROUTE, even though I deleted a route entry in my previous operation.
What I want is...
1) Whenever a new entry is added, the read_reply() function should print the entry that was added.
2) Whenever an entry is deleted from the route table, it should print the entry that was deleted, and nlp->nlmsg_type should be RTM_DELROUTE so that I know the netlink message is the result of a delete operation.
Your help in this regard will be appreciated.
Thanks and regards,
Nagendra KS.
re: Getting route updates
Hello,
I added the statement
printf("Type %d\n", nlp->nlmsg_type);
just after the statement
nlp = (struct nlmsghdr *) buf;
in the mon_routing_table.c file and I see 25 (RTM_DELROUTE) for a route delete and 24 (RTM_NEWROUTE) for a route add.
Kind regards,
Asanga
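The check discussed in this thread can be sketched as a small classifier on the message type. The function name classify_route_event and its return convention are assumptions for this example; RTM_NEWROUTE and RTM_DELROUTE are the constants from linux/rtnetlink.h.

```c
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper: tells a monitoring loop what kind of routing
 * table change a received message describes.
 * Returns 1 for a route add, -1 for a route delete, 0 for anything else.
 */
int classify_route_event(const struct nlmsghdr *nlp)
{
    switch (nlp->nlmsg_type) {
    case RTM_NEWROUTE:
        return 1;
    case RTM_DELROUTE:
        return -1;
    default:
        return 0;
    }
}
```

It would be called once per message inside the NLMSG_OK/NLMSG_NEXT loop, so that each message in a buffer is classified on its own rather than only the first one.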
Flush Cache
Is it possible to flush the route cache via rtnetlink sockets?
re: Flush Cache
Hello,
> It is possible to flush the route cache via rtnetlink sockets?
As far as I know, there isn't any RTNETLINK command to flush the routing cache. But after looking at the source code of the "ip" command suite, I found that they write a -1 to
/proc/sys/net/ipv4/route/flush to flush the routing cache.
Kind regards,
Asanga
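A minimal sketch of the approach described in this reply, assuming the process has permission to write to the file (normally root). The helper name flush_route_cache and the ROUTE_FLUSH_PATH macro are made up for illustration; the path is taken as a parameter so the function can be tried against another file first.

```c
#include <fcntl.h>
#include <unistd.h>

#define ROUTE_FLUSH_PATH "/proc/sys/net/ipv4/route/flush"

/*
 * Hypothetical helper: writes "-1" to the given file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int flush_route_cache(const char *path)
{
    static const char value[] = "-1";
    ssize_t n;
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -1;

    /* Write the two characters, without the terminating NUL. */
    n = write(fd, value, sizeof(value) - 1);
    close(fd);

    return n == (ssize_t)(sizeof(value) - 1) ? 0 : -1;
}
```

A caller would use flush_route_cache(ROUTE_FLUSH_PATH) and check the return value.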
How to specify a NIC?
Hello,
This article helped me understand how to implement a protocol.
But some questions still confuse me.
If an application just wants to send a message through a specified NIC (e.g., the node has more than one NIC, such as LAN and WLAN), how can the application select it?
Is it possible to set up more than one NIC for sending/receiving at the same time, or should that be done in different threads?
Is there a Windows version of NETLINK and RTNETLINK?
thanks in advance
Re: How to specify a NIC?
Hello,
> If an application just wants to send message through a specified NIC (e.g. the node has more than one NIC, like LAN, WLAN etc), how can the application just set this selectively?
I assume that you are asking about sending IP packets over an interface. If that is the case, you must use INET type sockets to do this.
> Is it able to set up more than one NIC for sending/receiving at the same time, or it should be done in different threads ?
Which interface a packet takes is usually decided by the routing table, depending on the destination address of the packet. But I think INET sockets also have a facility to send packets from a given interface (through sendmsg()).
> Is there a Windows-Version of NETLINK & RTNETLINK ?
As far as I know, no (at least up to XP).
Regards,
Asanga
Via gateway
I'm having problems adding a route via a gateway. I tried to add it, but it simply won't work.
This is the code I'm adding after the iface:
rtap= (struct rtattr *) (((char *)rtap) + sizeof(struct rtattr));
rtap->rta_type=RTA_GATEWAY;
rtap->rta_len = sizeof (struct rtattr) + 4;
inet_pton(AF_INET, gw, ((char *)rtap) + sizeof (struct rtattr));
rtl +=rtap->rta_len;
Thanks
Re: Via gateway
Hello Miguel,
Here is the code to add the gateway to a route,
rtap = (struct rtattr *) (((char *)rtap)
+ rtap->rta_len);
rtap->rta_type = RTA_GATEWAY;
rtap->rta_len = sizeof(struct rtattr) + 4;
inet_pton(AF_INET, gw,
((char *)rtap) + sizeof(struct rtattr));
rtl += rtap->rta_len;
Your code is almost the same, except for how the pointer is advanced: the new attribute has to start rta_len bytes after the previous one (the previous attribute's header plus its data), not sizeof(struct rtattr) bytes after it. If that is not the problem, also check whether you can add the same route with the ip route add command. A frequent problem when adding a gateway to a route is that the gateway must be reachable.
Kind regards,
Asanga
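Several of the code fragments in these comments place attributes with hand-written pointer arithmetic, which is easy to get slightly wrong. For example, a payload copied to (char *)rta + rta->rta_len lands in the space of the following attribute rather than in its own. The sketch below is one possible way to append attributes using the RTA_LENGTH, RTA_SPACE and RTA_DATA macros from linux/rtnetlink.h instead. The helper name append_attr and its parameters are assumptions made for this example, not code from the article.

```c
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper: appends one attribute at byte offset *used inside
 * attrs (capacity cap bytes) and advances *used by the aligned length.
 * attrs must be aligned for struct rtattr.
 * Returns 0 on success, -1 if the attribute does not fit.
 */
int append_attr(char *attrs, size_t cap, size_t *used,
                unsigned short type, const void *data, size_t datalen)
{
    struct rtattr *rta;
    size_t need = RTA_SPACE(datalen);   /* header + data, padded */

    if (*used + need > cap)
        return -1;

    rta = (struct rtattr *)(attrs + *used);
    rta->rta_type = type;
    rta->rta_len = RTA_LENGTH(datalen); /* header + data, unpadded */
    memcpy(RTA_DATA(rta), data, datalen); /* payload follows the header */

    *used += need;
    return 0;
}
```

After all attributes are appended, the payload length passed to NLMSG_LENGTH would be sizeof(struct rtmsg) plus the final value of used. For a gateway, the 4-byte address written by inet_pton(AF_INET, ...) would be passed as data with type RTA_GATEWAY.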