Manipulating the Networking Environment Using RTNETLINK

How to use RTNETLINK to develop applications that control networking.
RTNETLINK Sample Walk-Through

The sample code presented here focuses on three of the operations that can be performed on the routing table:

  • get_routing_table: reads the main routing table in the system.

  • set_routing_table: inserts a new routing entry into the table.

  • mon_routing_table: monitors the routing table changes.

All three samples share a similar main() function, which calls a set of subfunctions to form an RTNETLINK message, send it, receive the reply and process it. To keep the explanation simple, the samples do no error handling. They operate on the system's IP version 4 environment (AF_INET). Here is the main() function:


int main(int argc, char *argv[])
{

  // open socket
  fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

  // setup local address & bind using
  // this address
  bzero(&la, sizeof(la));
  la.nl_family = AF_NETLINK;
  la.nl_pid = getpid();
  bind(fd, (struct sockaddr*) &la, sizeof(la));


  // sub functions to create RTNETLINK message,
  // send over socket, receive reply & process
  // message
  form_request();
  send_request();
  recv_reply();
  read_reply();

  // close socket
  close(fd);

  return 0;
}
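
The samples skip error handling. As a rough sketch only, the socket setup in main() could be wrapped as shown here. The name open_rtnetlink() and its groups parameter are assumptions made for illustration and are not part of the original samples. The sketch also sets nl_pid to 0 so that the kernel assigns the port ID, instead of using getpid() as the article does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>   /* RTMGRP_* group flags for callers */

/* Hypothetical helper (not from the article): opens an RTNETLINK
 * socket and binds it. Returns the descriptor, or -1 on failure. */
int open_rtnetlink(unsigned int groups)
{
  struct sockaddr_nl local;
  int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

  if (sock < 0) {
    perror("socket");
    return -1;
  }

  memset(&local, 0, sizeof(local));
  local.nl_family = AF_NETLINK;
  local.nl_pid = 0;          /* 0 lets the kernel assign the port ID */
  local.nl_groups = groups;  /* 0 for plain request/reply use */

  if (bind(sock, (struct sockaddr *) &local, sizeof(local)) < 0) {
    perror("bind");
    close(sock);
    return -1;
  }
  return sock;
}
```

A caller might use fd = open_rtnetlink(0) for the request/reply samples and pass the multicast group flags for a monitoring socket.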

Like main(), the two functions that perform the socket communication are nearly identical across the samples. They simply send a formed message to the kernel and receive the messages the kernel sends back. The exceptions are the set_routing_table and mon_routing_table samples: set_routing_table has no receive phase, and mon_routing_table has no send phase, because it only monitors the routing environment to see what changes. The kernel multicasts this information to all RTNETLINK sockets that are bound to the appropriate multicast groups.

First, here's the code for send_request():


void send_request()
{
  // create the remote address
  // to communicate
  bzero(&pa, sizeof(pa));
  pa.nl_family = AF_NETLINK;

  // initialize & create the struct msghdr supplied
  // to the sendmsg() function
  bzero(&msg, sizeof(msg));
  msg.msg_name = (void *) &pa;
  msg.msg_namelen = sizeof(pa);

  // place the pointer & size of the RTNETLINK
  // message in the struct msghdr
  iov.iov_base = (void *) &req.nl;
  iov.iov_len = req.nl.nlmsg_len;
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;

  // send the RTNETLINK message to kernel
  rtn = sendmsg(fd, &msg, 0);
}

And here's recv_reply():


void recv_reply()
{
  char *p;

  // initialize the socket read buffer
  bzero(buf, sizeof(buf));

  p = buf;
  nll = 0;

  // read from the socket until the NLMSG_DONE is
  // returned in the type of the RTNETLINK message
  // or if it was a monitoring socket
  while(1) {
    rtn = recv(fd, p, sizeof(buf) - nll, 0);

    nlp = (struct nlmsghdr *) p;

    if(nlp->nlmsg_type == NLMSG_DONE)
      break;

    // increment the buffer pointer to place
    // next message
    p += rtn;

    // increment the total size by the size of
    // the last received message
    nll += rtn;

    if((la.nl_groups & RTMGRP_IPV4_ROUTE)
                      == RTMGRP_IPV4_ROUTE)
      break;
  }
}

Both main() and the two communication functions, as well as the sample functions that follow, use a set of globally defined variables. These are used for all the socket operations as well as for forming and processing RTNETLINK messages:


// buffer to hold the RTNETLINK request
struct {
  struct nlmsghdr nl;
  struct rtmsg    rt;
  char            buf[8192];
} req;

// variables used for
// socket communications
int fd;
struct sockaddr_nl la;
struct sockaddr_nl pa;
struct msghdr msg;
struct iovec iov;
int rtn;

// buffer to hold the RTNETLINK reply(ies)
char buf[8192];

// RTNETLINK message pointers & lengths
// used when processing messages
struct nlmsghdr *nlp;
int nll;
struct rtmsg *rtp;
int rtl;
struct rtattr *rtap;

The get_routing_table sample retrieves the main routing table of the IPv4 environment. The form_request() function is as follows:


void form_request()
{
  // initialize the request buffer
  bzero(&req, sizeof(req));

  // set the NETLINK header
  req.nl.nlmsg_len
             = NLMSG_LENGTH(sizeof(struct rtmsg));
  req.nl.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
  req.nl.nlmsg_type = RTM_GETROUTE;

  // set the routing message header
  req.rt.rtm_family = AF_INET;
  req.rt.rtm_table = RT_TABLE_MAIN;
}

The reply to the RTNETLINK request, now held in the buf variable, is processed by the read_reply() function. Here is its code:


void read_reply()
{
  // string to hold content of the route
  // table (i.e. one entry)
  char dsts[24], gws[24], ifs[16], ms[24];

  // outer loop: loops thru all the NETLINK
  // headers that also include the route entry
  // header
  nlp = (struct nlmsghdr *) buf;
  for(;NLMSG_OK(nlp, nll);nlp=NLMSG_NEXT(nlp, nll))
  {

    // get route entry header
    rtp = (struct rtmsg *) NLMSG_DATA(nlp);

    // we are only concerned about the
    // main route table
    if(rtp->rtm_table != RT_TABLE_MAIN)
      continue;

    // init all the strings
    bzero(dsts, sizeof(dsts));
    bzero(gws, sizeof(gws));
    bzero(ifs, sizeof(ifs));
    bzero(ms, sizeof(ms));

    // inner loop: loop thru all the attributes of
    // one route entry
    rtap = (struct rtattr *) RTM_RTA(rtp);
    rtl = RTM_PAYLOAD(nlp);
    for(;RTA_OK(rtap, rtl);rtap=RTA_NEXT(rtap,rtl))
    {
      switch(rtap->rta_type)
      {
        // destination IPv4 address
        case RTA_DST:
          inet_ntop(AF_INET, RTA_DATA(rtap),
                                     dsts, 24);
          break;

        // next hop IPv4 address
        case RTA_GATEWAY:
          inet_ntop(AF_INET, RTA_DATA(rtap),
                                     gws, 24);
          break;

        // unique ID associated with the network
        // interface
        case RTA_OIF:
          sprintf(ifs, "%d",
                   *((int *) RTA_DATA(rtap)));
          break;

        default:
          break;
      }
    }
    sprintf(ms, "%d", rtp->rtm_dst_len);

    printf("dst %s/%s gw %s if %s\n",
                          dsts, ms, gws, ifs);
  }
}
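
The RTA_OIF attribute carries an interface index, which read_reply() prints as a number. If a name such as eth0 is preferred, the standard if_indextoname() call can translate the index. The helper name format_ifname() below is an assumption made for illustration and is not part of the original samples.

```c
#include <assert.h>
#include <stdio.h>
#include <net/if.h>

/* Hypothetical helper: writes the interface name for an index taken
 * from an RTA_OIF attribute into out, or the bare number if the index
 * cannot be resolved. out must hold at least IF_NAMESIZE bytes. */
void format_ifname(unsigned int index, char *out)
{
  assert(out != NULL);

  if (if_indextoname(index, out) == NULL)
    snprintf(out, IF_NAMESIZE, "%u", index);
}
```

To use it in read_reply(), the ifs buffer would need to be at least IF_NAMESIZE bytes, and the sprintf() call in the RTA_OIF case could be replaced by a call to this helper.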

The set_routing_table sample sends an RTNETLINK request to insert an entry into the routing table. The inserted entry is a host route (32-bit network prefix) to a private IP address (192.168.0.100) through interface number 2. These values are defined in the variables dsts (destination IP address), ifcn (interface number) and pn (prefix length). You can run the get_routing_table sample to find the interface numbers and the IP networks on your system. Here's form_request():


void form_request()
{
  // attributes of the route entry
  char dsts[24] = "192.168.0.100";
  int ifcn = 2, pn = 32;

  // initialize RTNETLINK request buffer
  bzero(&req, sizeof(req));

  // compute the initial length of the
  // service request
  rtl = sizeof(struct rtmsg);

  // add first attrib:
  // set destination IP addr and increment the
  // RTNETLINK buffer size
  rtap = (struct rtattr *) req.buf;
  rtap->rta_type = RTA_DST;
  rtap->rta_len = sizeof(struct rtattr) + 4;
  inet_pton(AF_INET, dsts,
     ((char *)rtap) + sizeof(struct rtattr));
  rtl += rtap->rta_len;

  // add second attrib:
  // set ifc index and increment the size
  rtap = (struct rtattr *) (((char *)rtap)
            + rtap->rta_len);
  rtap->rta_type = RTA_OIF;
  rtap->rta_len = sizeof(struct rtattr) + 4;
  memcpy(((char *)rtap) + sizeof(struct rtattr),
           &ifcn, 4);
  rtl += rtap->rta_len;

  // setup the NETLINK header
  req.nl.nlmsg_len = NLMSG_LENGTH(rtl);
  req.nl.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE;
  req.nl.nlmsg_type = RTM_NEWROUTE;

  // setup the service header (struct rtmsg)
  req.rt.rtm_family = AF_INET;
  req.rt.rtm_table = RT_TABLE_MAIN;
  req.rt.rtm_protocol = RTPROT_STATIC;
  req.rt.rtm_scope = RT_SCOPE_UNIVERSE;
  req.rt.rtm_type = RTN_UNICAST;
  // set the network prefix size
  req.rt.rtm_dst_len = pn;
}
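
form_request() above computes each attribute's position and length by hand. The rtnetlink headers also provide the RTA_LENGTH(), RTA_ALIGN() and RTA_DATA() macros for this, and a small helper can do the bookkeeping while checking that the attribute fits. The sketch below assumes a hypothetical add_attr() function, loosely modeled on what tools such as iproute2 do. It is not part of the original samples.

```c
#include <assert.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: appends one attribute to the message starting
 * at n, whose enclosing buffer is maxlen bytes long. Returns 0 on
 * success or -1 if the attribute would not fit. */
int add_attr(struct nlmsghdr *n, size_t maxlen, unsigned short type,
             const void *data, unsigned short alen)
{
  size_t len = RTA_LENGTH(alen);            /* header + payload */
  size_t start = NLMSG_ALIGN(n->nlmsg_len); /* aligned message end */
  struct rtattr *rta;

  assert(n != NULL);

  if (len > 0xffff || start + RTA_ALIGN(len) > maxlen)
    return -1;

  rta = (struct rtattr *) ((char *) n + start);
  rta->rta_type = type;
  rta->rta_len = (unsigned short) len;
  if (alen > 0)
    memcpy(RTA_DATA(rta), data, alen);

  /* grow the message by the padded size of the attribute */
  n->nlmsg_len = start + RTA_ALIGN(len);
  return 0;
}
```

With the article's req structure, one would first set req.nl.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg)) and then call, for example, add_attr(&req.nl, sizeof(req), RTA_OIF, &ifcn, 4) for each attribute. An RTA_GATEWAY attribute could be appended the same way.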

The mon_routing_table sample reads the RTNETLINK messages that arrive when other processes change the system's main routing table. It uses the same read_reply() function to process the messages. The main() function requires a slight change: because this operation involves listening to the kernel's multicast messages, the local address to which we bind must also include the two flags RTMGRP_IPV4_ROUTE and RTMGRP_NOTIFY. Here is the required change:


la.nl_groups = RTMGRP_IPV4_ROUTE | RTMGRP_NOTIFY;

Once mon_routing_table is executed, run a route add or a route del command from another shell prompt to see the results.
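
When monitoring, the nlmsg_type field of each message tells whether a route was added (RTM_NEWROUTE) or deleted (RTM_DELROUTE). As a sketch, a monitor could summarize each notification as follows. The function describe_route_event() and its output format are assumptions made for illustration. Treating a missing RTA_DST attribute as 0.0.0.0 is also an assumption, chosen because default routes are usually reported without one.

```c
#include <assert.h>
#include <stdio.h>
#include <arpa/inet.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: describes one IPv4 route notification in out.
 * Returns 0 on success, -1 if the message is not an IPv4 route event. */
int describe_route_event(struct nlmsghdr *h, char *out, size_t outlen)
{
  char dst[INET_ADDRSTRLEN] = "0.0.0.0";  /* assumed when RTA_DST is absent */
  const char *what;
  struct rtmsg *rt;
  struct rtattr *rta;
  int rtl;

  assert(h != NULL && out != NULL);

  if (h->nlmsg_type == RTM_NEWROUTE)
    what = "add";
  else if (h->nlmsg_type == RTM_DELROUTE)
    what = "del";
  else
    return -1;

  if (h->nlmsg_len < NLMSG_LENGTH(sizeof(*rt)))
    return -1;
  rt = (struct rtmsg *) NLMSG_DATA(h);
  if (rt->rtm_family != AF_INET)
    return -1;

  /* look for the destination among the route attributes */
  rta = RTM_RTA(rt);
  rtl = RTM_PAYLOAD(h);
  for (; RTA_OK(rta, rtl); rta = RTA_NEXT(rta, rtl)) {
    if (rta->rta_type == RTA_DST && RTA_PAYLOAD(rta) >= 4)
      inet_ntop(AF_INET, RTA_DATA(rta), dst, sizeof(dst));
  }

  snprintf(out, outlen, "%s %s/%d", what, dst, rt->rtm_dst_len);
  return 0;
}
```

A monitoring loop could call this for every header it walks with NLMSG_OK() and NLMSG_NEXT(), and print the resulting string.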

______________________

Comments

wireless link

Posted by Pranab

Is there any way to monitor a wireless link going up and down using RTNETLINK? If so, which parameters need to change?

Getting default gateway

Posted by donX

Is it possible to get the default gateway without having to create a route? I am trying to get the default gateway's IP address without adding to or deleting from the routing table.

Please let me know, and thank you for this article; it was great!

double event found on netlink socket

Posted by Anonymous

Hi all,

I am new to netlink socket programming. I am developing a simple application that informs me whenever an interface is brought up or down with ifup, ifdown or ifconfig, or when the cable is unplugged. The problem is that I get two packets for every such event.

I have not been able to solve the problem and don't understand why this happens.

I am using only "nl_groups = RTNLGRP_LINK".

Thanks.

Route add does not work.

Posted by Nilesh

Using this tutorial, I created a function to add a route. I believe I am populating all the necessary elements of the data structures, but the route is added incorrectly. For any kind of route, the function adds only a 0.0.0.0 route with mask 255.255.255.255 and gateway 0.0.0.0. It does point to the correct interface, the one I specify in RTA_OIF.

Here is the function:

unsigned int rtm_add_v4 (unsigned int prefix,
unsigned char len, u_char tbl_index,
unsigned int oif,
u_char proto,
u_char rt_type,
struct rtnexthop *rtnh)
{
struct sockaddr_nl ra;
struct msghdr msg;
struct iovec iov;
char buf[8192];
int rtn;

struct nlmsghdr *nlm;
int nlml;
struct rtmsg *rt;
int rtl;
struct rtattr *rta;
rtsock_req_t rreq;

assert(rtm_initialized);

bzero(&rreq, sizeof(rreq));

rtl = sizeof(struct rtmsg);

rta = (struct rtattr *)rreq.buf;

rta->rta_type = RTA_DST;
rta->rta_len = sizeof(struct rtattr) + 4;

printf("Copying prefix 0x%08x\n", prefix);

bcopy(&prefix,(char *)rta+rta->rta_len,4);

rtl += rta->rta_len;

rta = (struct rtattr *)(((char *)rta) + rta->rta_len);

rta->rta_type = RTA_OIF;
rta->rta_len = sizeof(struct rtattr) + 4;

printf("Copying OIF: %d\n", oif);
bcopy(&oif, (char *)rta+sizeof(struct rtattr), 4);

rtl += rta->rta_len;

/* Setup the NETLINK Header */

rreq.nl.nlmsg_len = NLMSG_LENGTH(rtl);
rreq.nl.nlmsg_flags = NLM_F_REQUEST|NLM_F_CREATE;
rreq.nl.nlmsg_type = RTM_NEWROUTE;

/* Setup operation header */

rreq.rt.rtm_family = AF_INET;
rreq.rt.rtm_dst_len = len;
rreq.rt.rtm_table = tbl_index;
rreq.rt.rtm_protocol = proto;
rreq.rt.rtm_scope = RT_SCOPE_UNIVERSE;
rreq.rt.rtm_type = rt_type;

bzero(&ra, sizeof(ra));
ra.nl_family = AF_NETLINK;

bzero(&msg, sizeof(msg));

msg.msg_name = (void *)&ra;
msg.msg_namelen = sizeof(ra);

iov.iov_base = (void *)&rreq.nl;
iov.iov_len = rreq.nl.nlmsg_len;
msg.msg_iov = &iov;
msg.msg_iovlen = 1;

rtn = sendmsg(rtsock,&msg, 0);

if (rtn < 0)
{
printf("%s :", __FUNCTION__);
perror("sendmsg");
printf("\n");
return 0;
} else {
return 1;
}

return 0;

}

Here is the routing table before route add

[root@iLinux-Nilesh route]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
10.2.2.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
172.19.57.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
192.168.60.0 0.0.0.0 255.255.255.0 U 0 0 0 eth3
127.0.0.0 0.0.0.0 255.0.0.0 U 0 0 0 lo
0.0.0.0 172.19.57.1 0.0.0.0 UG 0 0 0 eth0

Here is the sample run of the test program that uses this function.

Enter the route to be added: 172.21.1.1
Addr: 0xac150101
Enter prefix len: 32
len: 32
Enter the oif: 2
Copying prefix 0xac150101
Copying OIF: 2
ROUTE ADDED SUCCESSFULLY
[root@iLinux-Nilesh route]#

Routing table after route add program.

[root@iLinux-Nilesh route]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 0.0.0.0 255.255.255.255 UH 0 0 0 eth0 <<<<<
10.2.2.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
172.19.57.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
192.168.60.0 0.0.0.0 255.255.255.0 U 0 0 0 eth3
127.0.0.0 0.0.0.0 255.0.0.0 U 0 0 0 lo
0.0.0.0 172.19.57.1 0.0.0.0 UG 0 0 0 eth0
[root@iLinux-Nilesh route]#

Any idea what is wrong? The function is invoked in the test program as follows:

rc = rtm_add_v4(addr.s_addr,len,RT_TABLE_MAIN,oif,RTPROT_STATIC,RTN_UNICAST,NULL);

Never mind..found the

Posted by Nilesh

Never mind, I found the problem.

Hello Asanga... I am

Posted by Anonymous

Hello Asanga,
I am a newbie to Linux. As far as I could find with Google, yours is the only material with a sample. Based on my understanding of your illustration, I wrote the module below to get the destination address and the gateway address when I run "route add -host 192.168.2.45 gw 202.34.2.1".
The module reports the gateway address as 192.168.2.45, but I expect the gateway to be 202.34.2.1.

Please help me with this.
Thanks in advance.

/* Read message from kernel */
recv(sock_fd,nlh,size, 0);
printf(" Received message payload: %s\n",
NLMSG_DATA(nlh));

rtp = (struct rtmsg*)NLMSG_DATA(nlh);

rtap =(struct rtattr*) RTM_RTA(rtp);
rtl = RTM_PAYLOAD(nlh);
for(;RTA_OK(rtap,rtl);rtap = RTA_NEXT(rtap,rtl))
{

switch(rtap->rta_type)
{
case RTA_GATEWAY:
inet_ntop(AF_INET,RTA_DATA(rtap),gws,24);
printf("\n gateway address is %s",gws);
break;

case RTA_DST:
inet_ntop(AF_INET,RTA_DATA(rtap),dsts,24);
printf("\n destination address is %s",dsts);
break;

case RTA_SRC:
printf("\n received source address");
break;

default:
break;
}
}

Getting the IPV6 Address of a Device via rtnetlink

Posted by saltorfer

Hi, I want to get the IPv6 address of a device via RTNETLINK. Your article was already very helpful, but I still cannot work out which fields of struct ifaddrmsg I have to fill in when I pass it with a request so that I get the address I am looking for. I set ifa_family to AF_INET6 and ifa_index to the device I am interested in. Nevertheless, when parsing the reply buffer I get NLMSG_ERROR and nothing else. There is also the problem that the program never does more than one iteration of the while(1) loop but never leaves it either; that is a different problem, I guess. Still, I would like to know why you put the second break condition in there. Is it not always true? You didn't even set nl_groups in your program.
Sorry for my bad English, but it has been a frustrating day full of debugging.

Greetings from Switzerland,
S.

Re. Getting the IPV6 Address of a Device via rtnetlink

Posted by Asanga

Hello,

Here is the request init part of a sample RTNETLINK program that shows the IPv6 address info of interfaces.

bzero(&local, sizeof(local));
local.nl_family = AF_NETLINK;
local.nl_pid = getpid();
if(bind(fd, (struct sockaddr*) &local, sizeof(local)) < 0) {
printf("Error in sock bind\n");
exit(1);
}

bzero(&peer, sizeof(peer));
peer.nl_family = AF_NETLINK;

bzero(&msg_info, sizeof(msg_info));
msg_info.msg_name = (void *) &peer;
msg_info.msg_namelen = sizeof(peer);

bzero(&netlink_req, sizeof(netlink_req));

netlink_req.nlmsg_info.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifaddrmsg));
netlink_req.nlmsg_info.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
netlink_req.nlmsg_info.nlmsg_type = RTM_GETADDR;
netlink_req.nlmsg_info.nlmsg_pid = getpid();

netlink_req.ifaddrmsg_info.ifa_family = AF_INET6;

iov_info.iov_base = (void *) &netlink_req.nlmsg_info;
iov_info.iov_len = netlink_req.nlmsg_info.nlmsg_len;
msg_info.msg_iov = &iov_info;
msg_info.msg_iovlen = 1;

rtn = sendmsg(fd, &msg_info, 0);

If you require the whole code, look on my home page.

Re. the second break: usually the end of a returned message is indicated by NLMSG_DONE, but this does not work when monitoring routing table changes. Because the example code in this article is shared by all the samples, that second break is also part of the loop.

Kind regards,
Asanga

Getting route updates

Posted by Nagendra

Hi,

This document was really helpful. Thanks a lot.

I have a question about receiving route updates from the kernel.
I have a process that waits for routing table changes. It gets an update whenever a route is added or deleted, receiving around 52 bytes of data.

When I add a new route entry I get 52 bytes, but the code fails to enter the
"for(;NLMSG_OK(nlp, nll);nlp=NLMSG_NEXT(nlp, nll))" loop of read_reply() as given in this document. Moreover, when I print "nlp->nlmsg_type" it is always RTM_NEWROUTE, even though I deleted a route entry in my previous operation.

What I want is:
1) Whenever a new entry is added, read_reply() should print the entry that was added.
2) Whenever an entry is deleted from the route table, it should print the entry that was deleted, and nlp->nlmsg_type should be RTM_DELROUTE so that I know the netlink message was caused by a delete operation.

Your help in this regard will be appreciated.

Thanks and regards,

Nagendra KS.

re: Getting route updates

Posted by Asanga

Hello,

I added the statement

printf("Type %d\n", nlp->nlmsg_type);

just after the statement

nlp = (struct nlmsghdr *) buf;

in the mon_routing_table.c file, and I see 25 (RTM_DELROUTE) for a route delete and 24 (RTM_NEWROUTE) for a route add.

Kind regards,
Asanga

Flush Cache

Posted by Mike CC

Is it possible to flush the route cache via RTNETLINK sockets?

re: Flush Cache

Posted by Asanga

Hello,

> It is possible to flush the route cache via rtnetlink sockets?

As far as I know, there isn't any RTNETLINK command to flush the routing cache. But after looking at the source code of the "ip" command suite, I found that it writes a -1 to /proc/sys/net/ipv4/route/flush to flush the routing cache.

Kind regards,
Asanga

How to specify a NIC?

Posted by CC

Hello,
This article helped me understand how to implement a protocol, but some questions still confuse me.

If an application wants to send messages through one specific NIC (for example, when the node has more than one NIC, such as LAN and WLAN), how can the application select it?
Can more than one NIC be set up for sending and receiving at the same time, or should that be done in different threads?

Is there a Windows version of NETLINK and RTNETLINK?

Thanks in advance.

Re: How to specify a NIC?

Posted by Asanga

Hello,

>If an application just wants to send message through a specified NIC ( e.g. the node has more than one NIC, like LAN, WLAN etc), how can the application just set this selectivly ?

I assume that you are asking about sending IP packets over an interface. If that is the case, you must use INET type sockets to do this.

> Is it able to set up more than one NIC for sending/receiving at the same time, or it should be done in different threads ?

Which interface a packet takes is usually decided by the routing table, depending on the packet's destination address. But I think INET sockets also have a facility for sending packets from a given interface (through sendmsg()).

> Is there a Windows-Version of NETLINK & RTNETLINK ?

As far as I know, no (at least up to XP).

Regards,
Asanga

Via gateway

Posted by Miguel

I'm having problems adding a route via a gateway. I tried to add it, but it simply won't work.

This is the code I'm adding after the interface:

rtap= (struct rtattr *) (((char *)rtap) + sizeof(struct rtattr));
rtap->rta_type=RTA_GATEWAY;
rtap->rta_len = sizeof (struct rtattr) + 4;
inet_pton(AF_INET, gw, ((char *)rtap) + sizeof (struct rtattr));
rtl +=rtap->rta_len;

Thanks

Re: Via gateway

Posted by Asanga

Hello Miguel,

Here is the code to add the gateway to a route,

rtap = (struct rtattr *) (((char *)rtap)
+ rtap->rta_len);
rtap->rta_type = RTA_GATEWAY;
rtap->rta_len = sizeof(struct rtattr) + 4;
inet_pton(AF_INET, gw,
((char *)rtap) + sizeof(struct rtattr));
rtl += rtap->rta_len;

Your code is almost the same, except for the rta_len addition when advancing the pointer. If the problem is not there, also check whether you can add the same route entry with the ip route add command instead of programmatically. A frequent problem when adding gateways to routes is that the gateway must be reachable.

Kind regards,
Asanga
