Byte and Bit Order Dissection
The endianness of network protocols defines the order in which the bits and bytes of an integer field of a network protocol header are sent and received. We also introduce a term called wire address here. A lower wire address bit or byte always is transmitted and received in front of a higher wire address bit or byte.
In fact, for network endianness, it is a little different than what we have seen so far. Another factor is in the picture: the bit transmission/reception order on the physical wire. Lower layer protocols, such as Ethernet, have specifications for bit transmission/reception order, and sometimes it can be the reverse of the upper layer protocol endianness. We look at this situation in our examples.
The endianness of NIC devices usually follows the endianness of the network protocols they support, so it can differ from the endianness of the CPU on the system. Most network protocols are big endian; here we take Ethernet and IP as examples.
Ethernet is big endian. This means the most significant byte of an integer field is placed at a lower wire byte address and transmitted/received in front of the least significant byte. For example, the protocol field with a value of 0x0806 (ARP) in the Ethernet header has a wire layout like this:
wire byte offset: 0  1
hex             : 08 06
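To make this layout concrete, here is a minimal sketch of writing a 16-bit field into a buffer in wire order and reading it back. The helper names put_be16 and get_be16 are invented for this illustration and do not come from any standard header.

```c
#include <stdint.h>

/* Hypothetical helpers, named for this sketch only. */

/* Store a 16-bit value most significant byte first (wire byte offset 0). */
void put_be16(uint8_t *wire, uint16_t value)
{
    wire[0] = (uint8_t)(value >> 8);    /* high byte goes out first */
    wire[1] = (uint8_t)(value & 0xff);  /* low byte follows */
}

/* Rebuild the value from wire order; works on hosts of either endianness. */
uint16_t get_be16(const uint8_t *wire)
{
    return (uint16_t)((wire[0] << 8) | wire[1]);
}
```

Because the helpers shift and mask instead of reinterpreting memory, they give the same result on big endian and little endian CPUs.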
Notice that the MAC address field of the Ethernet header is considered a string of characters, in which case the byte order does not matter. For example, the MAC address 12:34:56:78:9a:bc has the following layout on the wire, and byte 12 is transmitted first:
wire byte offset: 0  1  2  3  4  5
hex             : 12 34 56 78 9A BC
The bit transmission/reception order specifies how the bits within a byte are transmitted/received on the wire. For Ethernet, the order is from the least significant bit (lower wire address offset) to the most significant bit (higher wire address offset). This, in effect, is little endian. The byte order remains big endian, as described earlier. Therefore, here we see a situation where the byte order and the bit transmission/reception order are the reverse of each other.
The following is an illustration of Ethernet bit transmission/reception order:
We see from this that the group (multicast) bit, the least significant bit of the first byte, appears as the first bit on the wire. Ethernet and 802.3 hardware behave consistently with the bit transmission/reception order above.
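The bit order described here can be sketched in C as follows. This is only a software model of what the hardware does; the function names byte_to_wire_bits and is_group_address are assumptions made for this example.

```c
#include <stdint.h>

/* Hypothetical helper: list the bits of one byte in the order Ethernet
 * places them on the wire, least significant bit first.
 * bits[0] is the first bit transmitted, bits[7] the last. */
void byte_to_wire_bits(uint8_t byte, uint8_t bits[8])
{
    int i;
    for (i = 0; i < 8; ++i)
        bits[i] = (byte >> i) & 1;
}

/* The group (multicast) bit is the least significant bit of the first
 * address byte, so it is also the first address bit on the wire. */
int is_group_address(const uint8_t mac[6])
{
    return mac[0] & 0x01;
}
```

For the byte 0x12 (binary 0001 0010), the model yields the wire sequence 0 1 0 0 1 0 0 0, so the address 12:34:56:78:9a:bc starts with a 0 bit and is not a group address.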
In this case, where the protocol byte order and the bit transmission/reception order are different, the NIC must convert the bit transmission/reception order from/to the host (CPU) bit order. By doing so, the upper layers do not have to worry about bit order and need only sort out the byte order. In fact, this is another form of the Byte Consistent approach, where byte semantics are preserved when data travels across different endian domains.
The bit transmission/reception order generally is invisible to the CPU and software, but is important to hardware considerations such as the serdes (serializer/deserializer) of PHY and the wiring of NIC device data lines to the bus.
For either endianness, the Ethernet header can be parsed by software with the C structure below:
struct ethhdr
{
    unsigned char  h_dest[ETH_ALEN];
    unsigned char  h_source[ETH_ALEN];
    unsigned short h_proto;
};
The h_dest and h_source fields are byte arrays, so no conversion is needed. The h_proto field is an integer, so ntohs() is needed before the host reads this field, and htons() is needed before the host fills it in.
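As a rough sketch of that rule, the function below reads the protocol field from a raw frame. The structure mirrors the one above but is renamed sketch_ethhdr so the example stays self-contained; frame_proto is a hypothetical helper name, and ntohs() comes from the POSIX header arpa/inet.h.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohs(), POSIX */

#ifndef ETH_ALEN
#define ETH_ALEN 6       /* bytes in a MAC address */
#endif

/* Same fields as the ethhdr structure above, renamed for this sketch. */
struct sketch_ethhdr {
    unsigned char  h_dest[ETH_ALEN];
    unsigned char  h_source[ETH_ALEN];
    unsigned short h_proto;
};

/* Hypothetical helper: return the protocol field in host byte order.
 * The frame must hold at least sizeof(struct sketch_ethhdr) bytes. */
uint16_t frame_proto(const uint8_t *frame)
{
    struct sketch_ethhdr hdr;
    memcpy(&hdr, frame, sizeof(hdr));  /* copy first to avoid alignment problems */
    return ntohs(hdr.h_proto);         /* a no-op on big endian hosts */
}
```

The address fields are copied as plain bytes and never swapped; only the integer field goes through ntohs().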
IP's byte order also is big endian. The bit endianness of IP inherits that of the CPU, and the NIC takes care of converting it from/to the bit transmission/reception order on the wire.
For big endian hosts, IP header fields can be accessed directly. For little endian hosts, which include most PCs in the world (x86), a byte swap needs to be performed in software for the integer fields in the IP header.
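The byte swap itself is simple. The following is a minimal sketch of what such a swap does for 16-bit and 32-bit values, plus a run-time probe of the host's byte order; swap16, swap32 and host_is_little_endian are names chosen for this illustration, not library functions.

```c
#include <stdint.h>

/* Hypothetical helpers showing what a byte swap does. */

/* Exchange the two bytes of a 16-bit value. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse the four bytes of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8)  |
           ((v & 0x00ff0000u) >> 8)  |
           ((v & 0xff000000u) >> 24);
}

/* Returns 1 on a little endian host, 0 on a big endian host:
 * the value 1 is stored with its low byte at the lowest address
 * only on little endian machines. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}
```

On a little endian host a swap is needed to move between host and network order; on a big endian host the two orders already match and no swap is required.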
Below is the structure of iphdr from the Linux kernel. We use ntohs() (or ntohl() for 32-bit fields) before reading integer fields and htons() (or htonl()) before writing them. Essentially, these functions do nothing for big endian hosts and perform byte swapping for little endian hosts.
struct iphdr {
#if defined(__LITTLE_ENDIAN_BITFIELD)
    __u8    ihl:4,
            version:4;
#elif defined (__BIG_ENDIAN_BITFIELD)
    __u8    version:4,
            ihl:4;
#else
#error "Please fix <asm/byteorder.h>"
#endif
    __u8    tos;
    __u16   tot_len;
    __u16   id;
    __u16   frag_off;
    __u8    ttl;
    __u8    protocol;
    __u16   check;
    __u32   saddr;
    __u32   daddr;
    /* The options start here. */
};
Take a look at some interesting fields in the IP header:
version and ihl fields: According to the IP standard, version is the most significant four bits of the first byte of an IP header, and ihl is the least significant four bits of that same byte.
There are two methods to access these fields. Method 1 directly extracts them from the data. If ver_ihl holds the first byte of the IP header, then (ver_ihl & 0x0f) gives the ihl field and (ver_ihl >> 4) gives the version field. This works on hosts of either endianness.
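Method 1 can be sketched as a few small functions. The function names are assumptions made for this example; the conversion of ihl to bytes relies on the fact that the IP standard counts the header length in 32-bit words.

```c
#include <stdint.h>

/* Hypothetical helpers for Method 1: ver_ihl is the first byte of the
 * IP header, exactly as it arrived from the wire. */

/* version is the most significant four bits. */
uint8_t ip_version(uint8_t ver_ihl)
{
    return ver_ihl >> 4;
}

/* ihl is the least significant four bits. */
uint8_t ip_ihl(uint8_t ver_ihl)
{
    return ver_ihl & 0x0f;
}

/* ihl counts 32-bit words, so multiply by 4 to get the length in bytes. */
unsigned ip_header_bytes(uint8_t ver_ihl)
{
    return ip_ihl(ver_ihl) * 4u;
}
```

For the common first byte 0x45, this gives version 4 and an ihl of 5, that is, a 20-byte header with no options.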
Method 2 is to define the structure as above, then access these fields from the structure itself. In the above structure, if the host is little endian, then we define ihl before version; if the host is big endian, we define version before ihl. If we apply Kevin's Theory #2 here that an earlier defined field always occupies a lower memory address, we find that the above definition in C structure fits the IP standard pretty well.
saddr and daddr fields: these two fields can be treated as either byte arrays or integers. If they are treated as byte arrays, there is no need to do endianness conversion. If they are treated as integers, then conversions need to be performed as needed. Below is a function with the integer interpretation:
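#include <stdint.h>
#include <stdlib.h>   /* atoi() */
#include <string.h>   /* strchr() */

#define IP_ALEN 4     /* assumed here: bytes in an IPv4 address */

/* dot2ip - convert a dotted decimal string into an
 * IP address (integer interpretation, host byte order)
 */
uint32_t dot2ip(char *pdot)
{
    uint32_t i, my_ip;
    my_ip = 0;
    for (i = 0; i < IP_ALEN; ++i) {
        /* shift in the next octet, most significant first */
        my_ip = my_ip * 256 + atoi(pdot);
        /* strchr() replaces the legacy index() call */
        if ((pdot = strchr(pdot, '.')) == NULL)
            break;
        ++pdot;
    }
    return my_ip;
}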
And here is the function with the byte array interpretation:
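/* uses atoi() from <stdlib.h>, strchr() and memcpy() from <string.h> */
uint32_t dot2ip2(char *pdot)
{
    int i;
    uint32_t my_ip;
    uint8_t ip[IP_ALEN] = {0};  /* zeroed so a short string leaves no garbage */
    for (i = 0; i < IP_ALEN; ++i) {
        ip[i] = (uint8_t)atoi(pdot);
        if ((pdot = strchr(pdot, '.')) == NULL)
            break;
        ++pdot;
    }
    /* ip[] already holds the bytes in wire order; memcpy() copies them
     * out without relying on a possibly misaligned pointer cast */
    memcpy(&my_ip, ip, sizeof(my_ip));
    return my_ip;
}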
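To see how the two interpretations relate, here is a self-contained sketch that restates both conversions under different names (dot2ip_host and dot2ip_net, chosen only for this example) and checks that they differ by exactly one htonl() call. IP_ALEN is assumed to be 4, and htonl() comes from the POSIX header arpa/inet.h.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl(), POSIX */

#ifndef IP_ALEN
#define IP_ALEN 4        /* assumed: bytes in an IPv4 address */
#endif

/* Integer interpretation: the result is in host byte order. */
uint32_t dot2ip_host(const char *pdot)
{
    uint32_t ip = 0;
    int i;
    for (i = 0; i < IP_ALEN; ++i) {
        ip = ip * 256 + (uint32_t)atoi(pdot);
        if ((pdot = strchr(pdot, '.')) == NULL)
            break;
        ++pdot;
    }
    return ip;
}

/* Byte array interpretation: the bytes are stored in wire order. */
uint32_t dot2ip_net(const char *pdot)
{
    uint8_t bytes[IP_ALEN] = {0};
    uint32_t ip;
    int i;
    for (i = 0; i < IP_ALEN; ++i) {
        bytes[i] = (uint8_t)atoi(pdot);
        if ((pdot = strchr(pdot, '.')) == NULL)
            break;
        ++pdot;
    }
    memcpy(&ip, bytes, sizeof(ip));
    return ip;
}

/* Returns 1 when converting the host-order integer with htonl()
 * gives the same 32 bits as the byte array version. */
int interpretations_agree(const char *dotted)
{
    return htonl(dot2ip_host(dotted)) == dot2ip_net(dotted);
}
```

On a big endian host both functions return the same value directly; on a little endian host they differ, and htonl() bridges the gap.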
Comments
Memory bit order of int
On a little endian system with a 32-bit int, if one wants to twiddle bits and set them by memory order, would this be correct?
* To set bit 1,2,8,29 of int my_int in memory, in c, one would set bit:
If so, is there any trick to calculate the left-shift from a given value, i.e. "Set bit 29" - give 26.
As the left-shift order would be:
have tried with various ((set_bit%BITS_PER_INT / CHAR_BIT) * CHAR_BIT) + ...; variants but always become a big huge pile of mods and so on.
Any good trick for this?
(Perhaps I'm thinking about this wrong; it's getting too late.)
Bit order?
What is bit order? In which machines do bits have addresses?
Ethernet Address Endianness
I think your description of Ethernet Addressing is mistaken. In your example where the MAC address 12:34:56:78:9a:bc, you say that "12" will appear on the line first. This is not correct. The "bc" will appear first. Refer to section 3.2 of the 802.3 spec. It explicitly states the byte ordering of the Length field and the CRC are high-order byte first. So, I'm led to believe that the SA and DA are low-order byte first.
This would make sense because we know that the first bit on the wire determines multicast or unicast and that this is the LSB of the entire field...which is the last byte (not of the 1st byte).
Void
Nevermind my previous posting. It was a late night.
Errata: dot2ip() function
It's incomplete in this on-line version which should be:
/* dot2ip - convert a dotted decimal string into an
* IP address
*/
uint32_t dot2ip(char *pdot)
{
uint32_t i,my_ip;
my_ip=0;
for (i=0; i<IP_ALEN; ++i) {
my_ip = my_ip*256+atoi(pdot);
if ((pdot = (char *) index(pdot, '.')) == NULL)
break;
++pdot;
}
return my_ip;
}
Re: Errata: dot2ip() function
Already fixed by LJ, thanks!
Errata for ASCII Graphs
Most of the ASCII graphs inlined in this on-line version of
the article are not formatted properly.
I'm contacting LJ to correct the format. In the meanwhile
you can reference my original article here if you get
confused by the ASCII graphs:
http://www.employees.org/~hek2000/articles/endianess-v0.7.html
Re: Errata for ASCII Graphs
Already fixed by LJ. Thanks!
Re: Byte and Bit Order Dissection
1. A great article;
2. I suggest you create a HOWTO in the Linux Documentation Project (www.tldp.org) so that more people can benefit from your article;
3. As I know, bit0 is the MSB in Motorola PowerPC Manual; maybe you should clarify your bit numbering explicitly;
Re: Byte and Bit Order Dissection
Thank you!
I'll consider the HOWTO suggestion.
About the 3rd comment, have you seen the "Typo"
discussion thread other readers brought up ?
Hopefully my correction to the typo can address your
doubt too.
- kevin
Typo?!
"That is, in a big endian system the most significant bit is stored at the lowest bit address; in a little endian system, the least significant bit is stored at the lowest bit address."
Re: Typo?! -- -Yes, it's an error
In fact, it is an error. In the original article I submitted
to LJ , I wrote:
"That is, in a big endian system, the most significant
bit is stored at the lowest bit address and in a
little endian system, the least significant bit is
stored at the lowest bit address." ---- Correct
^^^^^^
But somehow it was changed to the following
in the on-line version without notifying me.
"That is, in a big endian system, the most significant
bit is stored at the lowest bit address and in a
little endian system, the least significant bit is
stored at the highest bit address." --- Wrong
^^^^^^^^
I'm contacting LJ to correct this error now,
in the meanwhile please reference my original
sentence.
Thanks,
Kevin
Big and Little Endians
Thank you for the pow wow concerning big endians and little endians. One thing is clear: although there are several kinds of endians, there are neither good endians nor bad endians. It would be nice to have but one type of endian, but uniting all endian tribes of thought under one teepee is not likely for the foreseeable future. Nevertheless, it would be nice to hold a big council, so let me know when and where, and I'll make a reservation to attend.
Re: Typo?!
The sentence you quoted follows "Bit order usually follows the same endianness as the byte order for a given computer system. ".
So I'm illustrating what the bit order will look like if it follows
the byte order on the same architecture. In other
words, in some systems where bit order doesn't follow
byte order, the quoted sentence is not applicable.
Thanks,
Kevin
Re: Typo?!
No not really. What is meant is that in big-endian, bit 0 is the most significant bit and in little-endian, bit 7 is the most significant bit (for a single byte).
Example:
In most RISC architectures a 64-bit bus would be represented as 64bus
In an Intel system a 64-bit bus would be represented as 64bus
Re: Byte and Bit Order Dissection
You left out everyone's favorite forgotten case: Middle endian! And a mention of the origin of "endian" (we have the Lilliputians to thank for this).
Seriously, though -- good article.
Re: Byte and Bit Order Dissection
Thank you for the input. It must be more complete to include
"Middle Endian" in the discussion.
On the other hand, I have a word count limitation for the article
which forces me to include only the most typical cases :p
Kevin
Re: Byte and Bit Order Dissection
No, I was kidding about Middle Endian. It's an obsolete format (or rather, _they're_ obsolete formats). But no byte order discussion is complete without a mention of "Gulliver's Travels". Right after "First introduced by Danny Cohen in 1980, it describes the method a computer system uses to represent multi-byte integers." should be something like, "This was a reference to the disagreement about which side of an egg was the proper side to crack first."