Byte and Bit Order Dissection
Editors' Note: This article has been updated since its original posting.
Software and hardware engineers who have to deal with byte and bit order issues know the process is like walking a maze. Though we usually come out of it, we consume a handful of our brain cells each time. This article tries to summarize the various areas in which the business of byte and bit order plays a role, including CPU, buses, devices and networking protocols. We dive into the details and hope to provide a good reference on this topic. The article also tries to suggest some guidelines and rules of thumb developed from practice.
We probably are familiar with the word endianness. First introduced by Danny Cohen in 1980, it describes the method a computer system uses to represent multi-byte integers.
Two types of endianness exist, big endian and little endian. Big endian refers to the method that stores the most significant byte of an integer at the lowest byte address. Little endian is the opposite; it refers to the method of storing the most significant byte of an integer at the highest byte address.
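As a rough sketch of the two definitions above, the helpers below write a 32-bit value into a byte buffer in each order, using shifts so the result does not depend on the host CPU. The names store_be32 and store_le32 are made up for this illustration and do not come from any library.

```c
#include <stdint.h>

/* Hypothetical helper: big endian layout, most significant byte
 * at the lowest address (dst[0]). */
void store_be32(uint8_t *dst, uint32_t value)
{
    dst[0] = (uint8_t)(value >> 24);
    dst[1] = (uint8_t)(value >> 16);
    dst[2] = (uint8_t)(value >> 8);
    dst[3] = (uint8_t)value;
}

/* Hypothetical helper: little endian layout, least significant byte
 * at the lowest address (dst[0]). */
void store_le32(uint8_t *dst, uint32_t value)
{
    dst[0] = (uint8_t)value;
    dst[1] = (uint8_t)(value >> 8);
    dst[2] = (uint8_t)(value >> 16);
    dst[3] = (uint8_t)(value >> 24);
}
```

For 0x0a0b0c0d, store_be32 leaves 0a 0b 0c 0d at addresses 0 through 3, while store_le32 leaves 0d 0c 0b 0a.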
Bit order usually follows the same endianness as the byte order for a given computer system. That is, in a big endian system the most significant bit is stored at the lowest bit address; in a little endian system, the least significant bit is stored at the lowest bit address.
Every effort is made to avoid bit swapping in software when designing a system, because bit swapping is both expensive and tedious. Later sections describe how hardware takes care of it.
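To give a feel for why software bit swapping is avoided, here is a minimal sketch of reversing the bit order of a single byte. The function name reverse_bits8 is an assumption for this example. It is the plain one-bit-per-iteration approach; faster table-driven or mask-based variants exist, but each still costs extra work for every byte.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the bit order within one byte.
 * Each iteration moves the current lowest bit of b into r. */
uint8_t reverse_bits8(uint8_t b)
{
    uint8_t r = 0;
    int i;

    for (i = 0; i < 8; ++i) {
        r = (uint8_t)((r << 1) | (b & 1u));
        b >>= 1;
    }
    return r;
}
```

For example, reverse_bits8(0x0d) turns binary 00001101 into 10110000, that is, 0xb0.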
Just as most people write a number from left to right, the layout of a multi-byte integer should flow from left to right, that is, from the most significant to the least significant byte. This is the clearest way to write integers, as we can see in the following examples.
Here is how we would write the integer 0x0a0b0c0d for both big endian and little endian systems, according to the rule above:
Write Integer for Big Endian System
byte addr       0        1        2        3
bit offset   01234567 01234567 01234567 01234567
binary       00001010 00001011 00001100 00001101
hex             0a       0b       0c       0d
Write Integer for Little Endian System
byte addr       3        2        1        0
bit offset   76543210 76543210 76543210 76543210
binary       00001010 00001011 00001100 00001101
hex             0a       0b       0c       0d
In both cases above, we can read from left to right and the number is 0x0a0b0c0d.
If we do not follow the rule, we might write the number in the following way:
byte addr       0        1        2        3
bit offset   01234567 01234567 01234567 01234567
binary       10110000 00110000 11010000 01010000
As you can see, it's hard to make out what number we're trying to represent.
Without losing generality, a simplified view of the computer system discussed in this article is drawn below.

The CPU, local bus and internal memory/cache are all treated as the CPU, because they usually share the same endianness. Discussion of bus endianness, however, covers only the external bus. The CPU register width, memory word width and bus width are assumed to be 32 bits for this article.
The CPU endianness is the byte and bit order in which it interprets multi-byte integers from on-chip registers, local bus, in-line cache, memory and so on.
Little endian CPUs include Intel x86 and DEC. Big endian CPUs include Motorola 680x0, Sun SPARC and IBM (e.g., PowerPC). MIPS and ARM can be configured either way.
The CPU endianness affects the CPU's instruction set. Different GNU C toolchains must be used to compile C code for CPUs of different endianness. For example, mips-linux-gcc and mipsel-linux-gcc are used to compile MIPS code for big endian and little endian, respectively.
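Because the toolchain is tied to the target's endianness, it can also report that choice to the code at compile time. The sketch below assumes a GCC or Clang compiler, which predefine __BYTE_ORDER__, __ORDER_BIG_ENDIAN__ and __ORDER_LITTLE_ENDIAN__. Other compilers may not provide these macros, in which case the function reports "unknown". The function name target_byte_order is made up for this example.

```c
/* Hypothetical helper: report the byte order the compiler targets.
 * Relies on macros predefined by GCC and Clang (an assumption about
 * the toolchain); falls back to "unknown" when they are missing. */
const char *target_byte_order(void)
{
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
    return "big";
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
    return "little";
#else
    return "unknown";
#endif
}
```

With the toolchains named above, one would expect mips-linux-gcc to produce "big" and mipsel-linux-gcc to produce "little".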
If one accesses the whole 32-bit integer, the CPU endianness is invisible to software programs. It does have an impact, however, when we need to access part of a multi-byte integer. The following program illustrates that situation.
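#include <stdint.h>
#include <stdio.h>
/* Overlay a 32-bit integer with its four constituent bytes. */
typedef union {
    uint32_t my_int;
    uint8_t  my_bytes[4];
} endian_tester;

int main(void)
{
    endian_tester et;
    et.my_int = 0x0a0b0c0d;
    /* my_bytes[0] is the byte stored at the lowest address. */
    if (et.my_bytes[0] == 0x0a)
        printf("I'm on a big-endian system\n");
    else
        printf("I'm on a little-endian system\n");
    return 0;
}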
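In contrast to the union test, a byte can also be picked out of the integer arithmetically. Shifts operate on the value as a whole rather than on its memory layout, so the sketch below should return the same result on big endian and little endian CPUs. The name byte_of is an assumption for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: return byte n of value, where n = 0 selects the
 * least significant byte and n = 3 the most significant one.
 * The caller must keep n in the range 0..3. */
uint8_t byte_of(uint32_t value, unsigned n)
{
    return (uint8_t)(value >> (8u * n));
}
```

For 0x0a0b0c0d, byte_of(value, 3) is 0x0a and byte_of(value, 0) is 0x0d, regardless of which byte the CPU happens to store at the lowest address.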
Comments
Memory bit order of int
On a little endian system where int holds 32 bits, if one wants to bit twiddle and set bits by memory order, would this be correct?
* To set bit 1,2,8,29 of int my_int in memory, in C, one would set bit:
If so, is there any trick to calculate the left-shift from a given value, i.e. "Set bit 29" gives 26?
As the left-shift order would be:
I have tried various ((set_bit%BITS_PER_INT / CHAR_BIT) * CHAR_BIT) + ...; variants, but they always become a big pile of mods and so on.
Any good trick for this?
(Perhaps I'm thinking about this wrong; it's getting too late.)
Bit order?
What is bit order? In which machines do bits have addresses?
Ethernet Address Endianness
I think your description of Ethernet addressing is mistaken. In your example, where the MAC address is 12:34:56:78:9a:bc, you say that "12" will appear on the line first. This is not correct. The "bc" will appear first. Refer to section 3.2 of the 802.3 spec. It explicitly states that the byte ordering of the Length field and the CRC is high-order byte first. So, I'm led to believe that the SA and DA are low-order byte first.
This would make sense because we know that the first bit on the wire determines multicast or unicast and that this is the LSB of the entire field...which is the last byte (not of the 1st byte).
Void
Nevermind my previous posting. It was a late night.
Errata: dot2ip() function
It's incomplete in this on-line version which should be:
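#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* dot2ip - convert a dotted decimal string into an
 * IP address
 */
uint32_t dot2ip(char *pdot)
{
    uint32_t i, my_ip;

    my_ip = 0;
    /* An IPv4 address has four octets. */
    for (i = 0; i < 4; ++i) {
        my_ip = my_ip * 256 + atoi(pdot);
        /* Advance to the next '.'; stop if there is none. */
        if ((pdot = strchr(pdot, '.')) == NULL)
            break;
        ++pdot;
    }
    return my_ip;
}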
Re: Errata: dot2ip() function
Already fixed by LJ, thanks!
Errata for ASCII Graphs
Most of the ASCII graphs inlined in this on-line version of
the article are not formatted properly.
I'm contacting LJ to correct the format. In the meantime,
you can refer to my original article here if you are
confused by the ASCII graphs:
http://www.employees.org/~hek2000/articles/endianess-v0.7.html
Re: Errata for ASCII Graphs
Already fixed by LJ. Thanks!
Re: Byte and Bit Order Dissection
1. A great article;
2. I suggest you create a HOWTO in the Linux Documentation Project (www.tldp.org) so that more people can benefit from your article;
3. As far as I know, bit0 is the MSB in the Motorola PowerPC manual; maybe you should clarify your bit numbering explicitly.
Re: Byte and Bit Order Dissection
Thank you!
I'll consider the HOWTO suggestion.
About the 3rd comment, have you seen the "Typo"
discussion thread other readers brought up ?
Hopefully my correction to the typo can address your
doubt too.
- kevin
Typo?!
"That is, in a big endian system the most significant bit is stored at the lowest bit address; in a little endian system, the least significant bit is stored at the lowest bit address."
Re: Typo?! -- -Yes, it's an error
In fact, it is an error. In the original article I submitted
to LJ , I wrote:
"That is, in a big endian system, the most significant
bit is stored at the lowest bit address and in a
little endian system, the least significant bit is
stored at the lowest bit address." ---- Correct
^^^^^^
But somehow it was changed to the following
in the on-line version without notifying me.
"That is, in a big endian system, the most significant
bit is stored at the lowest bit address and in a
little endian system, the least significant bit is
stored at the highest bit address." --- Wrong
^^^^^^^^
I'm contacting LJ to correct this error now,
in the meanwhile please reference my original
sentence.
Thanks,
Kevin
Big and Little Endians
Thank you for the pow wow concerning big endians and little endians. One thing is clear, although there are several kinds of endians, there are neither good endians nor bad endians. It would be nice to have but one type of endian, but uniting all endian tribes of thought under one teepee is not likely for the foreseeable future. Nevertheless, it would be nice to hold a big council, so let me know when and where, and I'll make a reservation to attend.
Re: Typo?!
The sentence you quoted follows "Bit order usually follows the same endianness as the byte order for a given computer system. ".
So I'm illustrating what the bit order will look like if it follows
the byte order on the same architecture. In other
words, in systems where bit order doesn't follow
byte order, the quoted sentence is not applicable.
Thanks,
Kevin
Re: Typo?!
No not really. What is meant is that in big-endian, bit 0 is the most significant bit and in little-endian, bit 7 is the most significant bit (for a single byte).
Example:
In most RISC architectures a 64-bit bus would be represented as 64bus
In an Intel system a 64-bit bus would be represented as 64bus
Re: Byte and Bit Order Dissection
You left out everyone's favorite forgotten case: Middle endian! And a mention of the origin of "endian" (we have the Lilliputians to thank for this).
Seriously, though -- good article.
Re: Byte and Bit Order Dissection
Thank you for the input. It would be more complete to include
"Middle Endian" in the discussion.
On the other hand, I have a word count limitation for the article,
which forces me to include only the most typical cases :p
Kevin
Re: Byte and Bit Order Dissection
No, I was kidding about Middle Endian. It's an obsolete format (or rather, _they're_ obsolete formats). But no byte order discussion is complete without a mention of "Gulliver's Travels". Right after "First introduced by Danny Cohen in 1980, it describes the method a computer system uses to represent multi-byte integers." should be something like, "This was a reference to the disagreement about which side of an egg was the proper side to crack first."