od—The Oddest Text Utility Around
Suppose you are writing the next great spreadsheet for Linux, and you're actually getting along pretty well. You have a program that can edit cells, format the screen, and do all the really good spreadsheet stuff. And you can even save the sheets in a user-specified file. But then you make a change to the format of the file, and you realize you need to examine the file, byte by byte, in order to determine what went wrong with that last change. You know that Emacs can show you the file, but you can't remember exactly how to get into hexadecimal mode, or what to do once you are there.
Or suppose you are writing a viewer program for your favorite word-processor, which runs only under your second favorite operating system (WINE and DOSEMU notwithstanding). So you need to figure out exactly what each binary code in that .wpd file really is, so that you can determine what each binary code does by trial and error. (What a trial and error process that would be.)
Or maybe you are curious about exactly what escape sequence is sent to your terminal when a curses program positions the cursor. (Maybe this example is a bit contrived, but it's interesting nonetheless.)
If any of these scenarios describes your current dilemma, then od is just the utility for you. od stands for Octal Dump, because it was named before computer users started using hexadecimal for everything, and because it can dump a file (binary or not) into almost any form you can imagine.
So let's see what can be done with od. The easiest thing to try is to get an octal dump of od itself. Listing 1 shows the first 6 lines of output of the command od `which od`. (Note: I'm using an older a.out version of od, so this might not be exactly what you see on your system.) By way of explanation, the first column is the “offset” into the file, and the remaining columns are the actual data in the file.
There are three things to notice about this listing. First, all the numbers are in octal, or base 8. I'm not aware of anyone using octal notation for anything anymore. And of course, with the GNU version of od, there are options for changing how everything gets displayed—more on that later.
Second, all the numbers are 16 bits wide. Since Linux is a 32-bit operating system, this is probably not what you want. Again, there are ways of modifying this behavior.
Third, the third line of output contains a single *. This is od's way of saying that there are many lines just like the previous line, which have been removed from the output. It then continues the output at offset (octal) 2000, which is the first line that differs from the previous line. (Can you guess that this behavior can also be modified? It can.)
As mentioned earlier, od has many options for formatting the output. The first one to mention is -t xS or -t xL, which will cause the output to be in hexadecimal (base 16). The S or L modifier tells od to read a C short (16 bits) or a C long (32 bits on my machine, 64 bits on most 64-bit Linux systems) at a time. To all you C programmers, yes, those modifiers stand for “short” and “long.” There are other modifiers as well, and good descriptions for them can be found in the man page for od. Listing 2 shows the first six lines of output of the command:
od -t xS `which od`
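To see what the size modifiers change, it helps to dump input whose bytes are known in advance. The following is a minimal sketch, assuming a POSIX shell and GNU od. The eight-byte test string is made up for illustration, and -A n (which suppresses the offset column) is used only to keep the output short:

```shell
# One byte per column (C = char): the order matches the input exactly.
bytes=$(printf 'abcdefgh' | od -A n -t xC)
printf '%s\n' "$bytes"    # 61 62 63 64 65 66 67 68 (column spacing may vary)

# Two bytes (S = short) and four bytes (I = int) per column: the same data,
# regrouped. How the bytes are arranged inside each group depends on the
# machine's byte order.
printf 'abcdefgh' | od -A n -t xS
printf 'abcdefgh' | od -A n -t xI
```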
od can also output the characters of the file. And if you want to do some comparisons, you can intersperse the hexadecimal output with the character output. Just give both types on the command line (see Listing 3) as:
od -t xS -t c `which od`
There are a couple of things to note about this example. The character-type arguments don't take a size modifier—they just read one character at a time. That's why we used -t c and not -t cS.
Also, the ordering of the character data looks strange. The first 4 bytes in the hexadecimal dump are 010b 0064, while the first 4 bytes in the character dump are \v 001 d \0. This is because my Linux machine runs on an Intel-based chip set, which is a little-endian architecture: the low-order byte of each 16-bit word is stored first, so the bytes 0b 01 in the file print as the word 010b. Other architectures will print this differently. In fact, this is the easiest way I know to determine whether the machine you are running on is big-endian or little-endian. The actual command to determine this would be something like:
echo abcd | od -t xS
A little-endian machine would output:
0000000 6261 6463 000a
while a big-endian machine would output:
0000000 6162 6364 0a00
I haven't actually seen Linux on a SPARC or a DEC Alpha chip; I would guess these Linux systems would be big-endian.
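The same check can be wrapped in a small script. This is only a sketch: byte_order is a hypothetical helper name, and it assumes an od that accepts -A n (no offset column) and -t x2 (two-byte hexadecimal words).

```shell
# Classify the machine's byte order from how od groups the two
# bytes 'a' (0x61) and 'b' (0x62) into a single 16-bit word.
byte_order() {
    word=$(printf 'ab' | od -A n -t x2 | tr -d ' \n')
    case "$word" in
        6261) echo "little-endian" ;;   # low-order byte came first
        6162) echo "big-endian" ;;      # high-order byte came first
        *)    echo "unknown" ;;
    esac
}

byte_order
```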
Let's get back to the last example. Notice that the character output of the last example has a lot of backslashes in it. This is one method od uses to show that the character it is trying to print is really not a printable character. Another method is to show the character in octal. Examples of the first method are \v and \0 and (at offset 2024). Examples of the second method are 001 and 315 (at offsets 0001 and 2017 respectively). (Offsets are still in octal; we're getting to that problem.)
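Both notations can be reproduced on a few hand-picked bytes, which also shows the hexadecimal and character lines side by side. A minimal sketch, assuming GNU od; the four-byte input is invented for the example:

```shell
# Four hand-picked bytes: 'A', vertical tab, NUL, and the byte with value 1.
escapes=$(printf 'A\v\0\001' | od -A n -t xC -t c)
printf '%s\n' "$escapes"
# The first line holds the hex values 41 0b 00 01. On the character line,
# od prints the vertical tab and NUL as the backslash escapes \v and \0,
# and the byte with value 1 as the octal number 001.
```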
If you really hate octal, and want to see the offsets in a different base, od allows that. The option is -A x to see the offsets in hexadecimal, or -A d to show the offsets in decimal. (Enough of showing listings of these commands—just do it.)
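For anyone who wants a starting point, here is a small sketch that dumps the same input with each offset base. The sample sentence is arbitrary; it is 43 characters long, so with the trailing newline the input is 44 bytes.

```shell
text='The quick brown fox jumps over the lazy dog'

# The last line of every dump is the total length of the input, printed
# in the selected base: 54 in octal, 44 in decimal, 2c in hexadecimal
# (each padded with leading zeros).
printf '%s\n' "$text" | od -A o -t c
printf '%s\n' "$text" | od -A d -t c
printf '%s\n' "$text" | od -A x -t c
```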
You might have noticed that od always shows 16 bytes per line. Of course, you can change this as well, by using the -w option. The argument after the -w flag is the number of bytes to read before outputting a line of text. The default without the -w flag is 16 (as you can see from all the examples). The default with the -w flag (i.e. -w by itself) is 32. Unfortunately, I couldn't get this option to work on my machine. Every number I gave (-w20, -w18, -w16) caused od to report “invalid width specification.” (I'm using GNU textutils version 1.9, for what it's worth.)
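With the od shipped in current GNU coreutils, the width is accepted when the number is attached directly to the flag. The sketch below assumes such a version is installed; other od implementations may not support -w at all.

```shell
# 16 known bytes, printed 8 per line instead of the default 16.
# -A n drops the offset column so only the data remains.
narrow=$(printf 'abcdefghijklmnop' | od -A n -t xC -w8)
printf '%s\n' "$narrow"
# Two lines come out: the bytes for a-h, then the bytes for i-p.
```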
Sometimes you want to see the whole file, and not suppress any output. The -v option tells od to not skip any lines, and to output everything. This can be useful if you need to compare two different binary files, and you want to compare the actual bytes in the files, without skipping any of the output.
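The difference is easy to see on input made of one repeated byte. A minimal sketch, assuming GNU od; the 64-byte input (the character 0 repeated 64 times) is generated with printf purely for illustration:

```shell
# Default behavior: the first line is printed, the identical lines that
# follow collapse into a single "*", and the final offset closes the dump.
short=$(printf '%064d' 0 | od -t xC)
printf '%s\n' "$short"

# With -v, all four 16-byte lines are printed before the final offset.
full=$(printf '%064d' 0 | od -v -t xC)
printf '%s\n' "$full"
```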
Finally, all of these options have a long format, as is standard with GNU utilities. For example, the -v switch can be expanded to --output-duplicates. I tend to use the long form in scripts, so it is clear to others exactly what options I'm sending to the program, and the short forms when I'm just working.
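As a quick illustration, the two commands below should produce identical dumps. The long option names are those of GNU od (--address-radix, --format, --output-duplicates); the five-byte input is just a convenient sample.

```shell
# Short options, as typed at the prompt.
short_form=$(printf 'abcd\n' | od -A x -t xC -v)

# The same request spelled out, as it might appear in a script.
long_form=$(printf 'abcd\n' | od --address-radix=x --format=xC --output-duplicates)

[ "$short_form" = "$long_form" ] && echo "same output"
```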
So, how exactly do you see the escape sequence sent to your terminal when a curses program positions the cursor? Try the command:
tput cup 10 10 | od -t c
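What tput emits depends on the TERM setting, so the exact bytes vary from terminal to terminal. As an illustration only, assume an xterm-like terminal, where the cursor-position sequence is ESC [ row ; column H with coordinates counted from 1. Under that assumption, the bytes for row 10, column 10 can be written out by hand and fed to od:

```shell
# Assumed sequence for an xterm-like terminal; other terminals may differ.
seq_dump=$(printf '\033[11;11H' | od -A n -t c)
printf '%s\n' "$seq_dump"
# ESC is not printable, so od shows it as the octal value 033.
# The rest of the sequence is made of ordinary printable characters.
```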
Randy Zack can be reached via e-mail at email@example.com.