Work the Shell - Understanding Exit Codes
Last month, we looked at signals, the rudimentary mechanism that processes on a Linux box can use to communicate events and state changes. We talked about how each of the signals can be sent manually to a running process with the kill command, and how shell scripts can then catch and respond to specific signals (though not all of them—some can't be caught because they're handled by the operating system itself).
Analogous to signals, exit codes turn out to be an easy way for processes to communicate state back to the calling parent process, in a way that most Linux users just ignore. Not anymore—this month, we're going to take a closer look.
Let's start with a simple Linux command that everyone's probably already mastered: mv, which moves a file or directory from one spot in the filesystem to another (and/or renames it).
As you know, you can generate errors if the file you're moving is missing, the destination doesn't exist and so on. Here's a quick example:
$ mv missing ~/missing2
mv: cannot move `missing' to `.../missing2': No such file or directory
You see an error; obviously, it didn't work. Ah, but behind the scenes, a numeric “return code” variable has been set in the shell too, something you can test and respond to within a shell script. Check out this sequence:
$ echo $?
0
$ mv missing ~/missing2
mv: cannot move `missing' to `.../missing2': No such file or directory
$ echo $?
1
$ echo test me
test me
$ echo $?
0
If no error occurs when executing a command, the exit code (which we reference in the shell with the shorthand $?) will have the value of 0: no error. Now, if I run a command that fails, the exit code will have a nonzero value. In the case of the failing mv command above, the error code will have the value of 1. And, if I now run yet another command, one which runs without error, the error code is reset to zero.
Now, let's take a peek at the mv man page, paying particular attention to the latter part of the doc. Close examination reveals: “The mv utility exits 0 on success, and >0 if an error occurs.”
That's not so interesting, really. The grep command has more interesting diagnostics, actually: “Normally, exit status is 0 if selected lines are found and 1 otherwise. But the exit status is 2 if an error occurred, unless the -q or --quiet or --silent option is used and a selected line is found.”
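Those three distinct values make it easy to tell "no match" apart from "grep itself failed" inside a script. Here's a minimal sketch along those lines; the pattern and filename are just placeholders for illustration:

#!/bin/sh
# grep exit status: 0 = match found, 1 = no match, 2 = error (bad file and so on)
grep -q 'error' /var/log/syslog
status=$?

if [ $status -eq 0 ] ; then
  echo "found at least one matching line"
elif [ $status -eq 1 ] ; then
  echo "no matching lines"
else
  echo "grep failed with exit code $status"
fi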
There is a set of system exit codes that are defined, although it's possible you'll never need the information. Here's a list of the codes and their meanings:
1: general errors
2: misuse of shell builtins (pretty rare)
126: cannot invoke requested command
127: command not found error
128: invalid argument to “exit”
128+n: fatal error signal “n” (for example, kill -9 = 137).
130: script terminated by Ctrl-C
I'd never actually seen this list until I started digging into the issue of exit codes, so you can continue on your merry shell-scripting path safely without worrying about the data above.
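Still, a couple of these codes are easy to see in action straight from the command line. The following transcript is just an illustration (the bogus command name is made up, and the exact error wording varies by shell):

$ nosuchcommand
sh: nosuchcommand: command not found
$ echo $?
127
$ touch notexec ; ./notexec
sh: ./notexec: Permission denied
$ echo $?
126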
The most common situation in which you analyze and respond to an exit code is in error handling in a script.
Here's a simple snippet where you want to create a directory. If it fails, you want to output an error message and quit:
#!/bin/sh
mkdir /usr
echo \$? = $?
if [ $? -ne 0 ] ; then
  echo "mkdir /usr failed: we have an exit code of $?"
  exit 1
fi
echo "made the requested directory. Why is '/' world writable?"
exit 0
It turns out, there's a nuance to working with $? that I've already alluded to, one that makes output statements like the first echo quite problematic. You can see why in the output:
$ ./test.sh
mkdir: /usr: File exists
$? = 1
made the requested directory. Why is '/' world writable?
Can you see what happened? The exit code is 1 immediately after the mkdir, which makes sense because /usr already exists, but by the time we test the exit code in the conditional, it's no longer a nonzero value!
Why? Because at that point, it indicates the exit code of the echo statement, not the mkdir command. Oops.
You can verify this simply by commenting out the first echo statement, in which case you now see this as the command output:
$ ./test.sh
mkdir: /usr: File exists
mkdir /usr failed: we have an exit code of 0
That makes more sense, doesn't it?
Because this can be tricky, a common thing I see in really bulletproof shell scripts with lots of error handling is something like this:
#!/bin/sh
mkdir /usr
error=$?
if [ $error -ne 0 ] ; then
  echo "mkdir /usr failed: we have an exit code of $error"
  exit 1
fi
This is one instance where a local variable to hold a system or global variable makes a lot of sense, and it also lets you do things like have an error message show up on-screen and be handed off to something like syslog() to ensure that the admin sees it at some point.
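From a shell script, the usual way to hand a message to syslog is the standard logger utility. Here's a minimal sketch of that idea, building on the snippet above; the "myscript" tag is just an illustrative name:

#!/bin/sh
mkdir /usr
error=$?
if [ $error -ne 0 ] ; then
  # report on-screen and also send the message to syslog via logger(1)
  echo "mkdir /usr failed: we have an exit code of $error"
  logger -t myscript "mkdir /usr failed: exit code $error"
  exit 1
fi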
Of course, error handling doesn't always just need to print a message and exit. Another scenario might be the following:
alternates='
http://www.example.com/test.pdf
http://www.example2.com/test.pdf
http://www.example3.com/test.pdf
'
gotit=0
for file in $alternates
do
  wget $file
  if [ $? -ne 0 ]; then
    echo "Unable to get $file"
  else
    gotit=1
    break
  fi
done
...
Here, we try to retrieve a file from one of multiple alternate locations and exit the loop only when we succeed (or when we've run out of possibilities).
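The elided part after the loop would presumably check whether any of the attempts actually worked; a sketch of that follow-up (not shown in the original script) might look like this:

if [ $gotit -eq 0 ] ; then
  echo "Unable to retrieve the file from any of the alternate locations"
  exit 1
fi
echo "Successfully retrieved $file"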
Dave Taylor has been hacking shell scripts for over thirty years. Really. He's the author of the popular "Wicked Cool Shell Scripts" and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.