Work the Shell - Handling Errors and Making Scripts Bulletproof

Shell scripts may be quick, easy and lightweight, but proper scripting includes the ability to anticipate and respond to error situations gracefully and without anything breaking. Dave explores some of the basic shell script error-handling options.

I realize I've been playing a bit fast and loose with my shell scripts over the last few months, because I haven't talked about how to ensure that error conditions don't break things. If you read the Letters section in Linux Journal, you know I haven't covered this topic because, well, you have covered it for me!

This topic ranges from the simple to the sophisticated, so let's start with a basic test: the return status after an application or utility is invoked.

The Magical $? Sequence

Different shells have different return status indicators (the C shell, for example, uses $status), but the most basic is Bash/the Bourne shell, which is what we've focused on since I started writing Work the Shell, and it uses $?.

Here's a quick example:


mkdir /
echo "return status is $?"

mkdir /tmp/foobar
echo "return status is $?"

rmdir /tmp/foobar
echo "return status is $?"

rmdir /tmp
echo "return status is $?"

exit 0

Run this, and you can see the difference between commands that succeed and those that fail:

mkdir: /: Is a directory
return status is 1
return status is 0
return status is 0
rmdir: /tmp: Not a directory
return status is 1

You can see that when invoking mkdir or rmdir with an error condition, they output an error and—the important part—the $? return status is nonzero.

In fact, check out the man page for a typical command like mkdir, and you'll see: “DIAGNOSTICS: The mkdir utility exits 0 on success, and >0 if an error occurs.”

In a perfect world, the >0 return code would actually tell you what happened, but although that's true with the functions accessible via software, it's not true for the shell.

On the other hand, it's still helpful to explore how to make a shell function that does error handling too. Here's a basic example function:

   mkdir $1

   echo "return status is $status"

This just makes a simple function that calls mkdir, and it should be no surprise that it works as follows if I invoke it three times—twice in error situations and once without an error:

mkdir: /: Is a directory
return status is 1
mkdir: /tmp/foobar: File exists
return status is 1

It's a drag to have mkdir generate an error message when you can produce your own simply by testing the $? status variable.

Here's how you can do just that:

   mkdir $1 2>&1 > /dev/null

   echo "makedirectory failed trying to make $1 (error $status)"

This is a bit tricky to understand, because you have to suppress the error message from mkdir so you can generate your own. That's done by redirecting standard error to standard out (the 2>&1 sequence) and then redirect standard output to /dev/null (the > /dev/null sequence).

Tip: there's a shorthand you could use here too, if you wanted to be a bit more cryptic: &>/dev/null.

Now when running this, however, the output is far more sophisticated:

makedirectory failed trying to make / (error 1)
makedirectory failed trying to make /tmp/foobar (error 1)

That's a nice way to deal with errors, and of course, the function can also return the error code, with return $status as the last line.

Using test to Avoid Error Conditions

The best way to handle errors is to capture error conditions beforehand. This is best done with the wonderful and powerful test command. For example, the two typical error conditions that you'd encounter with the makedirectory() function are the directory already existing or the script not having permission to create the directory.

The first is pretty easy to test:

if [ -d "$1" ] ; then
  echo "Error: directory $1 already exists."
  exit 0

The second is a bit trickier because you need to grab the parent directory portion of the requested directory then test it to see whether you have write and execute permission to create the subdirectory. This can be done with the dirname function (which returns . if there's no explicit directory given), followed by a test for -w for writeable and -x for executable.

It all combines like this:

parentdir="$(dirname $1)" 
if [ ! -x $parentdir -o ! -w $parentdir ] 
  echo "Uh oh, can't create requested directory $1"
  exit 0

This is a sophisticated use of the test command, but read “!” as “not” and “-o” as “or”, and you can see the test is “if not executable $parentdir or not writeable $parentdir then...”, and that should make sense!


Dave Taylor has been hacking shell scripts for over thirty years. Really. He's the author of the popular "Wicked Cool Shell Scripts" and can be found on Twitter as @DaveTaylor and more generally at

Geek Guide
The DevOps Toolbox

Tools and Technologies for Scale and Reliability
by Linux Journal Editor Bill Childers

Get your free copy today

Sponsored by IBM

Upcoming Webinar
8 Signs You're Beyond Cron

Scheduling Crontabs With an Enterprise Scheduler
11am CDT, April 29th
Moderated by Linux Journal Contributor Mike Diehl

Sign up now

Sponsored by Skybot