Variable Mangling in Bash with String Operators

March 13th, 2006 by Pat Eyler

Here's a quick and updated HOWTO for using string operators in bash to manipulate variables.

Editor's Note: This article has been updated by its author. Thank you, Pat.

Have you ever wanted to change the names of many files at once? Or, have you ever needed to use a default value for a variable that has no value? These and many other options are available to you when you use string operators in bash and other Bourne-derived shells.

String operators allow you to manipulate the contents of a variable without having to write your own shell functions to do so. They are provided through "curly brace" syntax. Any variable reference such as $foo can also be written as ${foo} without changing its meaning. The braces often are used to separate a variable name from the characters that follow it.

$ export foo=foo 
$ echo ${foo}bar # foo exists so this works as expected
foobar
$ echo $foobar # foobar doesn't exist, so this doesn't

$

By the end of this article, you'll be able to use it for a whole lot more.

Three kinds of variable substitution are available for use: pattern matching, substitution and command substitution. I talk about the first two kinds here and leave command substitution for another time.

Pattern Matching

In pattern matching, you can match from the left or from the right. The operators, along with their functions and examples, are shown below:

Operator: ${foo#t*is}

Function: deletes the shortest possible match from the left

Example:

$ export foo="this is a test"
$ echo ${foo#t*is}
is a test
$

Operator: ${foo##t*is}

Function: deletes the longest possible match from the left

Example:

$ export foo="this is a test"
$ echo ${foo##t*is}
a test
$

Operator: ${foo%t*st}

Function: deletes the shortest possible match from the right

Example:

$ export foo="this is a test"
$ echo ${foo%t*st}
this is a
$

Operator: ${foo%%t*st}

Function: deletes the longest possible match from the right

Example:

$ export foo="this is a test"
$ echo ${foo%%t*st}

$
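
The four operators are handy for taking file paths apart. The following is a minimal sketch rather than anything from the original article; the path and the variable names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical path, used only for illustration.
path="/home/pat/articles/bash.strings.html"

file=${path##*/}    # longest match of "*/" from the left   -> bash.strings.html
dir=${path%/*}      # shortest match of "/*" from the right -> /home/pat/articles
ext=${file##*.}     # longest match of "*." from the left   -> html
stem=${file%.*}     # shortest match of ".*" from the right -> bash.strings
first=${file%%.*}   # longest match of ".*" from the right  -> bash

printf '%s\n' "$file" "$dir" "$ext" "$stem" "$first"
```

Note how the doubled form matters only when the pattern can match in more than one place, as with the two dots in the file name.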

Although the # and % identifiers may not seem obvious, they have a convenient mnemonic. The # key is on the left side of the $ key and operates from the left. The % key is on the right of the $ key and operates from the right. (This is true, at least, for US qwerty keyboards.)

The operators listed above can be used to do a variety of things. For example, the following script changes the extension of all .html files so they now are .htm files.

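#!/bin/bash
# quickly convert html filenames for use on a DOS-like system
# only handles file extensions, not file names
for i in *.html; do
   [ -e "$i" ] || continue   # no .html files: skip the unexpanded pattern
   if [ -f "${i%l}" ]; then
      echo "${i%l} already exists"
   else
      mv -- "$i" "${i%l}"    # quotes keep names with spaces intact
   fi
done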
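
The same idea can be generalized to any pair of extensions. This is only a sketch under assumptions: the function name change_ext and its two arguments are hypothetical, not part of the original script.

```shell
#!/bin/bash
# change_ext OLD NEW: rename every *.OLD file in the current directory to *.NEW.
# The function name and its arguments are hypothetical, chosen for illustration.
change_ext() {
    local old=$1 new=$2 f target
    for f in *."$old"; do
        [ -e "$f" ] || continue          # glob matched nothing; skip the literal pattern
        target=${f%."$old"}.$new         # strip the old extension, append the new one
        if [ -e "$target" ]; then
            echo "$target already exists" >&2
        else
            mv -- "$f" "$target"
        fi
    done
}
```

Calling change_ext html htm should then behave like the loop above, because ${f%."$old"} removes only the trailing extension and leaves the rest of the name alone.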

Substitution

Another kind of variable mangling you might want to employ is substitution. Four substitution operators are used in Bash, and they are shown below:

Operator: ${foo:-bar}

Function: If $foo exists and is not null, return $foo. If it doesn't exist or is null, return bar.

Example:

$ export foo=""
$ echo ${foo:-one}
one
$ echo $foo

$
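
A common use of this operator is an optional function argument. The sketch below assumes a hypothetical function named greet; nothing is assigned, so the default applies only at the point of expansion.

```shell
#!/bin/bash
# greet is a hypothetical function used only for illustration.
greet() {
    echo "Hello, ${1:-world}"    # use "world" when $1 is unset or empty
}

greet          # prints: Hello, world
greet ""       # prints: Hello, world
greet Pat      # prints: Hello, Pat
```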

Operator: ${foo:=bar}

Function: If $foo exists and is not null, return $foo. If it doesn't exist or is null, set $foo to bar and return bar.

Example:

$ export foo=""
$ echo ${foo:=one}
one
$ echo $foo
one
$
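
In scripts, this operator often is paired with the : (no-op) builtin so that the expansion happens without printing anything. A minimal sketch, assuming a hypothetical variable named workdir:

```shell
#!/bin/bash
# workdir and the paths below are hypothetical.
unset workdir
: "${workdir:=/tmp/scratch}"   # assigns, because workdir is unset
first=$workdir

workdir=/var/tmp
: "${workdir:=/tmp/scratch}"   # no effect: workdir already has a value
second=$workdir

echo "$first $second"          # prints: /tmp/scratch /var/tmp
```

Note that bash does not allow positional parameters such as $1 to be assigned this way, so ${1:-default} is the form to use for them.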

Operator: ${foo:+bar}

Function: If $foo exists and is not null, return bar. If it doesn't exist or is null, return null (an empty string).

Example:

$ export foo="this is a test"
$ echo ${foo:+bar}
bar
$
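
One way to put this operator to work is adding an option to a command only when a setting is present. The sketch below uses a hypothetical helper, build_ls_cmd, which only prints the command it would run.

```shell
#!/bin/bash
# build_ls_cmd is a hypothetical helper that only prints the command line.
build_ls_cmd() {
    local show_all=$1
    echo "ls${show_all:+ -a} /tmp"    # " -a" appears only when show_all is non-empty
}

build_ls_cmd ""      # prints: ls /tmp
build_ls_cmd yes     # prints: ls -a /tmp
```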

Operator: ${foo:?"error message"}

Function: If $foo exists and isn't null, return its value. If it doesn't exist or is null, print the error message. If no error message is given, it prints "parameter null or not set". In a non-interactive shell, this aborts the current script. In an interactive shell, this simply prints the error message.

Example:

$ export foo="one"
$ for i in foo bar baz; do
> eval echo \${$i:?}
> done
one
bash: bar: parameter null or not set
bash: baz: parameter null or not set
$
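
Because a failed check aborts a script, one option is to confine the check to a subshell. This is a sketch under assumptions: require_target is a hypothetical function, and its body is written in parentheses so that it runs in a subshell when the script is executed non-interactively.

```shell
#!/bin/bash
# require_target is a hypothetical helper. Its body is in parentheses, so it
# runs in a subshell and a failed check ends only that subshell.
require_target() (
    target=$1
    : "${target:?no target given}"
    echo "deploying to $target"
)

require_target staging
ok_status=$?
require_target "" 2>/dev/null
bad_status=$?
echo "ok=$ok_status bad=$bad_status"
```

The second call prints nothing on standard output and returns a non-zero status, which the caller can test instead of being terminated.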

The : in the above operators can be omitted. Doing so changes the behavior of the operator so that it tests only whether the variable exists; a variable that exists but is null then counts as having a value. With the = operator, this means a variable is created only if it did not exist before, for example:

$ export foo="this is a test"
$ echo $bar

$ echo ${foo=bar}
this is a test
$ echo ${bar=bar}
bar
$ echo $bar
bar
$          
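
The difference shows up only for a variable that exists but is empty. A small side-by-side sketch, with hypothetical variable names:

```shell
#!/bin/bash
# The variable names are hypothetical.
unset missing
empty=""

a=${missing-default}     # unset         -> default
b=${missing:-default}    # unset         -> default
c=${empty-default}       # set but empty -> (empty string)
d=${empty:-default}      # set but empty -> default

printf '[%s] [%s] [%s] [%s]\n' "$a" "$b" "$c" "$d"
# prints: [default] [default] [] [default]
```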

These operators can be used in a variety of ways. A good example is giving a default value to a variable that normally is read from the command-line arguments, for the case when no arguments are given. The following script demonstrates this:

#!/bin/bash 
export INFILE=${1-"infile"}
export OUTFILE=${2-"outfile"}
cat "$INFILE" > "$OUTFILE"

Copyright (c) 2005, 2000 by Pat Eyler. Originally published in Linux Gazette issue 57. Copyright (c) 2000, Specialized Systems Consultants, Inc. The material in this article may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later.

__________________________

--
-pate
http://on-ruby.blogspot.com


: or not to :

On January 8th, 2007 Ash (not verified) says:

Interesting, ":" can be omitted for "numeric" variables (script/function arguments).
baz=${foo:-bar}
vs.
baz=${1-bar}
At first I thought it was a typo, but it is not.

interesting, this will save a few seds and greps!

On January 29th, 2008 mangoo (not verified) says:

Interesting article, this will save me a few seds, greps and awks!

On September 3rd, 2007 MgBaMa req (not verified) says:

what if we are to operate on the param $1 $2, ...?
i mean is it feasible to see a result of ${4%/*}
to get a value, as in
$ export $1="this is a test/none"
$ echo ${$1/*}
> this is a test
$ echo ${$2#*/}
> none
?
thanks

variable contents confusing

On March 29th, 2006 Paul Archerr (not verified) says:

Minor typos notwithstanding, I had a bit of a problem with the values of the variables used. Assigning the value 'bar' to the variable bar makes it hard to figure out quickly which is which. (Is that 'bar' another variable? Or a value?)
I would suggest making the simple change of putting your values in uppercase. They would stand out and make the article more readable.
For example:
$ export foo=foo
$ echo ${foo}bar # foo exists so this works as expected
foobar
$ echo $foobar # foobar doesn't exist, so this doesn't
$

becomes

$ export foo=FOO
$ echo ${foo}bar # foo exists so this works as expected
FOObar
$ echo $foobar # foobar doesn't exist, so this doesn't
$

You can see how the 'FOObar' on the third line becomes differentiated from the 'foobar' on the fourth line.

foo, bar, +foobar: worst things for programming since Microsoft.

On September 10th, 2007 Anonymous (not verified) says:

Using foo and bar to try to inform someone is pretty much uniformly bad everywhere it's done, as foo and bar explicitly indicate things that don't have any meaning whatsoever. Foobar is even worse, since it is visibly only different from foo bar because of a single " " (space).

If you're trying to confuse the reader, use foo, bar, and especially foobar.

If you're trying to be informative, please, give your damn variables a short but logically useful name.

Part II?

On March 24th, 2006 Anonymous (not verified) says:

OK but there's a lot more to it than just this. How about some of the following?

$(var:pos[:len]) # extract substr from pos (0-based) for len

$(var/substr/repl) # replace first match
$(var//substr/repl) # replace all matches
$(var/#substr/repl) # replace if matches at beginning (non-greedy)
$(var/##substr/repl) # replace if matches at beginning (greedy)
$(var/%substr/repl) # replace if matches at end (non-greedy)
$(var/%%substr/repl) # replace if matches at end (greedy)

${#var} # returns length of $var
${!var} # indirect expansion

On March 24th, 2006 Anonymous (not verified) says:

...Sorry, those round parens should be curlies.
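
With curly braces in place, several of the forms listed above can be tried as follows. This is only a sketch, reusing the article's test string. It leaves out the /## and /%% forms, which the bash manual does not list; in bash, /# and /% anchor the pattern at the beginning or end of the value.

```shell
#!/bin/bash
var="this is a test"

sub=${var:5:2}                # substring from offset 5, length 2 -> is
one=${var/is/IS}              # replace first match               -> thIS is a test
all=${var//is/IS}             # replace every match               -> thIS IS a test
at_start=${var/#this/that}    # replace only at the beginning     -> that is a test
at_end=${var/%test/quiz}      # replace only at the end           -> this is a quiz
len=${#var}                   # length in characters              -> 14

name=var
indirect=${!name}             # value of the variable whose name is in $name

printf '%s\n' "$sub" "$one" "$all" "$at_start" "$at_end" "$len" "$indirect"
```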

Interesting

On March 12th, 2006 Stephanie (not verified) says:

I think the author did a great job explaining the article and am glad that I was able to learn from it and finally found something interesting to read online!

Examples in Table 1 are rubbish

On March 12th, 2006 Anonymous (not verified) says:

In addition to the incorrect $foo= (should be foo=), the last two examples don't illustrate the use of the construct they are supposed to. Pity the author did not proofread the first table.

Examples using same operator yet differing results?!

On March 10th, 2006 really-txtedmacs (not verified) says:

${foo#t*is} deletes the shortest possible match from the left export $foo="this is a test" echo ${foo#t*is} is a test

fine, but next in line is supposed to remove the maximum from the left, but uses the same exact operator, how does it get the correct result?

${foo##t*is} deletes the longest possible match from the left export $foo="this is a test" echo ${foo#t*is} a test

Gets worse when going from the right: the original operation from the right is employed. Moreover, on my system, an Ubuntu 05.10 desktop, this gave:

txtedmacs@phpserver:~$ export $foo="this is a test"
bash: export: `=this is a test': not a valid identifier

Take out the $foo, and it works fine.

Much easier to catch someone else's errors than one's own - I hate looking at my articles or emails.

Errors in article

On March 13th, 2006 Anonymous (not verified) says:

As really-txtedmacs tried to politely point out, there are errors in the Pattern Matching table - Example column, as of when he looked at it and as of now. Each instance of "export $foo" should be "export foo" in bash and most similar shells. Also, the operator in the echo command needs to match exactly the operator in the first column. Interestingly, some but not all of these errors still exist in the original article at http://linuxgazette.net/issue57/eyler.html, which is in issue 57, not 67.
Otherwise, a very good article. I will save the info in my bag of tricks.

You're channelling Larry Wall, dude!

On March 11th, 2006 Jim Dennis (not verified) says:

In Bourne shell and its ilk (like Korn shell and bash) the assignment syntax is: foo=... You only prefix a variable's name with $ when you're "dereferencing" it (expanding it into its value).

So the shell was parsing export $foo="this is a test" as:
export ???="this is a test" (where ??? is whatever "foo" was set to before this statement ... probably the empty string if the variable was previously unset).

I know this is confusing because Perl does it completely differently. In Perl the $ is a "sigil" which, on an "lvalue" (a variable name or other assignable token) tells the interpreter what "type" of assignment is occurring. Thus a Perl statement like: $foo="this is a test"; (note the required semicolon, too) is a "scalar" assignment. This also sets the context of the assignment. In Perl a scalar value in scalar context is conceptually the closest to a normal shell variable assignment. However, a list value in a scalar assignment context is a different beast entirely. So a line of Perl like perl -e '@bar=(1,2,3); $foo=@bar; print $foo ;' will set $foo to the number of items in the bar array. (Of course we could use @foo for the array name since they are different namespaces in Perl. But I wanted my example to be clear). So an array/list value in scalar context returns an integer (a type of scalar) which represents the number of elements in the list.

Anyway, just remember that the shell $ is more like the C programming * operator ... it dereferences the variable into its value.

JimD
The Linux Gazette "Answer Guy"

USA <> World :-)

On March 10th, 2006 peter.green says:

Although the # and % identifiers may not seem obvious, they have a convenient mnemonic. The # key is on the left side of the $ key and operates from the left.
In the USA, perhaps, but my UK keyboard has the # key nestling up against the Enter and right-Shift keys. Not to mention layouts such as Dvorak...!

Other (non-USA-specific?!) Mnemonics

On March 18th, 2006 Anonymous (not verified) says:

Another way to keep track is that we say "#1" and "1%", not "1#" and "%1". That is, unless you're using "#" to mean "pounds", in which case "1#" is correct, but it's antiquated at best in the USA, and presumably a nonissue for other countries that use metric...

C programmers are used to using "#" at the start of lines (#define, #include). LaTeX authors are used to "%" at the end of lines when writing macro definitions, as a comment to keep extraneous whitespace from creeping in--but "%" is comment to end-of-line so it's also likely to show up at the start of a line too...

Mnemonics

On March 21st, 2006 Island Joe (not verified) says:

Thanks for sharing those mnemonic insights, it's most helpful.
