Variable Mangling in Bash with String Operators
March 13th, 2006 by Pat Eyler
Editor's Note: This article has been updated by its author. Thank you, Pat.
Have you ever wanted to change the names of many files at once? Or, have you ever needed to use a default value for a variable that has no value? These and many other options are available to you when you use string operators in bash and other Bourne-derived shells.
String operators allow you to manipulate the contents of a variable without having to write your own shell functions to do so. They are provided through "curly brace" syntax. Any variable can be displayed as ${foo} without changing its meaning. This functionality often is used to protect a variable name from surrounding characters.
$ export foo=foo
$ echo ${foo}bar # foo exists so this works as expected
foobar
$ echo $foobar # foobar doesn't exist, so this doesn't
$
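As a small sketch of why that protection matters in practice, consider building a filename from a variable. The names prefix, with_braces and without_braces are invented for this illustration only.

```shell
#!/bin/bash
# Hypothetical variable names, used only for illustration.
prefix="report"

# Braces mark where the variable name ends, so "_2006.txt" is literal text.
with_braces="${prefix}_2006.txt"

# Without braces the shell looks up a variable named prefix_2006,
# which is unset here, so only ".txt" survives.
without_braces="$prefix_2006.txt"

echo "$with_braces"      # report_2006.txt
echo "$without_braces"   # .txt
```

The underscore and digits are legal in a variable name, which is why the shell reads them as part of the name unless the braces say otherwise.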
By the end of this article, you'll be able to use it for a whole lot more.
Three kinds of variable substitution are available for use: pattern matching, substitution and command substitution. I talk about the first two kinds here and leave command substitution for another time.
In pattern matching, you can match from the left or from the right. The operators, along with their functions and examples, are shown below:
Operator: ${foo#t*is}
Function: deletes the shortest possible match from the left
Example:
$ export foo="this is a test"
$ echo ${foo#t*is}
is a test
$
Operator: ${foo##t*is}
Function: deletes the longest possible match from the left
Example:
$ export foo="this is a test"
$ echo ${foo##t*is}
a test
$
Operator: ${foo%t*st}
Function: deletes the shortest possible match from the right
Example:
$ export foo="this is a test"
$ echo ${foo%t*st}
this is a
$
Operator: ${foo%%t*st}
Function: deletes the longest possible match from the right
Example:
$ export foo="this is a test"
$ echo ${foo%%t*st}
$
Although the # and % identifiers may not seem obvious, they have a convenient mnemonic. The # key is on the left side of the $ key and operates from the left. The % key is on the right of the $ key and operates from the right. (This is true, at least, for US QWERTY keyboards.)
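One common use of the four pattern-matching operators is taking a path apart. The sketch below assumes a made-up path, and the variable names (path, dir, file, base, ext, last_ext) are chosen only for this example.

```shell
#!/bin/bash
# Hypothetical path used only for illustration.
path="/home/pat/archive.tar.gz"

file=${path##*/}       # longest match of "*/" from the left   -> archive.tar.gz
dir=${path%/*}         # shortest match of "/*" from the right -> /home/pat
base=${file%%.*}       # longest match of ".*" from the right  -> archive
ext=${file#*.}         # shortest match of "*." from the left  -> tar.gz
last_ext=${file##*.}   # longest match of "*." from the left   -> gz

printf '%s\n' "$dir" "$file" "$base" "$ext" "$last_ext"
```

The single and doubled forms differ only when the pattern can match more than one way, as with the two dots in this filename.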
The operators listed above can be used to do a variety of things. For example, the following script changes the extension of all .html files so they now are .htm files.
#!/bin/bash
# quickly convert html filenames for use on a dossy system
# only handles file extensions, not file names
for i in *.html; do
    # quote every expansion so filenames containing spaces survive
    if [ -f "${i%l}" ]; then
        echo "${i%l} already exists"
    else
        mv "$i" "${i%l}"
    fi
done
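The same idea can be generalized to any pair of extensions. What follows is a sketch rather than part of the original script: the function name change_ext and its two arguments are hypothetical, and it assumes the files live in the current directory.

```shell
#!/bin/bash
# change_ext OLD NEW: rename every *.OLD file in the current directory to *.NEW.
# The function name and its arguments are hypothetical, not from the article.
change_ext() {
    local old=$1 new=$2 f target
    for f in *."$old"; do
        [ -e "$f" ] || continue        # the glob matched nothing; skip the literal pattern
        target="${f%."$old"}.$new"     # strip ".OLD" from the right, then append ".NEW"
        if [ -e "$target" ]; then
            echo "$target already exists" >&2
        else
            mv -- "$f" "$target"
        fi
    done
}
```

Calling change_ext html htm should behave like the loop above, with the added guard for directories that contain no matching files.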
Another kind of variable mangling you might want to employ is substitution. Four substitution operators are used in Bash, and they are shown below:
Operator: ${foo:-bar}
Function: If $foo exists and is not null, return $foo. If it doesn't exist or is null, return bar.
Example:
$ export foo=""
$ echo ${foo:-one}
one
$ echo $foo
$
Operator: ${foo:=bar}
Function: If $foo exists and is not null, return $foo. If it doesn't exist or is null, set $foo to bar and return bar.
Example:
$ export foo=""
$ echo ${foo:=one}
one
$ echo $foo
one
$
Operator: ${foo:+bar}
Function: If $foo exists and is not null, return bar. If it doesn't exist or is null, return a null.
Example:
$ export foo="this is a test"
$ echo ${foo:+bar}
bar
$
Operator: ${foo:?"error message"}
Function: If $foo exists and isn't null, return its value. If it doesn't exist or is null, print the error message. If no error message is given, print "parameter null or not set". In a non-interactive shell, this aborts the current script. In an interactive shell, this simply prints the error message.
Example:
$ export foo="one"
$ for i in foo bar baz; do
> eval echo \${$i:?}
> done
one
bash: bar: parameter null or not set
bash: baz: parameter null or not set
$
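The four operators often appear together in scripts. Here is a minimal sketch, assuming a hypothetical function backup_target and made-up variables BACKUP_DIR and DRY_RUN; none of these names come from the article.

```shell
#!/bin/bash
# backup_target NAME: print where a backup named NAME would be written.
# The function and the BACKUP_DIR / DRY_RUN variables are hypothetical.
backup_target() {
    # := assigns the default when BACKUP_DIR is unset or empty.
    : "${BACKUP_DIR:=/tmp/backup}"
    # :? stops with the message when no name was passed.
    local name=${1:?"backup name required"}
    # :+ expands to the suffix only when DRY_RUN is set and not empty.
    echo "$BACKUP_DIR/$name${DRY_RUN:+ (dry run)}"
}
```

With neither variable set, backup_target site prints /tmp/backup/site. In a non-interactive shell, calling it with no argument ends the script, so a caller that wants to survive the error can run it in a subshell.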
The : in the above operators can be omitted. Without it, the operator tests only whether the variable is set, so a variable that exists but holds an empty string is treated like any other set variable. With the = operator, an unset variable is created as a side effect. For example:
$ export foo="this is a test"
$ echo $bar
$ echo ${foo=bar}
this is a test
$ echo ${bar=bar}
bar
$ echo $bar
bar
$
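The difference between the two forms shows up only when a variable is set but empty. The following sketch walks one variable, v, through all three states; the variable names are chosen just for this example.

```shell
#!/bin/bash
# Compare "-" (set test only) with ":-" (set and non-empty test).
unset v
unset_plain=${v-default}      # v is unset: both forms give "default"
unset_colon=${v:-default}

v=""
empty_plain=${v-default}      # v is set but empty: "-" keeps the empty value
empty_colon=${v:-default}     # ":-" falls back to "default"

v="value"
set_plain=${v-default}        # v has a value: both forms give "value"
set_colon=${v:-default}

printf 'unset: [%s] [%s]\n' "$unset_plain" "$unset_colon"
printf 'empty: [%s] [%s]\n' "$empty_plain" "$empty_colon"
printf 'set:   [%s] [%s]\n' "$set_plain" "$set_colon"
```

Only the middle line differs between the two columns: it prints [] for the form without the colon and [default] for the form with it.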
These operators can be used in a variety of ways. A good example is supplying a default value for a variable that normally is read from the command-line arguments, for use when no arguments are given. The following script demonstrates this:
#!/bin/bash
export INFILE=${1-"infile"}
export OUTFILE=${2-"outfile"}
cat "$INFILE" > "$OUTFILE"
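A variation on this script, offered only as a sketch: it wraps the same logic in a hypothetical function named copy_file and uses the colon form, so an argument passed as an empty string also falls back to the default.

```shell
#!/bin/bash
# copy_file [INFILE [OUTFILE]]: hypothetical function form of the script above.
copy_file() {
    local infile=${1:-infile}      # ":-" also replaces an empty-string argument
    local outfile=${2:-outfile}
    cat -- "$infile" > "$outfile"
}
```

With the original ${1-"infile"} form, calling the script with "" as its first argument would pass an empty filename to cat. Which behavior is preferable depends on whether an empty argument should count as "not given".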
Copyright (c) 2005, 2000 by Pat Eyler. Originally published in Linux Gazette issue 57. Copyright (c) 2000, Specialized Systems Consultants, Inc. The material in this article may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later.
--
-pate
http://on-ruby.blogspot.com
: or not to :
On January 8th, 2007 Ash (not verified) says:
Interesting, ":" can be omitted for "numeric" variables (script/function arguments).
baz=${foo:-bar}
vs.
baz=${1-bar}
First time I thought it is a typo, but it is not.
interesting, this will save a few seds and greps!
On January 29th, 2008 mangoo (not verified) says:
Interesting article, this will save me a few seds, greps and awks!
what if we are to operate on
On September 3rd, 2007 MgBaMa req (not verified) says:
what if we are to operate on the param $1 $2, ...?
i mean is it feasible to see a result of ${4%/*}
to get a value as from
$ export $1="this is a test/none"
$ echo ${$1/*}
> this is a test
$ echo ${$2#*/}
> none
?
thanks
variable contents confusing
On March 29th, 2006 Paul Archerr (not verified) says:
Minor typos notwithstanding, I had a bit of a problem with the values of the variables used. Assigning the value 'bar' to the variable bar makes it confusing to quickly figure out which is which. (Is that 'bar' another variable? Or a value?)
I would suggest making the simple change of putting your values in uppercase. They would stand out and make the article more readable.
For example:
$ export foo=foo
$ echo ${foo}bar # foo exists so this works as expected
foobar
$ echo $foobar # foobar doesn't exist, so this doesn't
$
becomes
$ export foo=FOO
$ echo ${foo}bar # foo exists so this works as expected
FOObar
$ echo $foobar # foobar doesn't exist, so this doesn't
$
You can see how the 'FOObar' on the third line becomes differentiated from the 'foobar' on the fourth line.
foo, bar, +foobar: worst things for programming since Microsoft.
On September 10th, 2007 Anonymous (not verified) says:
Using foo and bar to try to inform someone is pretty much uniformly bad everywhere it's done, as foo and bar explicitly indicate things that don't have any meaning whatsoever. Foobar is even worse, since it is visibly only different from foo bar because of a single " " (space).
If you're trying to confuse the reader, use foo, bar, and especially foobar.
If you're trying to be informative, please, give your damn variables a short but logically useful name.
Part II?
On March 24th, 2006 Anonymous (not verified) says:
OK but there's a lot more to it than just this. How about some of the following?
$(var:pos[:len]) # extract substr from pos (0-based) for len
$(var/substr/repl) # replace first match
$(var//substr/repl) # replace all matches
$(var/#substr/repl) # replace if matches at beginning (non-greedy)
$(var/##substr/repl) # replace if matches at beginning (greedy)
$(var/%substr/repl) # replace if matches at end (non-greedy)
$(var/%%substr/repl) # replace if matches at end (greedy)
${#var} # returns length of $var
${!var} # indirect expansion
...Sorry, those round parens
On March 24th, 2006 Anonymous (not verified) says:
...Sorry, those round parens should be curlies.
Interesting
On March 12th, 2006 Stephanie (not verified) says:
I think the author did a great job explaining the article and am glad that I was able to learn from it and finally found something interesting to read online!
Examples in Table 1 are rubbish
On March 12th, 2006 Anonymous (not verified) says:
In addition to the incorrect $foo= (should be foo=), the last two examples don't illustrate the use of the construct they are supposed to. Pity the author did not proof-read the first table.
Examples using same operator yet differing results?!
On March 10th, 2006 really-txtedmacs (not verified) says:
${foo#t*is} deletes the shortest possible match from the left
export $foo="this is a test"
echo ${foo#t*is}
is a test
Fine, but the next one in line is supposed to remove the maximum from the left, yet it uses the exact same operator. How does it get the correct result?
${foo##t*is} deletes the longest possible match from the left
export $foo="this is a test"
echo ${foo#t*is}
a test
It gets worse when going from the right: the original operation from the right is employed. Moreover, on my system, an Ubuntu 05.10 desktop, this gave:
txtedmacs@phpserver:~$ export $foo="this is a test"
bash: export: `=this is a test': not a valid identifier
Take out the $foo, and it works fine.
Much easier to catch someone else's errors than one's own - I hate looking at my articles or emails.
Errors in article
On March 13th, 2006 Anonymous (not verified) says:
As really-txtedmacs tried to politely point out, there are errors in the Pattern Matching table - Example column, as of when he looked at it and as of now. Each instance of "export $foo" should be "export foo" in bash and most similar shells. Also, the operator in the echo command needs to match exactly the operator in the first column. Interestingly, some but not all of these errors still exist in the original article at http://linuxgazette.net/issue57/eyler.html, which is in issue 57, not 67.
Otherwise, a very good article. I will save the info in my bag of tricks.
You're channelling Larry Wall, dude!
On March 11th, 2006 Jim Dennis (not verified) says:
In Bourne shell and its ilk (like Korn shell and bash) the assignment syntax is:
foo=...
You only prefix a variable's name with $ when you're "dereferencing" it (expanding it into its value). So the shell was parsing
export $foo="this is a test"
as:
export ???="this is a test"
(where ??? is whatever "foo" was set to before this statement ... probably the empty string if the variable was previously unset).
I know this is confusing because Perl does it completely differently. In Perl the $ is a "sigil" which, on an "lvalue" (a variable name or other assignable token), tells the interpreter what "type" of assignment is occurring. Thus a Perl statement like:
$foo="this is a test";
(note the required semicolon, too) is a "scalar" assignment. This also sets the context of the assignment. In Perl a scalar value in scalar context is conceptually the closest to a normal shell variable assignment. However, a list value in a scalar assignment context is a different beast entirely. So a line of Perl like
perl -e '@bar=(1,2,3); $foo=@bar; print $foo ;'
will set $foo to the number of items in the bar array. (Of course we could use @foo for the array name since they are different namespaces in Perl. But I wanted my example to be clear.) So an array/list value in scalar context returns an integer (a type of scalar) which represents the number of elements in the list.
Anyway, just remember that the shell $ is more like the C programming * operator ... it dereferences the variable into its value.
JimD
The Linux Gazette "Answer Guy"
USA <> World :-)
On March 10th, 2006 peter.green says:
Although the # and % identifiers may not seem obvious, they have a convenient mnemonic. The # key is on the left side of the $ key and operates from the left.
In the USA, perhaps, but my UK keyboard has the # key nestling up against the Enter and right-Shift keys. Not to mention layouts such as Dvorak...!
Other (non-USA-specific?!) Mnemonics
On March 18th, 2006 Anonymous (not verified) says:
Another way to keep track is that we say "#1" and "1%", not "1#" and "%1". That is, unless you're using "#" to mean "pounds", in which case "1#" is correct, but it's antiquated at best in the USA, and presumably a nonissue for other countries that use metric...
C programmers are used to using "#" at the start of lines (#define, #include). LaTeX authors are used to "%" at the end of lines when writing macro definitions, as a comment to keep extraneous whitespace from creeping in--but "%" is comment to end-of-line so it's also likely to show up at the start of a line too...
Mnemonics
On March 21st, 2006 Island Joe (not verified) says:
Thanks for sharing those mnemonic insights, it's most helpful.