Variable Mangling in Bash with String Operators

Here's a quick and updated HOWTO for using string operators in bash to manipulate variables.

Editor's Note: This article has been updated by its author. Thank you, Pat.

Have you ever wanted to change the names of many files at once? Or, have you ever needed to use a default value for a variable that has no value? These and many other options are available to you when you use string operators in bash and other Bourne-derived shells.

String operators allow you to manipulate the contents of a variable without having to write your own shell functions to do so. They are provided through "curly brace" syntax. Any variable can be written as ${foo} without changing its meaning. This form often is used to protect a variable name from surrounding characters.

$ export foo=foo 
$ echo ${foo}bar # foo exists so this works as expected
foobar
$ echo $foobar # foobar doesn't exist, so this doesn't

$

By the end of this article, you'll be able to use the curly brace syntax for a whole lot more.

Three kinds of variable substitution are available for use: pattern matching, substitution and command substitution. I talk about the first two kinds here and leave command substitution for another time.

Pattern Matching

In pattern matching, you can match from the left or from the right. The operators, along with their functions and examples, are shown below:

Operator: ${foo#t*is}

Function: deletes the shortest possible match from the left

Example:

$ export foo="this is a test"
$ echo ${foo#t*is}
is a test
$

Operator: ${foo##t*is}

Function: deletes the longest possible match from the left

Example:

$ export foo="this is a test"
$ echo ${foo##t*is}
a test
$

Operator: ${foo%t*st}

Function: deletes the shortest possible match from the right

Example:

$ export foo="this is a test"
$ echo ${foo%t*st}
this is a
$

Operator: ${foo%%t*st}

Function: deletes the longest possible match from the right

Example:

$ export foo="this is a test"
$ echo ${foo%%t*st}

$

Although the # and % identifiers may not seem obvious, they have a convenient mnemonic. The # key is on the left side of the $ key and operates from the left. The % key is on the right of the $ key and operates from the right. (This is true, at least, for US qwerty keyboards.)
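
A common use of these four operators is taking a path apart without calling basename or dirname. The following is a minimal sketch; the path and the variable names are made up for illustration, and the only point is the difference between the shortest and the longest match.

```bash
# Hypothetical path, used only for illustration.
path="/home/pat/site/archive.tar.gz"

file=${path##*/}    # longest match of */ from the left: drop the directories
dir=${path%/*}      # shortest match of /* from the right: drop the file name
base=${file%%.*}    # longest match of .* from the right: drop every extension
ext=${file##*.}     # longest match of *. from the left: keep the last extension
exts=${file#*.}     # shortest match of *. from the left: keep all extensions

printf '%s\n' "$file" "$dir" "$base" "$ext" "$exts"
```

The five lines printed are archive.tar.gz, /home/pat/site, archive, gz and tar.gz.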

The operators listed above can be used to do a variety of things. For example, the following script changes the extension of all .html files so they now are .htm files.

#!/bin/bash
# quickly convert html filenames for use on a DOS-like system
# only handles file extensions, not file names
for i in *.html; do
   [ -e "$i" ] || continue   # the glob matched nothing
   if [ -f "${i%l}" ]; then
      echo "${i%l} already exists"
   else
      mv -- "$i" "${i%l}"
   fi
done
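
The same idea can be generalized to any pair of extensions. The sketch below is one possible way to do it, not part of the original script; the function name change_ext and its two arguments are assumptions made for illustration.

```bash
# change_ext OLD NEW: rename every *.OLD file in the current directory to *.NEW.
# The function name and its arguments are illustrative, not from the article.
change_ext() {
   local old=$1 new=$2 f target
   for f in *."$old"; do
      [ -e "$f" ] || continue          # the glob matched nothing
      target=${f%."$old"}.$new         # strip the old extension, append the new one
      if [ -e "$target" ]; then
         echo "$target already exists"
      else
         mv -- "$f" "$target"
      fi
   done
}
```

Calling change_ext html htm in a directory would then do the same job as the script above.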

Substitution

Another kind of variable mangling you might want to employ is substitution. Four substitution operators are used in Bash, and they are shown below:

Operator: ${foo:-bar}

Function: If $foo exists and is not null, return $foo. If it doesn't exist or is null, return bar.

Example:

$ export foo=""
$ echo ${foo:-one}
one
$ echo $foo

$

Operator: ${foo:=bar}

Function: If $foo exists and is not null, return $foo. If it doesn't exist or is null, set $foo to bar and return bar.

Example:

$ export foo=""
$ echo ${foo:=one}
one
$ echo $foo
one
$

Operator: ${foo:+bar}

Function: If $foo exists and is not null, return bar. If it doesn't exist or is null, return null (an empty string).

Example:

$ export foo="this is a test"
$ echo ${foo:+bar}
bar
$
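
One place where this operator is handy is adding a piece of text only when there is something to attach it to. The following is a minimal sketch; greet is a hypothetical function, not something from the article.

```bash
# greet is a hypothetical function: the ", NAME" part appears only when a
# non-empty name is passed.
greet() {
   local name=$1
   echo "Hello${name:+, $name}."
}

greet Pat    # prints: Hello, Pat.
greet        # prints: Hello.
```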

Operator: ${foo:?"error message"}

Function: If $foo exists and isn't null, return its value. If it doesn't exist or is null, print the error message. If no error message is given, it prints "parameter null or not set". In a non-interactive shell, this aborts the current script. In an interactive shell, this simply prints the error message.

Example:

$ export foo="one"
$ for i in foo bar baz; do
> eval echo \${$i:?}
> done
one
bash: bar: parameter null or not set
bash: baz: parameter null or not set
$
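
In a script, this operator is a compact way to make an argument mandatory. The sketch below assumes a hypothetical function called backup_dir; the failing call is wrapped in a subshell so that only the subshell is aborted.

```bash
# backup_dir is a hypothetical function; ${1:?...} makes its argument mandatory.
backup_dir() {
   local src=${1:?"usage: backup_dir DIRECTORY"}
   echo "would back up $src"
}

# Run the failing call in a subshell so that only the subshell is aborted.
( backup_dir ) 2>/dev/null || echo "backup_dir refused to run without an argument"
backup_dir /etc
```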

The : in the above operators can be omitted. Doing so changes the behavior of the operator so that it tests only whether the variable exists; a variable that is set but null then counts as existing. With the = operator, a variable that does not exist is created, for example:

$ export foo="this is a test"
$ echo $bar

$ echo ${foo=bar}
this is a test
$ echo ${bar=bar}
bar
$ echo $bar
bar
$          
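
The difference between the two forms shows up only for a variable that is set but null. The following sketch makes the three cases visible side by side; describe is a hypothetical helper and value a throwaway variable.

```bash
# describe is a hypothetical helper that shows what each form expands to.
describe() {
   echo "with colon: [${value:-DEFAULT}] without colon: [${value-DEFAULT}]"
}

unset value;   describe   # unset: both forms fall back to DEFAULT
value="";      describe   # null:  only the colon form falls back
value="text";  describe   # set:   neither form falls back
```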

These operators can be used in a variety of ways. A good example is giving a default value to a variable that normally is read from the command-line arguments, for the case when no arguments are given. The following script demonstrates this:

#!/bin/bash 
export INFILE=${1-"infile"}
export OUTFILE=${2-"outfile"}
cat "$INFILE" > "$OUTFILE"
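
A related idiom, shown here as a sketch rather than as part of the original script, combines the := operator with the : (no-op) command to give defaults to variables that may already come from the environment. The function name apply_defaults is an assumption made for illustration.

```bash
# apply_defaults is a hypothetical function wrapping the idiom.
apply_defaults() {
   : "${INFILE:=infile}"     # assign only if INFILE is unset or null
   : "${OUTFILE:=outfile}"   # likewise for OUTFILE
}

apply_defaults
echo "reading $INFILE, writing $OUTFILE"
```

Positional parameters such as $1 cannot be assigned this way, which is why the script above uses - rather than =.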

Copyright (c) 2005, 2000 by Pat Eyler. Originally published in Linux Gazette issue 57. Copyright (c) 2000, Specialized Systems Consultants, Inc. The material in this article may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later.

______________________

-- -pate http://on-ruby.blogspot.com

Comments

String operations the oo-way in bash

Anonymous's picture

Try oobash. It is an oo-style string library for bash 4. It has support for German umlauts. It is written in bash. Many functions are available: -base64Decode, -base64Encode, -capitalize, -center, -charAt, -concat, -contains, -count, -endsWith, -equals, -equalsIgnoreCase, -reverse, -hashCode, -indexOf, -isAlnum, -isAlpha, -isAscii, -isDigit, -isEmpty, -isHexDigit, -isLowerCase, -isSpace, -isPrintable, -isUpperCase, -isVisible, -lastIndexOf, -length, -matches, -replaceAll, -replaceFirst, -startsWith, -substring, -swapCase, -toLowerCase, -toString, -toUpperCase, -trim, and -zfill.

Look at the contains example:
Call the constructor:
[Desktop]$ String a testXccc
Now use object.method:

[Desktop]$ a.contains tX
true
[Desktop]$ a.contains XtX
false

http://sourceforge.net/projects/oobash/

bash string replacement with null character or new line character

Anonymous's picture

I'm having trouble with the string replacement function in bash. The problem is that I want to replace a printing character, in this case &, by a non-printing character, either new line or null in this case. I don't see how to specify the non-printing character in the string replacement function ${variable//a/b}.

I have a long, URL-encoded-like file name that I would like to parse with grep. I have used & as a delimiter between variables within the long file name. I would like to use the string replacement function in bash to search for all instances of & and replace each one with either the null character or the new line character since grep can recognize either one.

How do I specify a non-printing character in the bash string replacement function ?

Thank you.

Special Syntax

Mitch Frazier's picture

Use the $'\xNN' syntax for the non-printing character. Note though that a NULL character does not work:

$ cat j.sh

v="hello=yes&world=no"

v2=${v/&/$'\x0a'}
#        ^^^^^^^    change to newline
echo -n ">>$v2<<" | hexdump -C

v2=${v/&/$'\x00'}
#        ^^^^^^^    change to null (doesn't work)
echo -n ">>$v2<<" | hexdump -C

If you run this you can see that the substitution works for a newline but not for a NULL:

$ bash j.sh
00000000  3e 3e 68 65 6c 6c 6f 3d  79 65 73 0a 77 6f 72 6c  |>>hello=yes.worl|
00000010  64 3d 6e 6f 3c 3c                                 |d=no<<|
00000016
00000000  3e 3e 68 65 6c 6c 6f 3d  79 65 73 77 6f 72 6c 64  |>>hello=yesworld|
00000010  3d 6e 6f 3c 3c                                    |=no<<|
00000015
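
The question asked about replacing every & rather than only the first one, which calls for // instead of /. The sketch below assumes a made-up input string with two delimiters and also shows an alternative that splits the string straight into an array. A NUL byte cannot be stored in a bash variable at all, so a newline or an array is the practical choice.

```bash
v="hello=yes&world=no&lang=bash"     # hypothetical input with two delimiters

lines=${v//&/$'\n'}                   # // replaces every &, not just the first
printf '%s\n' "$lines"

IFS='&' read -r -a fields <<< "$v"    # alternative: split straight into an array
printf 'field: %s\n' "${fields[@]}"
```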

Mitch Frazier is an Associate Editor for Linux Journal.

Multiple operations?

Anonymous's picture

Very interesting article.
A question: is it possible to use several operators in the same expression, as in
${foo#t*is%t*st}, which uses both '#t*is' and '%t*st' and would give 'is a' in the example?
I tried some forms but they don't work... Does anyone have an idea?

Doesn't Work

Mitch Frazier's picture

You can't do multiple operations in one expression.

Mitch Frazier is an Associate Editor for Linux Journal.
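
The usual workaround is to apply the operators one after the other through an intermediate variable. A minimal sketch, using the variable from the article and a hypothetical temporary called tmp:

```bash
foo="this is a test"
tmp=${foo#t*is}       # first step: strip the shortest t*is match from the left
result=${tmp%t*st}    # second step: strip the shortest t*st match from the right
echo "[$result]"      # the brackets make the surrounding spaces visible
```

The result keeps the spaces on both sides of "is a"; an unquoted echo $result would hide them.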

Variable

First question's picture

I want to do something like this in a Linux bash script:
a1="Chris Alonso"
i="1"
echo $a$i # I'm only trying to write: echo $a1 using the variable i

Can someone help me, please?

Eval

Mitch Frazier's picture

Eval will do this for you but you may decide you really don't want to do this after seeing it:

  eval echo \$$(echo a$i)
or
  eval echo \$`echo a$i`

A slightly less complicated sequence would be something like:

  v=a$i
  eval echo \$$v

It looks like what you're trying to do here is simulate arrays. If that's the case then you'd be better off using bash's built-in arrays.

Mitch Frazier is an Associate Editor for Linux Journal.
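
For comparison, here is a minimal sketch of the array approach. The array name names and the second entry are made up for illustration; only the first value comes from the question.

```bash
# names is a hypothetical array standing in for the variables a1, a2, ...
names[1]="Chris Alonso"
names[2]="Another Name"
i=1
echo "${names[$i]}"     # prints: Chris Alonso
```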

How about v=a$i echo ${!v}

Anonymous's picture

How about

v=a$i
echo ${!v}

simplification of indirect reference

Anonymous's picture

Is there any way to rid the statements of the variable assignment? As in, make it so that:

echo ${!a$i}

works? I'm thinking that there has to be a way to escape the "a$i" inside the indirect reference construct. I have a case where I'm trying to do this with the result of a regex match, and am not able to figure out the right syntax:

SRC_FOLDER=/var/website
needleA_FOLDER=/var/www
needleB_FOLDER=/var/htdocs

for item in "${ARRAY[@]}"; do
    [[ "$item" =~ hay(needle)stack ]] && 
       DIR=${!${BASH_REMATCH[1]}_FOLDER};
    cp -R $SRC_FOLDER/* $DIR;
done;

But the seventh line (with the indirect reference) chokes with a "bad substitution" error. I should be able to do this on one line, without using eval with the right syntax, no?

Sincerely,
Tyler
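
The ${!name} form expects a variable name, not a nested expansion, so the name has to be built in a separate step first. The following is a sketch under stated assumptions: lookup_folder is a hypothetical helper, and the regex has been changed to needle[AB] so that the captured text can actually match the two variables above.

```bash
needleA_FOLDER=/var/www
needleB_FOLDER=/var/htdocs

# lookup_folder is a hypothetical helper; it prints the folder for one item,
# or fails if the item does not match or no such variable is set.
lookup_folder() {
   local item=$1 name
   [[ $item =~ hay(needle[AB])stack ]] || return 1
   name=${BASH_REMATCH[1]}_FOLDER     # build the variable name first
   [ -n "${!name}" ] || return 1
   echo "${!name}"                    # then dereference it
}

lookup_folder hayneedleAstack    # prints: /var/www
```

This still needs an assignment, but it avoids eval.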

Yes

Mitch Frazier's picture

That works and is simpler than my solution.

Mitch Frazier is an Associate Editor for Linux Journal.

: or not to :

Ash's picture

Interesting, ":" can be omitted for "numeric" variables (script/function arguments).
baz=${foo:-bar}
vs.
baz=${1-bar}
At first I thought it was a typo, but it is not.

interesting, this will save a few seds and greps!

mangoo's picture

Interesting article, this will save me a few seds, greps and awks!

what if we are to operate on

MgBaMa req's picture

what if we are to operate on the params $1, $2, ...?
I mean, is it feasible to see a result of ${4%/*}
to get a value as from
$ export $1="this is a test/none"
$ echo ${$1/*}
> this is a test
$ echo ${$2#*/}
> none
?
thanks
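
The operators work on positional parameters in the same way, written as ${1%/*} rather than ${$1/*}. Positional parameters are assigned with set -- or by passing arguments, not with export. A minimal sketch, in which split_arg is a hypothetical function name:

```bash
# split_arg is a hypothetical function; $1 is its first argument.
split_arg() {
   echo "${1%/*}"    # everything before the last slash
   echo "${1#*/}"    # everything after the first slash
}

split_arg "this is a test/none"
# prints:
# this is a test
# none
```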

variable contents confusing

Paul Archerr's picture

Minor typos notwithstanding, I had a bit of a problem with the values of the variables used. Assigning the value 'bar' to the variable bar makes it confusing to quickly figure out which is which. (Is that 'bar' another variable? Or a value?)
I would suggest making the simple change of putting your values in uppercase. They would stand out and make the article more readable.
For example:
$ export foo=foo
$ echo ${foo}bar # foo exists so this works as expected
foobar
$ echo $foobar # foobar doesn't exist, so this doesn't
$

becomes

$ export foo=FOO
$ echo ${foo}bar # foo exists so this works as expected
FOObar
$ echo $foobar # foobar doesn't exist, so this doesn't
$

You can see how the 'FOObar' on the third line becomes differentiated from the 'foobar' on the fourth line.

foo, bar, +foobar: worst things for programming since Microsoft.

Anonymous's picture

Using foo and bar to try to inform someone is pretty much uniformly bad everywhere it's done, as foo and bar explicitly indicate things that don't have any meaning whatsoever. Foobar is even worse, since it is visibly only different from foo bar because of a single " " (space).

If you're trying to confuse the reader, use foo, bar, and especially foobar.

If you're trying to be informative, please, give your damn variables a short but logically useful name.

Part II?

Anonymous's picture

OK but there's a lot more to it than just this. How about some of the following?

$(var:pos[:len]) # extract substr from pos (0-based) for len

$(var/substr/repl) # replace first match
$(var//substr/repl) # replace all matches
$(var/#substr/repl) # replace if matches at beginning (non-greedy)
$(var/##substr/repl) # replace if matches at beginning (greedy)
$(var/%substr/repl) # replace if matches at end (non-greedy)
$(var/%%substr/repl) # replace if matches at end (greedy)

${#var} # returns length of $var
${!var} # indirect expansion

...Sorry, those round parens

Anonymous's picture

...Sorry, those round parens should be curlies.
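
Written with curly braces, those forms can be tried out as in the sketch below, which uses a made-up sample value. As far as the bash manual describes it, the anchored replacements use a single # or %, and the pattern always takes the longest match, so the doubled ## and %% variants from the list are left out here.

```bash
var="one two one two"      # hypothetical sample value

echo "${var:4:3}"          # substring from offset 4 (0-based), length 3: two
echo "${var/one/1}"        # replace the first match:       1 two one two
echo "${var//one/1}"       # replace every match:           1 two 1 two
echo "${var/#one/1}"       # replace only at the beginning: 1 two one two
echo "${var/%two/2}"       # replace only at the end:       one two one 2
echo "${#var}"             # length of $var:                15
```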

Interesting

Stephanie's picture

I think the author did a great job explaining the article and am glad that I was able to learn from it and finally found something interesting to read online!

Examples in Table 1 are rubbish

Anonymous's picture

In addition to the incorrect $foo= (should be foo=), the last two examples don't illustrate the use of the construct they are supposed to. Pity the author did not proofread the first table.

Examples using same operator yet differing results?!

really-txtedmacs's picture

${foo#t*is} deletes the shortest possible match from the left export $foo="this is a test" echo ${foo#t*is} is a test

fine, but next in line is supposed to remove the maximum from the left, but uses the same exact operator, how does it get the correct result?

${foo##t*is} deletes the longest possible match from the left export $foo="this is a test" echo ${foo#t*is} a test

Gets worse when going from the right: the original operation from the right is employed. Moreover, on my system, an Ubuntu 05.10 desktop, this gave:

txtedmacs@phpserver:~$ export $foo="this is a test"
bash: export: `=this is a test': not a valid identifier

Take out the $foo, and it works fine.

Much easier to catch someone else's errors than one's own - I hate looking at my articles or emails.

Errors in article

Anonymous's picture

As really-txtedmacs tried to politely point out, there are errors in the Pattern Matching table - Example column, as of when he looked at it and as of now. Each instance of "export $foo" should be "export foo" in bash and most similar shells. Also, the operator in the echo command needs to match exactly the operator in the first column. Interestingly, some but not all of these errors still exist in the original article at http://linuxgazette.net/issue57/eyler.html, which is in issue 57, not 67.
Otherwise, a very good article. I will save the info in my bag of tricks.

You're channelling Larry Wall, dude!

Jim Dennis's picture

In Bourne shell and its ilk (like Korn shell and bash) the assignment syntax is: foo=... You only prefix a variable's name with $ when you're "dereferencing" it (expanding it into its value).

So the shell was parsing export $foo="this is a test" as:
export ???="this is a test" (where ??? is whatever "foo" was set to before this statement ... probably the empty string if the variable was previously unset).

I know this is confusing because Perl does it completely differently. In Perl the $ is a "sigil" which, on an "lvalue" (a variable name or other assignable token) tells the interpreter what "type" of assignment is occurring. Thus a Perl statement like: $foo="this is a test"; (note the required semicolon, too) is a "scalar" assignment. This also sets the context of the assignment. In Perl a scalar value in scalar context is conceptually the closest to a normal shell variable assignment. However, a list value in a scalar assignment context is a different beast entirely. So a line of Perl like perl -e '@bar=(1,2,3); $foo=@bar; print $foo ;' will set $foo to the number of items in the bar array. (Of course we could use @foo for the array name since they are different namespaces in Perl. But I wanted my example to be clear). So an array/list value in scalar context returns an integer (a type of scalar) which represents the number of elements in the list.

Anyway, just remember that the shell $ is more like the C programming * operator ... it dereferences the variable into its value.

JimD
The Linux Gazette "Answer Guy"
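
The effect described above can be observed directly when foo already has a value. The following sketch is deliberately wrong code, with made-up values, meant only to show which variable ends up being set.

```bash
unset bar
foo=bar
export $foo="this is a test"   # deliberately wrong: $foo expands to bar first
echo "foo=$foo"                # prints: foo=bar
echo "bar=$bar"                # prints: bar=this is a test
```

With foo unset, the same line reduces to export ="this is a test", which produces the "not a valid identifier" error quoted earlier.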

USA <> World :-)

peter.green's picture

Although the # and % identifiers may not seem obvious, they have a convenient mnemonic. The # key is on the left side of the $ key and operates from the left.
In the USA, perhaps, but my UK keyboard has the # key nestling up against the Enter and right-Shift keys. Not to mention layouts such as Dvorak...!

Other (non-USA-specific?!) Mnemonics

Anonymous's picture

Another way to keep track is that we say "#1" and "1%", not "1#" and "%1". That is, unless you're using "#" to mean "pounds", in which case "1#" is correct, but it's antiquated at best in the USA, and presumably a nonissue for other countries that use metric...

C programmers are used to using "#" at the start of lines (#define, #include). LaTeX authors are used to "%" at the end of lines when writing macro definitions, as a comment to keep extraneous whitespace from creeping in--but "%" is comment to end-of-line so it's also likely to show up at the start of a line too...

Mnemonics

Island Joe's picture

Thanks for sharing those mnemonic insights, it's most helpful.
