Bash Associative Arrays

The bash man page has long listed the following under BUGS, at the very bottom of the page: "It's too big and too slow". If you agree with that, then you probably won't want to read about the "new" associative arrays that were added in version 4.0 of bash. On the other hand, if you've ever used a modern office suite, seen code bloat at its finest, and think the bash folks are exaggerating a bit, then read on.

There's nothing too surprising about associative arrays in bash. They work as you probably expect:

declare -A aa
aa[hello]=world
aa[ab]=cd

The -A option declares aa to be an associative array. Assignments are then made by putting the "key" inside the square brackets rather than an array index. You can also assign multiple items at once:

declare -A aa
aa=([hello]=world [ab]=cd)

Retrieving values is also as expected:

if [[ ${aa[hello]} == world ]]; then
    echo equal
fi
bb=${aa[hello]}
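
A few related lookups are worth having on hand. The following is only a minimal sketch: the variable names (count, missing, has_hello) and the key nokey are made up for illustration, and it assumes bash 4.0 or later.

```shell
#!/usr/bin/env bash
# Sketch only: requires bash 4.0+ for declare -A.
declare -A aa=([hello]=world [ab]=cd)

# Number of key/value pairs stored in the array.
count=${#aa[@]}

# A missing key expands to the empty string; :- supplies a fallback value.
missing=${aa[nokey]:-none}

# ${aa[key]+set} expands to "set" only when the key exists, which
# distinguishes "absent" from "present but empty".
if [[ -n ${aa[hello]+set} ]]; then
    has_hello=yes
else
    has_hello=no
fi

echo "count=$count missing=$missing has_hello=$has_hello"
# prints: count=2 missing=none has_hello=yes
```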

You can also use keys that contain spaces or other "strange" characters:

aa["hello world"]="from bash"
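
Keys like this are where quoting starts to matter when you loop over the array. Here is a small sketch of the difference; the second key (plain) and the counter names are just for illustration, and the order in which keys come back is unspecified.

```shell
#!/usr/bin/env bash
# Sketch only: requires bash 4.0+ for declare -A.
declare -A aa
aa["hello world"]="from bash"
aa[plain]=value

# Quoting "${!aa[@]}" keeps a key that contains spaces as a single word.
for key in "${!aa[@]}"; do
    printf '%s => %s\n' "$key" "${aa[$key]}"
done

# Count the words produced by the quoted and unquoted forms.
quoted=0
for key in "${!aa[@]}"; do
    quoted=$((quoted + 1))
done

unquoted=0
for key in ${!aa[@]}; do      # unquoted: "hello world" splits in two
    unquoted=$((unquoted + 1))
done

echo "quoted=$quoted unquoted=$unquoted"
# prints: quoted=2 unquoted=3
```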

Note, however, that there appears to be a bug (seen here with bash 4.0.35) when a parenthesis-enclosed list assigns more than one item to an array and any of the keys have spaces in them. For example, consider the following script:

declare -A b
b=([hello]=world ["a b"]="c d")

for i in 1 2
do
    if [[ ${b["a b"]} == "c d" ]]; then
        echo $i: equals c d
    else
        echo $i: does not equal c d
    fi
    b["a b"]="c d"
done

At the top, b["a b"] is assigned a value as part of a parenthesis-enclosed list of items. Inside the loop, the if statement tests whether the item is what we expect it to be. At the bottom of the loop, the same value is assigned to the same key, this time using a "direct" assignment. Then the loop executes one more time. One would expect the if test to succeed both times, but it does not:

$ bash ba.sh
1: does not equal c d
2: equals c d

You can see the problem if you add the following to the end of the script to print out all the keys:

for k in "${!b[@]}"
do
    echo "$k"
done

The result you get is:

$ bash ba.sh
1: does not equal c d
2: equals c d
a\ b
a b
hello

You can see here that the first assignment, the one done via the list, incorrectly adds the key as a\ b rather than simply as a b.
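
If a script has to run on machines with different bash versions, one option is to probe for the problem at startup. The sketch below is one way to do that, not a definitive test: the function name compound_keys_ok and the result messages are hypothetical, and the fallback it suggests is simply the direct assignment that worked in the script above.

```shell
#!/usr/bin/env bash
# Sketch only: requires bash 4.0+ for associative arrays.

# Hypothetical helper: returns 0 if a key containing a space, assigned
# through a parenthesis-enclosed list, can be read back with the same key.
compound_keys_ok() {
    local -A t
    t=([hello]=world ["a b"]="c d")
    [[ ${t["a b"]} == "c d" ]]
}

if compound_keys_ok; then
    result="list assignment stores the key correctly"
else
    result="affected: assign keys containing spaces one at a time"
fi

echo "bash $BASH_VERSION: $result"
```

On an affected bash the function returns non-zero, and the script can fall back to assigning such keys one at a time.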

Before ending, I want to point out another feature of bash arrays that I discovered only recently: the ability to extend them with the += operator. This is actually what led me to the man page, where I then discovered the associative array feature. It is not a new feature, just new to me:

aa=(hello world)
aa+=(b c d)

After the += assignment the array will now contain 5 items, the values after the += having been appended to the end of the array. This also works with associative arrays.

declare -A aa         # aa must be declared associative first
aa=([hello]=world)
aa+=([b]=c)           # aa now contains 2 items
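
To see the two cases side by side, here is a short sketch. The names list, map, and the size variables are invented for the example; it also shows that a plain = with a list replaces the contents instead of appending.

```shell
#!/usr/bin/env bash
# Sketch only: requires bash 4.0+ for declare -A.

# Indexed array: += appends after the highest existing index.
list=(hello world)
list+=(b c d)
indices="${!list[*]}"          # the indices now in use: "0 1 2 3 4"

# Associative array: += adds further key/value pairs.
declare -A map=([hello]=world)
map+=([b]=c [d]=e)
size_after_append=${#map[@]}   # 3

# By contrast, plain = with a list replaces the whole array.
map=([only]=one)
size_after_replace=${#map[@]}  # 1

echo "list: ${#list[@]} items, indices $indices"
echo "map: $size_after_append after +=, $size_after_replace after ="
```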

Note that the += operator also works with regular variables, where it appends to the end of the current value.

aa="hello"
aa+=" world"          # aa is now "hello world"
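
One wrinkle that goes slightly beyond what I covered above: the bash manual says that when a variable has the integer attribute, += adds arithmetically instead of appending. A small sketch, with variable names chosen only for illustration:

```shell
#!/usr/bin/env bash
# Sketch only: contrasts string append with integer addition.

greeting="hello"
greeting+=" world"     # string append: "hello world"

digits=5
digits+=3              # no integer attribute, so still a string append: "53"

declare -i total=5     # -i gives the variable the integer attribute
total+=3               # arithmetic addition: 8

echo "$greeting / $digits / $total"
# prints: hello world / 53 / 8
```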

For more on using bash arrays look at the man page or check out my earlier post.

______________________

Mitch Frazier is an Associate Editor for Linux Journal.

Comments

I like the bash man page

rikijpn

I wouldn't hate an info page for bash (maybe having some examples too), but I certainly like its man page. It makes it very portable (a lot better than info pages probably), and easy to check on when you forget something, and I find the size rather "proper" for such a full-featured shell.
I do like this new feature, and think it gives bash a wider range of uses and users.

It sure seems as though...

Joel Koltner

...a full-on scripting/programming language such as Python, Perl, Ruby, etc. would be preferable in many cases by the time your program gets to the degree of complexity that using associative arrays becomes a good idea?

I suppose the appeal of bash is that it's ubiquitous -- even on embedded Linux systems, bash (albeit often in the form of busybox) will be available, but who knows which of the above-mentioned languages will be?

I just find bash difficult (meaning slow) to program/debug in compared to those other languages. :-)

---Joel

bug fixed

Anonymous

The bug you mention has been fixed in or before bash 4.1.002-2

Thanks

Mitch Frazier

I have 4.0.35. I sent in a bug report but I don't think it ever arrived since I never saw it on the bug mailing list.


Nice post

Gal Frishman

If anyone is into bash scripting, I wrote about bash self-reproducing code:
http://frishit.wordpress.com/2010/04/26/paradoxes-self-reproducing-code-and-bash

--
My blog: http://frishit.wordpress.com

Reminds me of my el days

Doug.Roberts

Emacs LISP is so full-featured that I used to use it to do all kinds of programming tasks that I now do in Perl. I can see how it would be easy to get sucked into using Bash's rich feature set for smallish yet complicated tasks.

One measure of Bash's feature set is the length of the man page: 5,368 lines!
