Bash Associative Arrays

The bash man page has long listed the following bug at its very bottom: "It's too big and too slow". If you agree with that, then you probably won't want to read about the "new" associative arrays that were added in version 4.0 of bash. On the other hand, if you've ever used a modern office suite, seen code bloat at its finest, and think the bash folks are exaggerating a bit, then read on.

There's nothing too surprising about associative arrays in bash; they work as you'd probably expect:

declare -A aa
aa[hello]=world
aa[ab]=cd

The -A option declares aa to be an associative array. Assignments are then made by putting the "key" inside the square brackets rather than an array index. You can also assign multiple items at once:

declare -A aa
aa=([hello]=world [ab]=cd)

Retrieving values is also as expected:

if [[ ${aa[hello]} == world ]]; then
    echo equal
fi
bb=${aa[hello]}
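
A few other lookups come up often enough to be worth a quick sketch. The variable names count, missing and has_hello below are just placeholders for illustration, and the script assumes bash 4.0 or later:

```shell
#!/usr/bin/env bash
# Requires bash 4.0 or later for declare -A.
declare -A aa=([hello]=world [ab]=cd)

# Number of key/value pairs stored in the array.
count=${#aa[@]}

# A missing key expands to the empty string; ${var-default} supplies a fallback.
missing=${aa[nokey]-"(unset)"}

# ${aa[key]+x} expands to x only if the key exists, even when its value is empty.
if [[ -n ${aa[hello]+x} ]]; then
    has_hello=yes
else
    has_hello=no
fi

echo "count=$count missing=$missing has_hello=$has_hello"
```

The ${aa[key]+x} test is used here because it distinguishes a key that is absent from a key whose value happens to be empty.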

You can also use keys that contain spaces or other "strange" characters:

aa["hello world"]="from bash"
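
When keys can contain spaces, quoting matters on the way back out as well. The following is a minimal sketch (the key "plain" and the variables nkeys and greeting are made up for the example) of walking the keys safely:

```shell
#!/usr/bin/env bash
declare -A aa
aa["hello world"]="from bash"
aa[plain]="no spaces"

# "${!aa[@]}" expands to the keys, one word per key; the double quotes
# keep "hello world" together as a single word. Key order is not guaranteed.
nkeys=0
for k in "${!aa[@]}"; do
    printf '%s => %s\n' "$k" "${aa[$k]}"
    nkeys=$((nkeys + 1))
done

greeting=${aa["hello world"]}
```

Without the double quotes around ${!aa[@]}, the key "hello world" would be split into two words and the loop would look up keys that don't exist.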

Note, however, that there appears to be a bug (seen here with bash 4.0.35) when assigning more than one item to an array with a parenthesized list if any of the keys contain spaces. For example, consider the following script:

declare -A b
b=([hello]=world ["a b"]="c d")

for i in 1 2
do
    if [[ ${b["a b"]} == "c d" ]]; then
        echo $i: equals c d
    else
        echo $i: does not equal c d
    fi
    b["a b"]="c d"
done

At the top, b["a b"] is assigned a value as part of a parenthesized list of items. Inside the loop, the if statement tests whether the item is what we expect it to be. At the bottom of the loop, the same value is assigned to the same key, but using a "direct" assignment. Then the loop executes one more time. One would expect the if test to succeed both times, but it does not:

$ bash ba.sh
1: does not equal c d
2: equals c d

You can see the problem if you add the following to the end of the script to print out all the keys:

for k in "${!b[@]}"
do
    echo "$k"
done

The result you get is:

$ bash ba.sh
1: does not equal c d
2: equals c d
a\ b
a b
hello

You can see here that the first assignment, the one done via the list, incorrectly adds the key as a\ b rather than simply as a b.
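
Since the direct assignment behaves correctly in the script above, one possible workaround on an affected version is to keep keys with spaces out of the list form entirely. This is only a sketch of that idea (the result variable is a placeholder), not an official fix:

```shell
#!/usr/bin/env bash
declare -A b
b=([hello]=world)     # keys without spaces are fine in the list form
b["a b"]="c d"        # keys with spaces are assigned directly instead

if [[ ${b["a b"]} == "c d" ]]; then
    result="equals c d"
else
    result="does not equal c d"
fi
echo "$result"
echo "keys: ${#b[@]}"
```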

Before ending, I want to point out another feature that I just recently discovered about bash arrays: the ability to extend them with the += operator. This is actually the thing that led me to the man page, which then allowed me to discover the associative array feature. This is not a new feature, just new to me:

aa=(hello world)
aa+=(b c d)

After the += assignment the array will now contain 5 items, the values after the += having been appended to the end of the array. This also works with associative arrays.

declare -A aa         # needed: without -A, [hello] and [b] both evaluate to index 0
aa=([hello]=world)
aa+=([b]=c)           # aa now contains 2 items
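
To check the counts for yourself, here is a small sketch that uses separate, made-up names (list and map) so the indexed and associative cases don't collide in one script:

```shell
#!/usr/bin/env bash
# Indexed array: += appends new elements after the last index.
list=(hello world)
list+=(b c d)
list_count=${#list[@]}
last=${list[4]}

# Associative array: += adds new keys; a key that already exists
# has its value replaced rather than duplicated.
declare -A map=([hello]=world)
map+=([b]=c [hello]=there)
map_count=${#map[@]}

echo "list has $list_count items, last is $last"
echo "map has $map_count items, hello => ${map[hello]}"
```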

Note that the += operator also works with regular variables, appending to the end of the current value:

aa="hello"
aa+=" world"          # aa is now "hello world"
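
One thing to watch for: "appends" means string concatenation unless the variable has the integer attribute, in which case += does arithmetic. A short sketch, with the variable names s, n and i chosen just for illustration:

```shell
#!/usr/bin/env bash
s="hello"
s+=" world"          # string append

n=5
n+=3                 # still a string append, even though both look numeric

declare -i i=5
i+=3                 # integer attribute set, so this is arithmetic addition

echo "s=$s n=$n i=$i"
```

Here n ends up as the string 53, while i ends up as 8.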

For more on using bash arrays look at the man page or check out my earlier post.

______________________

Mitch Frazier is an Associate Editor for Linux Journal.

Comments

I like the bash man page

rikijpn's picture

I wouldn't hate an info page for bash (maybe having some examples too), but I certainly like its man page. It makes it very portable (a lot better than info pages probably), and easy to check on when you forget something, and I find the size rather "proper" for such a full-featured shell.
I do like this new feature, and think it gives bash a wider range of uses and users.

It sure seems as though...

Joel Koltner's picture

...a full-on scripting/programming language such as Python, Perl, Ruby, etc. would be preferable in many cases by the time your program gets to the degree of complexity that using associative arrays becomes a good idea?

I suppose the appeal of bash is that it's ubiquitous -- even on embedded Linux systems, bash (albeit often in the form of busybox) will be available, but who knows which of the above-mentioned languages will be?

I just find bash difficult (meaning slow) to program/debug in compared to those other languages. :-)

---Joel

bug fixed

Anonymous's picture

The bug you mention has been fixed in or before bash 4.1.002-2

Thanks

Mitch Frazier's picture

I have 4.0.35. I sent in a bug report but I don't think it ever arrived since I never saw it on the bug mailing list.


Nice post

Gal Frishman's picture

If anyone is into bash scripting, I wrote about bash self-reproducing code:
http://frishit.wordpress.com/2010/04/26/paradoxes-self-reproducing-code-and-bash

--
My blog: http://frishit.wordpress.com

Reminds me of my el days

Doug.Roberts's picture

Emacs LISP is so full-featured that I used to use it to do all kinds of programming tasks that I now do in Perl. I can see how it would be easy to get sucked in to using Bash's rich feature set for smallish yet complicated tasks.

One measure of Bash's feature set is the length of the man page: 5,368 lines!
