Bash Regular Expressions
May 26th, 2008 by Mitch Frazier
When working with regular expressions in a shell script the norm is to use grep or sed or some other external command/program. Since version 3 of bash (released in 2004) there is another option: bash's built-in regular expression comparison operator "=~".
Bash's regular expression comparison operator is used inside [[ ]] and takes a string on the left and an extended regular expression on the right. It returns 0 (success) if the regular expression matches the string, otherwise it returns 1 (failure). If the regular expression itself is syntactically invalid, it returns 2.
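As a minimal sketch of those return values, the following snippet prints the exit status of a match and of a non-match. The helper name match_status and the sample strings are assumptions made for illustration; they are not part of the script presented later. The pattern is held in a variable and expanded unquoted, since bash 3.2 and later treat a quoted right-hand side as a literal string rather than a regular expression.

```shell
#!/bin/bash
# match_status is a hypothetical helper used only for this illustration.
# It prints the exit status of the =~ test for a string ($1) and a pattern ($2).
match_status() {
    local str=$1 re=$2
    [[ $str =~ $re ]]    # $re is left unquoted so it is treated as a regex
    echo $?
}

match_status "aabbxcc" 'b{2,3}[xyz]'   # prints 0: the pattern matches part of the string
match_status "aabbcc"  'b{2,3}[xyz]'   # prints 1: no x, y or z follows the b's
```

Note that the match is not anchored: the pattern only has to match somewhere in the string unless ^ and $ are used.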
In addition to doing simple matching, bash regular expressions support sub-patterns surrounded by parentheses for capturing parts of the match. The matches are assigned to the array variable BASH_REMATCH. The entire match is assigned to BASH_REMATCH[0], the first sub-pattern is assigned to BASH_REMATCH[1], and so on.
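A short sketch of how the captures land in BASH_REMATCH, assuming a made-up date string and pattern (the function name split_date is hypothetical as well):

```shell
#!/bin/bash
# split_date is a hypothetical helper; the pattern and input are illustrative.
# It splits a YYYY-MM-DD string into its parts using three sub-patterns.
split_date() {
    local re='^([0-9]{4})-([0-9]{2})-([0-9]{2})$'
    if [[ $1 =~ $re ]]; then
        echo "whole match: ${BASH_REMATCH[0]}"
        echo "year=${BASH_REMATCH[1]} month=${BASH_REMATCH[2]} day=${BASH_REMATCH[3]}"
    else
        echo "no match: $1"
        return 1
    fi
}

split_date "2008-05-26"
# whole match: 2008-05-26
# year=2008 month=05 day=26
```

BASH_REMATCH is overwritten by every =~ test, so values that are needed later should be copied into other variables before the next match.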
The following example script takes a regular expression as its first argument and one or more strings to match against. It then cycles through the strings and outputs the results of the match process:
#!/bin/bash

# Require a pattern and at least one string to test.
if [[ $# -lt 2 ]]; then
    echo "Usage: $0 PATTERN STRINGS..."
    exit 1
fi
regex=$1
shift
echo "regex: $regex"
echo

# Test each remaining argument against the pattern.
while [[ $1 ]]
do
    if [[ $1 =~ $regex ]]; then
        echo "$1 matches"
        # Index 0 holds the entire match, so the captures start at 1.
        i=1
        n=${#BASH_REMATCH[*]}
        while [[ $i -lt $n ]]
        do
            echo " capture[$i]: ${BASH_REMATCH[$i]}"
            let i++
        done
    else
        echo "$1 does not match"
    fi
    shift
done
Assuming the script is saved in "bashre.sh", the following sample shows its output:
# bash bashre.sh 'aa(b{2,3}[xyz])cc' aabbxcc aabbcc
regex: aa(b{2,3}[xyz])cc

aabbxcc matches
 capture[1]: bbx
aabbcc does not match
__________________________
Mitch Frazier is an Associate Editor for Linux Journal and the Web Editor for linuxjournal.com.
bash-regular-expressions
On January 30th, 2009 kenb (not verified) says:
I wrote the suggested bash script and got the demonstrated result. However when I invoked:
sh bashre.sh 'aa(b{2,3}[xyz])cc' aabbxcc aabbcc aabbyccaabbbzcc
I expected to get two matches with the last parameter but I only got one. I'm surprised
that I'm the only one so what did I do wrong?
regex: aa(b{2,3}[xyz])cc
aabbxcc matches
capture[1]: bbx
aabbcc does not match
aabbyccaabbbzcc matches
capture[1]: bby
add color and better indentation to the output
On July 7th, 2008 Albert Bicchi (not verified) says:
#!/bin/sh
if [[ $# -lt 2 ]]; then
    echo "Usage: regex PATTERN STRINGS..."
    exit 1
fi
regex=$1
shift
echo "regex: $regex"
echo
while [[ $1 ]]
do
    if [[ $1 =~ $regex ]]; then
        echo -e "\t\E[42;37m${1} - matches\E[33;0m"
        i=1
        n=${#BASH_REMATCH[*]}
        while [[ $i -lt $n ]]
        do
            echo -e "\t\t\E[43;37mcapture[$i]: ${BASH_REMATCH[$i]}\E[33;0m"
            let i++
        done
    else
        echo -e "\t\E[41;37m${1} - does not match\E[33;0m"
    fi
    shift
done
Is "(( $# < 2 ))" an
On June 25th, 2008 Anonymous (not verified) says:
Is "(( $# < 2 ))" an alternative conditional expression for the line "[[ $# -lt 2 ]]"?
Could you discuss BASH expressions with [[]] (()) and their valid operators. It seems the -lt, -gt, -a,
etc, can be replaced with <, >, &&, etc, if used with (()) --- replacing [] with (()) (numeric) and [[]] (strings).
Thank you.
PS: The captcha is really hard to read. It would be nice if there was an option to generate a new one that could possibly be read by a mere human.
That simple?!
On May 28th, 2008 Robert de Bock (not verified) says:
My god, Bash sure is a great tool! Thanks for the information.