Normalizing Path Names with Bash
The bash function presented here normalizes path names. By normalize I mean it removes unneeded /./ and dir/.. sequences. For example, ../d1/./d2/../f1 normalized would be ../d1/f1.
The first version of the function uses bash regular expressions. The /./ sequences are removed first during variable expansion with substitution by the line:
local path=${1//\/.\//\/}
The dir/.. sequences are removed by the loop:
while [[ $path =~ ([^/][^/]*/\.\./) ]]
do
    path=${path/${BASH_REMATCH[0]}/}
done
Each time a dir/.. match is found, variable expansion with substitution is used to remove the matched part of the path.
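To make the loop easier to follow, here is a small sketch that prints the path after every removal. The function name trace_normalize is made up for this illustration and is not part of the script. It also quotes the matched text in the substitution, so the match is removed literally rather than being treated as a glob pattern.

```bash
#!/bin/bash
# Hypothetical helper for illustration only: the same loop as above,
# but it echoes the intermediate path after each dir/.. removal.
trace_normalize()
{
    # Remove all /./ sequences first, as the real function does.
    local path=${1//\/.\//\/}
    echo "start: $path"
    while [[ $path =~ ([^/][^/]*/\.\./) ]]
    do
        # Quoting the match makes the substitution treat it as literal text.
        path=${path/"${BASH_REMATCH[0]}"/}
        echo "step:  $path"
    done
}

trace_normalize a/b/c/../d/../e
```

For a/b/c/../d/../e the trace shows the start value, then a/b/d/../e after the first pass, then a/b/e after the second, at which point the regular expression no longer matches and the loop ends.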
Regular expression matching with =~ was introduced in bash 3.0. Bash 3.2 changed how quotes are handled: a quoted pattern on the right-hand side of =~ is now matched as a literal string rather than as a regular expression. So, if you have an older version of bash with regular expression support (3.0 or 3.1) and the code doesn't work, put the regular expression in the while loop in quotes.
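One commonly suggested way to sidestep the quoting question is to keep the pattern in a variable and expand it unquoted. The sketch below assumes bash 3.2 or later for the second test; the variable names re and path are just illustrative choices.

```bash
#!/bin/bash
# Illustrative only: store the pattern in a variable (named re here).
re='[^/][^/]*/\.\./'
path='a/b/../c'

# Unquoted expansion: the contents of $re are used as a regular expression.
if [[ $path =~ $re ]]; then
    echo "matched: ${BASH_REMATCH[0]}"
fi

# Quoted expansion: on bash 3.2 and later this is a literal string match,
# so it fails here because the path does not contain the pattern text itself.
if [[ $path =~ "$re" ]]; then
    echo "literal match"
else
    echo "no literal match"
fi
```

The single quotes in the assignment only protect the pattern from the shell while it is being stored; they are not part of the value, so they don't affect the match.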
The entire function and some test code follow:
#!/bin/bash
#
# Usage: normalize_path PATH
#
# Remove /./ and dir/.. sequences from a pathname and write result to stdout.
function normalize_path()
{
    # Remove all /./ sequences.
    local path=${1//\/.\//\/}
    # Remove dir/.. sequences.
    while [[ $path =~ ([^/][^/]*/\.\./) ]]
    do
        path=${path/${BASH_REMATCH[0]}/}
    done
    # Quote the result so spaces and glob characters survive intact.
    echo "$path"
}
# Run the tests only when executed as normalize_path or normalize_path.sh.
if [[ $(basename "$0" .sh) == 'normalize_path' ]]; then
    if [[ "$*" ]]; then
        for p in "$@"
        do
            printf "%-30s => %s\n" "$p" "$(normalize_path "$p")"
        done
    else
        for p in /test/../test/file test/../test/file .././test/../test/file
        do
            printf "%-30s => %s\n" "$p" "$(normalize_path "$p")"
        done
    fi
fi
#####################################################################
# vim: tabstop=4: shiftwidth=4: noexpandtab:
# kate: tab-width 4; indent-width 4; replace-tabs false;
Since older versions of bash don't support regular expressions, the second version does the same thing using sed instead:
#!/bin/bash
#
# Usage: normalize_path PATH
#
# Remove /./ and dir/.. sequences from a pathname and write result to stdout.
function normalize_path()
{
    # Remove all /./ sequences.
    local path=${1//\/.\//\/}
    # Remove first dir/.. sequence.
    local npath=$(echo "$path" | sed -e 's;[^/][^/]*/\.\./;;')
    # Remove remaining dir/.. sequences, one per pass, until nothing changes.
    while [[ $npath != "$path" ]]
    do
        path=$npath
        npath=$(echo "$path" | sed -e 's;[^/][^/]*/\.\./;;')
    done
    echo "$path"
}
# Run the tests only when executed as normalize_path, with or without
# a .sh and/or .old suffix.
if [[ $(basename "$(basename "$0" .sh)" .old) == 'normalize_path' ]]; then
    if [[ "$*" ]]; then
        for p in "$@"
        do
            printf "%-30s => %s\n" "$p" "$(normalize_path "$p")"
        done
    else
        for p in /test/../test/file test/../test/file .././test/../test/file
        do
            printf "%-30s => %s\n" "$p" "$(normalize_path "$p")"
        done
    fi
fi
#####################################################################
# vim: tabstop=4: shiftwidth=4: noexpandtab:
# kate: tab-width 4; indent-width 4; replace-tabs false;
You can run the script directly and it runs a few tests:
$ bash normalize_path.sh
/test/../test/file             => /test/file
test/../test/file              => test/file
.././test/../test/file         => ../test/file
You can also pass in test cases on the command line:
$ bash normalize_path.sh ../d1/./d2/../f1 a/b/c/../d/../e
../d1/./d2/../f1               => ../d1/f1
a/b/c/../d/../e                => a/b/e
Normalized path names are never necessary but they're often easier to comprehend at a glance.
Mitch Frazier is an Associate Editor for Linux Journal.
Comments
Using POSIX index arrays
I wrote about the same thing on my website.
https://sites.google.com/site/jdisnard/realpath
This is similar to what "S. Chauveau" posted, except that this is pure POSIX code.
Also, since the shell has positional parameters, it does not require an array. Note that this code will naively dereference a symbolic link.
No bashisms; it will work in ksh or any POSIX-compliant shell.
real_path () {
    OIFS=$IFS
    IFS='/'
    for I in $1
    do
        # Resolve relative path punctuation.
        if [ "$I" = "." ] || [ -z "$I" ]
        then continue
        elif [ "$I" = ".." ]
        then FOO="${FOO%%/${FOO##*/}}"
            continue
        else FOO="${FOO}/${I}"
        fi
        # Dereference symbolic links.
        if [ -h "$FOO" ] && [ -x "/bin/ls" ]
        then IFS=$OIFS
            set `/bin/ls -l "$FOO"`
            while shift ;
            do
                if [ "$1" = "->" ]
                then FOO=$2
                    shift $#
                    break
                fi
            done
        fi
    done
    IFS=$OIFS
    echo "$FOO"
}
useful
You could also use parameter expansions for a portable solution, but I'm not sure how efficient that would be compared to a call to sed.
Absolute path
I have never had cause to normalize a path, but I have had cause to need an absolute path. It incidentally also seems to normalize the path.
Made in ksh, but seems to work in bash also. Not entirely shell script, as sed is a heavy component.
function AbsolutePath {
    ABSFILE="${1:-$0}"
    {
        if [[ "${ABSFILE#/}" = "${ABSFILE}" ]]; then
            echo ${PWD}/${ABSFILE}
        else
            echo ${ABSFILE}
        fi
    } | sed '
:a
s;/\./;/;g
s;//;/;g
s;/[^/][^/]*/\.\./;/;g
ta'
    unset ABSFILE
}
I'm quite certain that a better solution now exists in Linux (I mean... there must be...), but this was made 15/11-2002 on Tru64, and has worked since. Just be careful not to use an ABSFILE variable, as it will be unset at the end (not enough experience to localize it).
Too much code
I agree with previous poster ... I have never required this ... I rely on "readlink -f"
Arrays can also be used
Another way to do something like that in bash is by using arrays and substrings.
I won't post the complete code, but here is the idea:
# use this to split the path into parts using / as separator
OLD_IFS="$IFS" ; IFS="/" ; PARTS=($DIR) ; IFS="$OLD_IFS"
# Detect an absolute path by checking the 1st character
if [ "${DIR:0:1}" == "/" ] ; then PREFIX="/" ; else PREFIX="" ; fi
# Process the array according to your needs
nb=${#PARTS[*]}
i=0
for ((i=0;i<nb;i++)) ; do
    part="${PARTS[$i]}"
    if [ ... ] ; then
        unset PARTS[$i]
    fi
done
# and finally reconstruct the path
IFS='/' ; DIR="$PREFIX${PARTS[*]}" ; IFS="$OLD_IFS"
The advantage of that code over the 'sed' version is that it does not need to spawn any external process.
It is pure bash.
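A fuller sketch of the array idea might look like the following. This is one possible way to fill in the processing step, not necessarily what the poster had in mind; the function name normalize_path_array and all variable names are invented for the example, and it assumes bash 3.1 or later for the += array append. Unlike the regular expression in the article, which also matches when the "directory" in front of /../ is itself .. (so ../../f would be reduced to f), this version keeps leading .. components of a relative path.

```bash
#!/bin/bash
# Hypothetical sketch of an array-based normalizer (names are illustrative).
normalize_path_array()
{
    local dir=$1 prefix="" part result last
    local -a parts out=()
    # Remember whether the path is absolute.
    [[ $dir == /* ]] && prefix="/"
    # Split on "/" without changing the caller's IFS.
    IFS=/ read -r -a parts <<< "$dir"
    for part in "${parts[@]}"
    do
        case $part in
            ''|.)
                # Empty parts (from "//" or a leading "/") and "." add nothing.
                ;;
            ..)
                last=$(( ${#out[@]} - 1 ))
                if (( last >= 0 )) && [[ ${out[last]} != .. ]]; then
                    unset "out[$last]"      # cancel the previous directory
                elif [[ -z $prefix ]]; then
                    out+=("..")             # keep leading ".." in relative paths
                fi
                ;;
            *)
                out+=("$part")
                ;;
        esac
    done
    # Join the remaining parts with "/" (IFS is local to this function).
    local IFS=/
    result="$prefix${out[*]}"
    # An empty result means the path collapsed to the current directory.
    [[ -z $result ]] && result=.
    echo "$result"
}
```

With this sketch, normalize_path_array ../d1/./d2/../f1 gives ../d1/f1 and normalize_path_array ../../f stays ../../f. Building a second array instead of unsetting elements in place keeps the indices contiguous, which makes "the previous directory" easy to find.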
Simpler way
For existing paths it'd be simpler to use `cd "$path" && pwd` or `readlink -f "$path"`.
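As a minimal sketch of the cd-and-pwd approach, the wrapper below runs cd in a subshell so the caller's working directory is left alone. The name abs_dir is made up for the example, and it only works for directories that exist and that you are allowed to enter. It uses pwd -P so that symbolic links are resolved; use plain pwd if you would rather keep the logical path.

```bash
#!/bin/bash
# Hypothetical wrapper (abs_dir is an illustrative name).
# Prints the absolute, symlink-resolved form of an existing directory,
# or returns non-zero if the directory cannot be entered.
abs_dir()
{
    # The parentheses start a subshell, so the cd does not affect the caller.
    # cd's own output (possible when CDPATH is set) and errors are discarded.
    ( cd -- "$1" >/dev/null 2>&1 && pwd -P )
}
```

For example, abs_dir /usr/lib/../bin would print /usr/bin on a system where those directories exist and /usr/bin is not itself a symbolic link.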
Thank you for this suggestion
(cd $path && pwd ) is the suggestion I needed. Thanks!
I use that, but
I've used that a lot. Sadly, it won't normalize relative paths. :(
Nice but... I cannot
Nice but... I cannot remember when in the past 8 years of me using unix-based operating systems, the need for such a script arose. No offense, though. Probably a good code example for shell scripting and regular expressions, but kind of "solution searches for problem".
+1
My thoughts exactly. I've read the article, and then in the end asked myself - when did I ever need this in my 10+ years of Linux experience? Never! How can you even end up with 'bad' paths, if not by bad scripting? To me, this article is just another 'regex show off', which is sad - I almost never read other people's regular expressions, there's no point - I either write them from memory, or consult the man pages some more and then write them ;) To me, there's no point in reading regex - you won't understand it if you don't know the regex rules, but if you know the rules, you can write the regex yourself. Only when bug-hunting do I read "foreign" regex.