Normalizing Path Names with Bash

The bash function presented here normalizes path names. By normalize I mean it removes unneeded /./ and dir/.. sequences. For example, ../d1/./d2/../f1 normalized becomes ../d1/f1.

The first version of the function uses bash regular expressions. The /./ sequences are removed first during variable expansion with substitution by the line:

local   path=${1//\/.\//\/}
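
Applied to the example path from the introduction, the substitution collapses every /./ into a single slash (a quick illustration, not part of the final script):

p='../d1/./d2/../f1'
echo "${p//\/.\//\/}"     # prints ../d1/d2/../f1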

The dir/.. sequences are removed by the loop:

while [[ $path =~ ([^/][^/]*/\.\./) ]]
do
    path=${path/${BASH_REMATCH[0]}/}
done

Each time a dir/.. match is found, variable expansion with substitution is used to remove the matched part of the path.
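
Continuing that example, the loop needs only one pass (again, just an illustration):

p='../d1/d2/../f1'
[[ $p =~ ([^/][^/]*/\.\./) ]]
echo "${BASH_REMATCH[0]}"          # prints d2/../
echo "${p/${BASH_REMATCH[0]}/}"    # prints ../d1/f1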

Regular expressions were introduced in bash 3.0. Bash 3.2 changed regular expression handling slightly: a quoted pattern on the right-hand side of =~ is now matched as a literal string rather than as a regular expression. So, if you have bash 3.0 or 3.1 and the code doesn't work as written, put the regular expression in the while loop in quotes.
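
On bash 3.0 or 3.1, the quoted form of the test would look something like this (a sketch; untested on those old releases):

while [[ $path =~ '([^/][^/]*/\.\./)' ]]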

The entire function, along with some test code, follows:

#!/bin/bash
#
# Usage: normalize_path PATH
#
# Remove /./ and dir/.. sequences from a pathname and write result to stdout.

function normalize_path()
{
    # Remove all /./ sequences.
    local   path=${1//\/.\//\/}
    
    # Remove dir/.. sequences.
    while [[ $path =~ ([^/][^/]*/\.\./) ]]
    do
        path=${path/${BASH_REMATCH[0]}/}
    done
    echo "$path"
}

if [[ $(basename "$0" .sh) == 'normalize_path' ]]; then
    if [[ "$*" ]]; then
        for p in "$@"
        do
            printf "%-30s => %s\n" $p $(normalize_path $p)
        done
    else
        for p in /test/../test/file test/../test/file .././test/../test/file
        do
            printf "%-30s => %s\n" $p $(normalize_path $p)
        done
    fi
fi


#####################################################################

# vim: tabstop=4: shiftwidth=4: noexpandtab:
# kate: tab-width 4; indent-width 4; replace-tabs false;

Since older versions of bash don't support regular expressions, the second version does the same thing using sed instead:

#!/bin/bash
#
# Usage: normalize_path PATH
#
# Remove /./ and dir/.. sequences from a pathname and write result to stdout.

function normalize_path()
{
    # Remove all /./ sequences.
    local   path=${1//\/.\//\/}
    
    # Remove the first dir/.. sequence.
    local   npath=$(echo "$path" | sed -e 's;[^/][^/]*/\.\./;;')

    # Remove any remaining dir/.. sequences.
    while [[ $npath != "$path" ]]
    do
        path=$npath
        npath=$(echo "$path" | sed -e 's;[^/][^/]*/\.\./;;')
    done
    echo "$path"
}

if [[ $(basename "$(basename "$0" .sh)" .old) == 'normalize_path' ]]; then
    if [[ "$*" ]]; then
        for p in "$@"
        do
            printf "%-30s => %s\n" $p $(normalize_path $p)
        done
    else
        for p in /test/../test/file test/../test/file .././test/../test/file
        do
            printf "%-30s => %s\n" $p $(normalize_path $p)
        done
    fi
fi


#####################################################################

# vim: tabstop=4: shiftwidth=4: noexpandtab:
# kate: tab-width 4; indent-width 4; replace-tabs false;
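
To see what a single pass of the sed command does before the loop converges (just an illustration):

$ echo 'a/b/c/../d/../e' | sed -e 's;[^/][^/]*/\.\./;;'
a/b/d/../e

The while loop repeats the substitution until the path stops changing, which here yields a/b/e.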

You can run the script directly and it runs a few tests:

$ bash normalize_path.sh
/test/../test/file             => /test/file
test/../test/file              => test/file
.././test/../test/file         => ../test/file

You can also pass in test cases on the command line:

$ bash normalize_path.sh ../d1/./d2/../f1 a/b/c/../d/../e
../d1/./d2/../f1               => ../d1/f1
a/b/c/../d/../e                => a/b/e

Normalized path names are rarely necessary, but they're often easier to comprehend at a glance.

______________________

Mitch Frazier is an Associate Editor for Linux Journal.

Comments

Using POSIX index arrays

Posted by Jon Disnard

I wrote about the same thing on my website.
https://sites.google.com/site/jdisnard/realpath

This is similar to what "S. Chauveau" posted, except this is pure POSIX code. Since the shell has positional parameters, it does not require an array. Note that this code will also naively dereference symbolic links.

No bashisms; it will work in ksh or any POSIX-compliant shell.

real_path () {
  FOO=''      # start from an empty result; FOO accumulates the resolved path
  OIFS=$IFS
  IFS='/'
  for I in $1
  do
    # Resolve relative path punctuation.
    if [ "$I" = "." ] || [ -z "$I" ]
      then continue
    elif [ "$I" = ".." ]
      then FOO="${FOO%%/${FOO##*/}}"
           continue
      else FOO="${FOO}/${I}"
    fi

    # Dereference symbolic links.
    if [ -h "$FOO" ] && [ -x "/bin/ls" ]
      then IFS=$OIFS
           set `/bin/ls -l "$FOO"`
           while shift ;
           do
             if [ "$1" = "->" ]
               then FOO=$2
                    shift $#
                    break
             fi
           done
    fi
  done
  IFS=$OIFS
  echo "$FOO"
}
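
For illustration, a hypothetical run of the function above, assuming /usr/bin and /usr/lib exist and are not symbolic links:

$ real_path /usr/./bin/../lib
/usr/lib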

useful

Posted by Kendo

You could also use parameter expansions for a portable solution, but I'm not sure how efficient that would be compared to a call to sed.
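
A rough sketch of that idea, using only parameter expansion in the loop (the name normalize_pe is made up here, and the sketch is only lightly tested; it doesn't handle every corner case, consecutive .. components for example):

normalize_pe() {
    # Remove /./ sequences, as in the article's function.
    local path=${1//\/.\//\/}
    local head tail
    while :; do
        head=${path%%/../*}              # text before the first /../
        [ "$head" = "$path" ] && break   # no /../ left, so we're done
        tail=${path#*/../}               # text after the first /../
        case $head in
            */*) path=${head%/*}/$tail ;;   # drop head's last component
            '')  path=/$tail ;;             # path began with /../
            *)   path=$tail ;;              # head was a single component
        esac
    done
    echo "$path"
}

For example, normalize_pe a/b/c/../d/../e prints a/b/e without spawning any external processes.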

Absolute path

Posted by bhoejte

I have never had cause to normalize a path, but I have had cause to need an absolute path. Incidentally, it also seems to normalize the path.

Written in ksh, but it seems to work in bash as well. It's not entirely shell script, as sed does most of the heavy lifting.

function AbsolutePath
{
  ABSFILE="${1:-$0}"
  {
  if [[ "${ABSFILE#/}" = "${ABSFILE}" ]]; then
    echo "${PWD}/${ABSFILE}"
  else
    echo "${ABSFILE}"
  fi
  } | sed '
    :a
    s;/\./;/;g
    s;//;/;g
    s;/[^/][^/]*/\.\./;/;g
    ta'
  unset ABSFILE
}

I'm quite certain that a better solution now exists in Linux (I mean... there must be...), but this was written 15/11-2002 on Tru64, and has worked since. Just be careful not to rely on an ABSFILE variable of your own, as it will be unset at the end (I didn't have enough experience to make it local).
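
For instance, a hypothetical run with /tmp as the current directory:

$ cd /tmp
$ AbsolutePath ../d1/./d2/../f1
/d1/f1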

Too much code

Posted by Shantanu Gadgil

I agree with the previous poster ... I have never required this ... I rely on "readlink -f".
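
For paths that actually exist on disk, readlink -f both absolutizes and normalizes the path. For example, on a typical Linux system:

$ readlink -f /usr/bin/../lib
/usr/lib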

Arrays can also be used

Posted by S. Chauveau

Another way to do something like that in bash is by using arrays and substrings.
I won't post the complete code, but here is the idea:

# use this to split the path into parts using / as separator
OLD_IFS="$IFS" ; IFS="/" ; PARTS=($DIR) ; IFS="$OLD_IFS"

# Detect an absolute path by checking the 1st character
if [ "${DIR:0:1}" == "/" ] ; then PREFIX="/" else PREFIX="" fi

# Process the array according to your needs
nb=${#PARTS[*]}
i=0
for ((i=0;i<nb;i++)) ; do
    part="${PARTS[$i]}"
    if [ ... ] ; then
        unset PARTS[$i]
    fi
done

# and finally reconstruct the path

IFS='/' ; DIR="$PREFIX${PARTS[*]}" ; IFS="$OLD_IFS"

The advantage of that code over the 'sed' version is that it does not need to spawn any Unix processes.
It is pure bash.
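
A hedged guess at how the elided test in the loop might be filled in (the handling of ".", empty, and ".." components below is my addition, not the original poster's code, and it is only lightly tested):

for ((i=0;i<nb;i++)) ; do
    part="${PARTS[$i]}"
    if [ "$part" = "." ] || [ -z "$part" ] ; then
        # Drop "." components and empty components (from // or a trailing /).
        unset PARTS[$i]
    elif [ "$part" = ".." ] ; then
        # Drop this ".." and the nearest earlier component that is still set.
        for ((j=i-1;j>=0;j--)) ; do
            if [ -n "${PARTS[$j]+set}" ] ; then
                unset PARTS[$j] PARTS[$i]
                break
            fi
        done
    fi
done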

Simpler way

Posted by Anatoli Sakhnik

For existing paths it'd be simpler to use `cd "$path" && pwd` or `readlink -f "$path"`.
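
A minimal sketch of the first suggestion (it only works when $path names an existing directory):

abs=$(cd "$path" && pwd)   # abs is empty if the directory can't be entered
echo "$abs"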

Thank you for this suggestion

Posted by Anonymous Coward

(cd $path && pwd ) is the suggestion I needed. Thanks!

I use that, but

Posted by TesserId

I've used that a lot. Sadly, it won't normalize relative paths. :(

Nice but... I cannot

Posted by Anonymous

Nice but... I cannot remember when, in the past 8 years of using Unix-based operating systems, the need for such a script arose. No offense, though. Probably a good code example for shell scripting and regular expressions, but kind of a "solution in search of a problem".

+1

Posted by Ratso

My thoughts exactly. I read the article and in the end asked myself: when did I ever need this in my 10+ years of Linux experience? Never! How can you even end up with 'bad' paths, if not by bad scripting? To me, this article is just another 'regex show-off', which is sad. I almost never read other people's regular expressions; there's no point. I either write them off the top of my head, or consult the man pages some more and then write them ;) To me, there's no point in reading a regex: you won't understand it if you don't know the regex rules, but if you know the rules, you can write the regex yourself. Only when bug-hunting do I read "foreign" regexes.
