Normalizing Path Names with Bash

The bash function presented here normalizes path names. By normalize, I mean that it removes unneeded /./ and dir/../ sequences. For example, ../d1/./d2/../f1 normalizes to ../d1/f1.

The first version of the function uses bash regular expressions. The /./ sequences are removed first during variable expansion with substitution by the line:

local   path=${1//\/.\//\/}
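To make the substitution concrete: in ${var//pattern/replacement} the pattern is a glob rather than a regular expression, so the dot in \/.\/ is a literal dot and the doubled slash after the variable name means "replace every match". One caveat is that matches cannot overlap, so back-to-back sequences such as a/././b leave one /./ behind after a single pass. The following is only a sketch of one way to handle that; strip_dot is a hypothetical helper name, not part of the article's script.

```shell
#!/bin/bash
# strip_dot is a hypothetical helper used only for this illustration.
strip_dot()
{
    local p=$1

    # A single pass replaces non-overlapping matches only, so a/././b
    # becomes a/./b. Repeat until no /./ sequence remains.
    while [[ $p == */./* ]]
    do
        p=${p//\/.\//\/}
    done
    echo "$p"
}

strip_dot '../d1/./d2/../f1'    # prints ../d1/d2/../f1
strip_dot 'a/././b'             # prints a/b
```

For paths with at most one /./ between components, the single expansion used in the article gives the same result.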

The dir/.. sequences are removed by the loop:

while [[ $path =~ ([^/][^/]*/\.\./) ]]
do
    path=${path/${BASH_REMATCH[0]}/}
done

Each time a dir/.. match is found, variable expansion with substitution is used to remove the matched part of the path.
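To see what the loop does on each pass, here is a small tracing sketch. It uses the same expansion and the same regular expression as above; trace_normalize is a hypothetical name chosen for this illustration, and the matched text is quoted inside the substitution so that it is removed literally.

```shell
#!/bin/bash
# trace_normalize is a hypothetical helper that prints each step of the loop.
trace_normalize()
{
    local path=${1//\/.\//\/}
    echo "start:    $path"

    # Each pass removes the leftmost dir/../ match reported in BASH_REMATCH.
    while [[ $path =~ ([^/][^/]*/\.\./) ]]
    do
        echo "matched:  ${BASH_REMATCH[0]}"
        path=${path/"${BASH_REMATCH[0]}"/}
    done
    echo "result:   $path"
}

trace_normalize 'a/b/c/../d/../e'
```

For the sample path, the trace shows two matches, c/../ and then d/../, before the loop ends with a/b/e.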

Regular expressions were introduced in bash 3.0. Bash 3.2 changed regular expression handling: quoting the pattern on the right side of =~ now forces it to be matched as a literal string instead of as a regular expression. So, if you have an older version of bash with regular expression support (3.0 or 3.1) and the code doesn't work, put the regular expression in the while loop in quotes.
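A commonly suggested way to sidestep the quoting question is to keep the pattern in a variable and expand it unquoted on the right side of =~. The sketch below assumes bash 3.2 or later for the second test; has_dotdot and re are hypothetical names used only for this illustration.

```shell
#!/bin/bash
# The pattern lives in a variable, so no quoting is needed inside [[ ]].
re='[^/][^/]*/\.\./'

# has_dotdot is a hypothetical helper: succeeds if the path has a dir/../ part.
has_dotdot()
{
    [[ $1 =~ $re ]]
}

if has_dotdot 'a/b/../c'; then
    echo "matched: ${BASH_REMATCH[0]}"
fi

# On bash 3.2 and later, quoting the expansion makes the match literal,
# so this test does not succeed for an ordinary path.
if [[ 'a/b/../c' =~ "$re" ]]; then
    echo 'quoted: matched'
else
    echo 'quoted: no match'
fi
```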

The entire function and some test code follows:

#!/bin/bash
#
# Usage: normalize_path PATH
#
# Remove /./ and dir/.. sequences from a pathname and write result to stdout.

function normalize_path()
{
    # Remove all /./ sequences.
    local   path=${1//\/.\//\/}
    
    # Remove dir/.. sequences.
    while [[ $path =~ ([^/][^/]*/\.\./) ]]
    do
        path=${path/${BASH_REMATCH[0]}/}
    done
    echo "$path"
}

if [[ $(basename "$0" .sh) == 'normalize_path' ]]; then
    if [[ "$*" ]]; then
        for p in "$@"
        do
            printf "%-30s => %s\n" "$p" "$(normalize_path "$p")"
        done
    else
        for p in /test/../test/file test/../test/file .././test/../test/file
        do
            printf "%-30s => %s\n" "$p" "$(normalize_path "$p")"
        done
    fi
fi


#####################################################################

# vim: tabstop=4: shiftwidth=4: noexpandtab:
# kate: tab-width 4; indent-width 4; replace-tabs false;

Since older versions of bash don't support regular expressions, the second version does the same thing using sed instead:

#!/bin/bash
#
# Usage: normalize_path PATH
#
# Remove /./ and dir/.. sequences from a pathname and write result to stdout.

function normalize_path()
{
    # Remove all /./ sequences.
    local   path=${1//\/.\//\/}
    
    # Remove first dir/.. sequence.
    local   npath=$(echo "$path" | sed -e 's;[^/][^/]*/\.\./;;')
    
    # Remove remaining dir/.. sequence.
    while [[ $npath != "$path" ]]
    do
        path=$npath
        npath=$(echo "$path" | sed -e 's;[^/][^/]*/\.\./;;')
    done
    echo "$path"
}

if [[ $(basename "$(basename "$0" .sh)" .old) == 'normalize_path' ]]; then
    if [[ "$*" ]]; then
        for p in "$@"
        do
            printf "%-30s => %s\n" "$p" "$(normalize_path "$p")"
        done
    else
        for p in /test/../test/file test/../test/file .././test/../test/file
        do
            printf "%-30s => %s\n" "$p" "$(normalize_path "$p")"
        done
    fi
fi


#####################################################################

# vim: tabstop=4: shiftwidth=4: noexpandtab:
# kate: tab-width 4; indent-width 4; replace-tabs false;

You can run the script directly and it runs a few tests:

$ bash normalize_path.sh
/test/../test/file             => /test/file
test/../test/file              => test/file
.././test/../test/file         => ../test/file

You can also pass in test cases on the command line:

$ bash normalize_path.sh ../d1/./d2/../f1 a/b/c/../d/../e
../d1/./d2/../f1               => ../d1/f1
a/b/c/../d/../e                => a/b/e

Normalized path names are never necessary but they're often easier to comprehend at a glance.

______________________

Mitch Frazier is an Associate Editor for Linux Journal.

Comments

Using POSIX index arrays

Jon Disnard

I wrote about the same thing on my website.
https://sites.google.com/site/jdisnard/realpath

This is similar to what S. Chauveau posted, except that this is pure POSIX code.
Also, since the shell has positional parameters, it does not require an array. This code will also naively dereference a symbolic link.

No bashisms; it will work in ksh or any POSIX-compliant shell.

real_path () {
  FOO=            # reset the result so repeated calls do not accumulate
  OIFS=$IFS
  IFS='/'
  for I in $1
  do
    # Resolve relative path punctuation.
    if [ "$I" = "." ] || [ -z "$I" ]
      then continue
    elif [ "$I" = ".." ]
      then FOO="${FOO%%/${FOO##*/}}"
           continue
      else FOO="${FOO}/${I}"
    fi

    # Dereference symbolic links.
    if [ -h "$FOO" ] && [ -x "/bin/ls" ]
      then IFS=$OIFS
           set `/bin/ls -l "$FOO"`
           while shift ;
           do
             if [ "$1" = "->" ]
               then FOO=$2
                    shift $#
                    break
             fi
           done
    fi
  done
  IFS=$OIFS
  echo "$FOO"
}

useful

Kendo

You could also use parameter expansions for a portable solution, but I'm not sure how efficient that would be compared to a call to sed.

Absolute path

bhoejte

I have never had cause to normalize a path, but I have needed an absolute path. Incidentally, this also seems to normalize the path.

Made in ksh, but seems to work in bash also. Not entirely shell script, as sed is a heavy component.

function AbsolutePath
{
  ABSFILE="${1:-$0}"
  {
  if [[ "${ABSFILE#/}" = "${ABSFILE}" ]]; then
    echo "${PWD}/${ABSFILE}"
  else
    echo "${ABSFILE}"
  fi
  } | sed '
    :a
    s;/\./;/;g
    s;//;/;g
    s;/[^/][^/]*/\.\./;/;g
    ta'
  unset ABSFILE
}

I'm quite certain that a better solution now exists in Linux (I mean... there must be...), but this was made 15/11-2002 on Tru64, and has worked since. Just be careful not to use an ABSFILE variable, as it will be unset at the end (not enough experience to localize it).

Too much code

Shantanu gadgil

I agree with previous poster ... I have never required this ... I rely on "readlink -f"

Arrays can also be used

S. Chauveau

Another way to do something like that in bash is by using arrays and substrings.
I won't include the complete code, but here is the idea:

# use this to split the path into parts using / as separator
OLD_IFS="$IFS" ; IFS="/" ; PARTS=($DIR) ; IFS="$OLD_IFS"

# Detect an absolute path by checking the 1st character
if [ "${DIR:0:1}" == "/" ] ; then PREFIX="/" ; else PREFIX="" ; fi

# Process the array according to your needs
nb=${#PARTS[*]}
i=0
for ((i=0;i<nb;i++)) ; do
    part="${PARTS[$i]}"
    if [ ... ] ; then
        unset "PARTS[$i]"
    fi
done

# and finally reconstruct the path

IFS='/' ; DIR="$PREFIX${PARTS[*]}" ; IFS="$OLD_IFS"

The advantage of that code over the 'sed' version is that it does not need to spawn any external process.
It is pure bash.
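For readers who want something runnable, the idea above could be filled in roughly as follows. This is only one possible reading of the outline, not S. Chauveau's actual code: normalize_path_array and its local variable names are hypothetical, the elided test is replaced by a case statement, and kept components go into a second array instead of being unset in place. Unlike the regular expression versions, this sketch keeps leading .. components of a relative path, and it prints . when nothing is left.

```shell
#!/bin/bash
# normalize_path_array is a hypothetical name for this sketch.
normalize_path_array()
{
    local dir=$1 prefix='' part result
    local -a parts out=()

    # Detect an absolute path by checking the first character.
    [[ $dir == /* ]] && prefix='/'

    # Split the path into parts using / as the separator.
    IFS='/' read -r -a parts <<< "$dir"

    for part in "${parts[@]}"
    do
        case $part in
            ''|.)                       # empty or current-dir component: drop
                ;;
            ..)
                if (( ${#out[@]} > 0 )) && [[ ${out[${#out[@]}-1]} != '..' ]]; then
                    unset "out[${#out[@]}-1]"   # cancel the previous directory
                elif [[ -z $prefix ]]; then
                    out+=('..')                 # keep leading .. of a relative path
                fi
                ;;
            *)
                out+=("$part")
                ;;
        esac
    done

    # Rebuild the path; IFS is local, so the caller's value is untouched.
    local IFS='/'
    result="$prefix${out[*]}"
    echo "${result:-.}"
}

normalize_path_array '.././test/../test/file'   # prints ../test/file
normalize_path_array '../../a'                  # prints ../../a
```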

Simpler way

Anatoli Sakhnik

For existing paths it'd be simpler to use `cd "$path" && pwd` or `readlink -f "$path"`.
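As a small sketch of the cd and pwd variant, wrapping the commands in a subshell keeps the caller's working directory unchanged. The helper name abs_dir is hypothetical, pwd -P is used here so that the printed path has symbolic links resolved, and the approach only works for directories that exist.

```shell
#!/bin/bash
# abs_dir is a hypothetical helper: print the absolute, physical path of an
# existing directory, or fail if the directory cannot be entered.
abs_dir()
{
    ( cd -- "$1" 2>/dev/null && pwd -P )
}

abs_dir "$PWD/./." || echo "cannot resolve" >&2
```

Because the result is always absolute, this does not help when a normalized relative path is wanted.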

Thank you for this suggestion

Anonymous Coward

(cd $path && pwd ) is the suggestion I needed. Thanks!

I use that, but

TesserId

I've used that a lot. Sadly, it won't normalize relative paths. :(

Nice but... I cannot

Anonymous

Nice but... I cannot remember when in the past 8 years of me using unix-based operating systems, the need for such a script arose. No offense, though. Probably a good code example for shell scripting and regular expressions, but kind of "solution searches for problem".

+1

Ratso

My thoughts exactly. I've read the article, and then in the end asked myself - when did I ever need this in my 10+ years of linux experience? Never! How can you even end up with 'bad' paths, if not by bad scripting? To me, this article is just another 'regex show off', which is sad - I almost never read other people's regular expressions, there's no point - I either write them off the top of my head, or consult man pages some more and then write them ;) To me, there's no point in reading regex - you won't understand it if you don't know regex rules, but if you know the rules, you can write regex yourself. Only when bug-hunting do I read "foreign" regex.
