Work the Shell - Scripting Common File Rename Operations

If you find yourself always typing the same set of commands, it's time to write a script. This month, it's a script to rename and renumber files.

I'm guessing that each of us uses the command line differently and seeks to accomplish different tasks. Mine are sometimes very specialized, like the script I wrote that lets me easily transform the unique filenames from the Mac OS X built-in screen-capture utility into a Web-friendly format.

In the past few weeks, I realized I needed another fairly specialized script for file renaming, but this time, I wanted to write something as generally useful as possible.

There's already a utility included in some flavors of Linux called rename, but, alas, I couldn't find it on my Linux/NetBSD systems. If you have it, it probably duplicates the functionality I create this month. Still, read on. Hopefully, this'll be useful and interesting!

Rename/Pattern/Newpattern

It's surprising how often I find myself on the command line typing in something like:

for name in xx*
do
    new="$(echo "$name" | sed 's/xx/yy/')"
    mv "$name" "$new"
done

So, that's the first part of the script I want to create, one that lets me just specify the OLD and NEW filename patterns, then simply renames all files matching “OLD” with the “NEW” pattern substituted.

For example, say I have test-file-1.txt and test-file-2.jpg and want to replace “test-file” with “demo”. The goal is to have an invocation like:

rename test-file demo

and have it do all the work for me. Sound good?

How Many Matching Files?

The first step is actually the most difficult: matching an arbitrary pattern and catching any possible error conditions gracefully. The loop is going to end up looking like this:

for name in $1*

If there aren't any matches, however, you get an ugly error message and the script looks amateurish. So, the goal is actually to ascertain before the for loop how many matches there are to that given pattern.

Ah, okay, so ls $1* | wc -l does the trick, right? Nope, that'll still generate the same ugly error message.

Fortunately, there's a way in Bash that you can redirect stderr to go to stdout (that is, to have your error messages appear as standard messages, able to be rerouted, piped and so on).

The test for the number of matches, thus, can be done like this:


matches="$(ls -1 $1* 2>&1 | wc -l)"

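I know, it's complicated. Worse, a quick test reveals that when there are zero matches, ls -1 actually generates an error message: ls: No such file or directory. Because stderr is now merged into stdout, wc -l counts that message as a line and reports one match instead of zero. That's not good. The solution? Add a grep to the sequence: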


matches="$(ls -1 $1* 2>&1 | grep -v "No such file" | wc -l)"

That's even more complicated, but it works exactly as we'd like. “matches” is zero in the situation where there aren't any matches; otherwise, it has the number of matching files and folders for the given pattern.
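As an alternative sketch (not what the script itself does), the shell's own filename expansion can do the counting without parsing ls output. The function name count_matches is made up for illustration. It relies on the default shell behavior that an unmatched pattern is left as a literal string, so the first "result" won't exist as a file:

```shell
# count_matches PREFIX
# Prints how many files or directories in the current directory have
# names starting with PREFIX. (count_matches is a hypothetical helper,
# not part of the original script.)
count_matches() {
    local prefix="$1"
    # The unquoted * is expanded by the shell; "$prefix" stays quoted
    # so spaces in the pattern survive. Inside a function, set --
    # only replaces the function's own positional parameters.
    set -- "$prefix"*
    # With no matches, the pattern is left unexpanded, so check that
    # the first result really exists (or is a dangling symlink).
    if [ -e "$1" ] || [ -L "$1" ] ; then
        echo $#
    else
        echo 0
    fi
}
```

It would be called as matches="$(count_matches "$1")". Unlike the ls pipeline, this counts a matching directory once rather than listing what's inside it.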

A test now lets us produce a meaningful and informative error message:

if [ $matches -eq 0 ] ; then
    echo "Error: no files match pattern $1*"
    exit 0
fi

Because we're looking at stderr versus stdout, we also could more properly route that error message to stderr with >&2, and to be totally correct, we should exit with a nonzero error code to indicate that the script failed to execute properly. I'll leave those tweaks as an exercise for the reader.
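Those two tweaks might look something like this. It's only a sketch: check_matches is a hypothetical helper name, and the value 1 is just an arbitrary nonzero choice:

```shell
# check_matches COUNT PATTERN
# Returns 0 when COUNT is nonzero; otherwise prints the error on
# stderr and returns 1. (check_matches is a hypothetical helper name.)
check_matches() {
    if [ "$1" -eq 0 ] ; then
        echo "Error: no files match pattern $2*" >&2
        return 1
    fi
}
```

In the script, it could be invoked as check_matches "$matches" "$1" || exit 1, so a caller or another script can detect the failure from the exit status.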

Now that we know we'll never hit the for loop without at least one match, the core code is straightforward:

for name in $1*
do
    new="$(echo "$name" | sed "s/$1/$2/")"
    mv "$name" "$new"
done

Notice in this instance that you can't use single quotes around the sed expression within the $( ) command substitution; if you do, the shell won't expand $1 and $2, and sed will see them as literal text.
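A quick experiment shows the difference. The values below are made up to mirror the earlier test-file example, with set -- standing in for the script's command-line arguments:

```shell
# Simulate the script's arguments: OLD pattern in $1, NEW pattern in $2.
set -- test-file demo
name="test-file-1.txt"

# Single quotes: sed receives the literal text s/$1/$2/, which never
# matches, so the name comes back unchanged.
unchanged="$(echo "$name" | sed 's/$1/$2/')"

# Double quotes: the shell substitutes $1 and $2 first, so sed
# receives s/test-file/demo/.
renamed="$(echo "$name" | sed "s/$1/$2/")"

echo "$unchanged"   # test-file-1.txt
echo "$renamed"     # demo-1.txt
```

One caveat with the double-quoted form: any characters in the patterns that are special to sed, such as a slash or a dot, are interpreted by sed rather than taken literally.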

We certainly could just stop here and have a useful little script, but I'm into wicked cool scripts, so let's push on, shall we?

Sequential File Numbering

The other feature I constantly find myself needing is the ability to number a series of files sequentially. For example, a final set of photos from a photo shoot might be DSC1017, DSC1019, DSC1023 and DSC1047. It would be more useful to be able to renumber those before sending them to a client, so that they're DSC-1, DSC-2, DSC-3 and so on.

This is pretty easily accomplished too, now that we have a script that renames a sequence of files. Here's how I accomplish it in the script itself:

if [ $renumber -eq 1 ] ; then
    suffix="$(echo "$name" | cut -d. -f2- | tr '[A-Z]' '[a-z]')"
    new="$2$count.$suffix"
    count=$(( $count + 1 ))
    mv "$name" "$new"
    chmod a+r "$new"
fi

Here I am expecting to replace the entire filename, so I strip out and save the filename suffix (for example, DSC1015.JPG becomes JPG), so I can re-attach it later. While I'm at it, filename suffixes also are normalized to all lowercase using the handy tr command.

The count variable keeps track of what number we're on, and notice the built-in shell notation of $(( )) for mathematical calculations.

Finally, the new filename is built from the new pattern ($2), plus the count ($count), plus the filename suffix ($suffix) in this line:

new="$2$count.$suffix"

The two conditions need to be merged, however, so the final script ends up with an if-then-else-fi structure.
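The merged loop isn't shown here, so the following is a guess at its general shape rather than the actual script. The function name rename_matching, the starting count of 1 and the default of 0 for renumber are all assumptions:

```shell
# rename_matching OLD NEW
# Renames every file whose name starts with OLD. When the variable
# renumber is 1, files are numbered sequentially with NEW as the base
# name; otherwise OLD is replaced by NEW within each name.
# (Hypothetical wrapper; only the two branches come from the article.)
rename_matching() {
    local old="$1"
    local new_pattern="$2"
    local count=1            # assumed starting number
    local name suffix new
    for name in "$old"*
    do
        if [ "${renumber:-0}" -eq 1 ] ; then
            # Keep everything after the first dot, lowercased.
            suffix="$(echo "$name" | cut -d. -f2- | tr '[A-Z]' '[a-z]')"
            new="$new_pattern$count.$suffix"
            count=$(( count + 1 ))
            mv "$name" "$new"
            chmod a+r "$new"
        else
            new="$(echo "$name" | sed "s/$old/$new_pattern/")"
            mv "$name" "$new"
        fi
    done
}
```

Because the shell expands the pattern once, before the loop starts, files renamed along the way aren't picked up a second time.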

I can't leave well enough alone, so I continued to tweak the script by adding a few starting flags too. To parse it all, our friend getopt is utilized:

args=$(getopt npt $*)

if [ $? != 0 -o $# -lt 2 ] ; then
    echo "Usage: $(basename $0) {-p} {-n} {-t} PATTERN NEWPATTERN"
    echo ""
    echo " -p  rewrites PNG to png"
    echo " -n  sequentially numbers matching files with"
    echo "     NEWPATTERN as base filename"
    echo " -t  test mode: show what you'll do, don't do it."
    exit 0
fi

set -- $args
for i
do
    case "$i" in
    -n ) renumber=1 ; shift ;;
    -p ) fixpng=1   ; shift ;;
    -t ) doit=0     ; shift ;;
    -- ) shift      ; break ;;
    esac
done

I've written about getopt and its complicated usage in shell scripts before if you want to read up on it [see “Parsing Command-Line Options with getopt” in the July 2009 issue of LJ, www.linuxjournal.com/article/10495]. Note that three flags are available to the script user: -n invokes the renumbering capability (which means the filenames are discarded, remember); -p is a special case where .PNG also is rewritten as .png; and -t is a sort of “echo-only” mode where the rename doesn't actually happen, the script just shows what it would do based on the patterns given.
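Test mode is easy to sketch. Assuming doit defaults to 1 (the listing above shows only -t setting it to 0), a small wrapper can decide whether to call mv or just report what would happen. The name do_rename and the wording of the message are invented for illustration:

```shell
doit=1    # assumed default; the -t flag sets this to 0

# do_rename OLDNAME NEWNAME
# Performs the rename, or only reports it when test mode is on.
# (do_rename is a hypothetical helper name.)
do_rename() {
    if [ "$doit" -eq 1 ] ; then
        mv "$1" "$2"
    else
        echo "Would rename $1 to $2"
    fi
}
```

The rename loop would then call do_rename "$name" "$new" wherever it currently calls mv directly.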

How am I using it now? Like this:

rename -n IMG_ iphone-copy-paste-

Every matching .PNG file (IMG_*) is renamed to “iphone-copy-paste-” followed by its sequence number, and as it proceeds, “PNG” is also rewritten as “png”.

The entire rename script can be found on the Linux Journal FTP server at ftp.linuxjournal.com/pub/lj/listings/issue199/10885.tgz.

Dave Taylor has been hacking shell scripts for a really long time, 30 years. He's the author of the popular Wicked Cool Shell Scripts and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.

Comments

A few problems

Keith Daniels

Hi Dave

I downloaded the script and it would not run. There were several syntax errors:

Line 14 was missing a double quote.
Line 30 had fi instead of esac (the case statement ending syntax)
Missing done after Line 30

Also, there were two other issues with using the script in renumbering mode, that you did not mention.

ONE - If there is no filename extension, then the original filename will be used as an extension name. For example, junk-1 through junk-11 will be renamed to:

newjunk1.junk-1
newjunk10.junk-9
newjunk2.junk-10
newjunk3.junk-11
newjunk4.junk-3
newjunk5.junk-4
newjunk6.junk-5
newjunk7.junk-6
newjunk8.junk-7
newjunk9.junk-8

Note the contents of the files (illustrated by the new extension) are no longer in the original sequence after being renamed.

TWO - If there is a digit change in the original numbering sequence (i.e., one digit to two, or three digits to four), the contents of the renamed files will no longer be in the same sorting sequence as the original files.

If all numbers in the file name have the same number of digits then there is no problem with the contents of the files remaining in the same sorting sequence. If the original files above had a 0 in front of the single digit numbers there would not have been a problem with the rename.

Anyone using the script should be aware of this sorting sequence issue. Because mv is used to rename the files, if the sorting sequence is important to you, the script could create a big mess that would be difficult, or at least time-consuming, to restore to the original sorting sequence.

You could replace mv with cp. If the script produced what you wanted, you could then delete the original files; if it did not, you could delete the new files and try again.

Other than all that... it worked fine.

"I have always wished that my computer would be as easy to use as my telephone.
My wish has come true. I no longer know how to use my telephone."
-- Bjarne Stroustrup

rename script

Jean-Pierre

ls -1 $1* will list directories and will descend into subdirectories. In some situations, this may well produce unexpected results.
