Shell Functions and Path Variables, Part 3
Suppose you log on to your UNIX system and discover, for reasons beyond your control, that PATH is full of duplicate entries. (Humour me. It does happen. Maybe your system administrator modified /etc/PATH unwisely.) Let's assume these duplicates are making your PATH undesirably long. Is there anything you can do to clean things up? Yes, you can type at the prompt:
$ uniqpath
This will remove any duplicate entries from your path, leaving the order of the remaining pathels intact. For example:
$ NEWP=fred:bill:steve:fred:dave:bill
$ uniqpath -p NEWP
$ echo $NEWP
fred:bill:steve:dave
Let's skip the options-handling code again, and look at the meat:
npath=$(listpath -p $pathvar | awk '{seen[$0]++;
if (seen[$0]==1){print}}')
eval $pathvar=$(makepath "$npath")
As usual, $pathvar contains the name of the
pathvar we want to modify. The code is rather similar to that of
delpath. The first line generates a variable
(npath) containing the unique path elements, and
the second line rebuilds the pathvar from those elements using
makepath. We don't use an external file to store the pathels, but
keep everything in shell variables. This is done in order to
demonstrate an alternative technique—there is no deeper reason.
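To try the two lines in isolation, here is a minimal self-contained sketch. The helpers listpath_sketch and makepath_sketch are hypothetical stand-ins written for this illustration, not the real listpath and makepath from this series, and the sketch takes the variable name as a plain argument instead of the -p option.

```shell
# Hypothetical stand-ins for the article's helpers; the real listpath and
# makepath are defined elsewhere in the series and may differ.
listpath_sketch() {
    # Print the value of the variable named by $1, one path element per line.
    eval "printf '%s\n' \"\$$1\"" | tr ':' '\n'
}

makepath_sketch() {
    # Join the lines on standard input into one colon-separated string.
    paste -s -d: -
}

uniqpath_sketch() {
    pathvar=$1
    # Keep only the first occurrence of each element, preserving order.
    npath=$(listpath_sketch "$pathvar" | awk '{seen[$0]++; if (seen[$0]==1){print}}')
    # Assign the rebuilt string back to the variable named by $pathvar.
    eval "$pathvar=\$(printf '%s\n' \"\$npath\" | makepath_sketch)"
}

NEWP=fred:bill:steve:fred:dave:bill
uniqpath_sketch NEWP
echo "$NEWP"    # prints fred:bill:steve:dave
```

The stand-ins assume path elements contain no newlines, which is also what the line-per-element approach in the article relies on.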
The first line runs listpath to break the pathvar into separate lines and pipes them through an awk filter which removes duplicate pathels. You may be wondering why we don't just use the uniq program instead of awk's magic. It's because uniq removes duplicate lines from its input only if they happen to be adjacent. In our case, the duplicate pathels will generally not be adjacent, so uniq won't work. “Aha,” you say, “why not use sort -u? That will sort the lines and remove duplicates.” True enough, but sorting would also change the directory search order if we ran uniqpath on PATH. Usually, people care about the order in which their PATH directories are searched, and it's a bad idea to modify it.
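The difference between the three filters can be checked directly. The following sketch feeds the NEWP elements to each one; the variable names are made up for this illustration, and paste is used only to print each result on one line.

```shell
pathels=$(printf '%s\n' fred bill steve fred dave bill)

# uniq only collapses adjacent duplicates, so nothing is removed here.
with_uniq=$(printf '%s\n' "$pathels" | uniq | paste -s -d: -)

# sort -u removes the duplicates but reorders the elements.
with_sort=$(printf '%s\n' "$pathels" | LC_ALL=C sort -u | paste -s -d: -)

# The awk filter removes the duplicates and keeps first-seen order.
with_awk=$(printf '%s\n' "$pathels" |
    awk '{seen[$0]++; if (seen[$0]==1){print}}' | paste -s -d: -)

echo "uniq:    $with_uniq"    # fred:bill:steve:fred:dave:bill
echo "sort -u: $with_sort"    # bill:dave:fred:steve
echo "awk:     $with_awk"     # fred:bill:steve:dave
```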
Thus, we have the awk solution. This uses a powerful feature of awk known as an associative array (or a hash, if you have a Perl background). If you're a C programmer, you'll know what an array is: a group of objects of the same type, indexed by an integer. The contents of an array can be accessed by expressions like values[0] or values[20], which refer to the first and twenty-first elements, respectively. A hash is rather like an array which can be indexed by an arbitrary string of characters. So, in awk notation, we could write
age["bill"]=27
to assign 27 to the hash element indexed by the string bill in the hash called age. Let's look at the awk code shown above.
Between the single quotes, we have a block of code run each time awk reads a new line from its standard input. When awk reads a line, it is stored in a special variable called $0, and we use $0 as an index into a hash called seen. (We haven't declared this anywhere; that's okay in awk. Variables spring into existence, with numerical value 0, when they appear in the code.) We use the seen hash to tell us whether awk has already seen an identical line of input since it started executing. Let's see what happens in the NEWP example shown above.
First, listpath splits NEWP into lines containing the following strings: “fred”, “bill”, “steve”, “fred”, “dave” and “bill”, which are read in that order by awk. awk stores each line it reads in $0, so $0 takes on the values “fred”, “bill” and so on, in turn. Each time a line is read, the corresponding element of the seen hash is incremented (by the statement seen[$0]++), and the line is printed only if its count is then exactly 1 (by the print statement in the if block, which prints $0 to standard output by default). Consider the hash element seen["fred"]: it is initially 0, is set to 1 when awk reads the first “fred” line, remains at 1 for the next two lines, and is set to 2 when awk reads the second “fred” line. The “fred” line is therefore printed only the first time it is seen. C programmers should note how syntactically elegant this solution is and how little code is required when compared to the equivalent in C.
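The walk-through above can be reproduced with a small tracing variant of the filter. This is only an illustration: the action variable and the output format are additions made here, and the sketch prints the counter and the decision for every line instead of printing the line itself.

```shell
trace=$(printf '%s\n' fred bill steve fred dave bill |
    awk '{
        seen[$0]++                                   # count this line
        action = (seen[$0] == 1) ? "print" : "skip"  # first sighting only
        printf "%s seen=%d %s\n", $0, seen[$0], action
    }')
echo "$trace"
# fred seen=1 print
# bill seen=1 print
# steve seen=1 print
# fred seen=2 skip
# dave seen=1 print
# bill seen=2 skip
```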
The final pathvar function we're going to see is edpath. This breaks the pathels in a pathvar into separate lines, writes them to a temporary file and runs an editor on that file. You can edit the pathels to your heart's content and quit from the editor when you're finished. The pathvar is then reconstructed from the modified lines in the file. edpath allows you to perform arbitrary modifications on a pathvar. I use it most often when I wish to swap the order of directories in PATH.
The code for edpath is fairly straightforward (ignoring once again the boring details of option handling):
TEMP=/tmp/edpath.out.$$
VAR=\$$pathvar                       # VAR="$LIBPATH" for example
eval export OLD$pathvar=$VAR         # store old path in e.g. OLDPATH
listpath -p $pathvar > $TEMP         # write path elements to file
${EDITOR:-vi} $TEMP                  # edit the file
eval $pathvar=$(makepath < $TEMP)    # reconstruct path
/bin/rm -f $TEMP                     # remove temporary file
Let's skip the first three lines for now. The real work is done by the block of code starting with listpath. This follows a pattern similar to that of delpath and uniqpath. First, we separate the pathels in the pathvar using listpath, but this time, we redirect the output into a temporary file. The next line edits that file. The expression ${EDITOR:-vi} may be unfamiliar; it means “Use the value of the EDITOR variable if it is set and non-null, else use vi.” This allows the user to specify his favourite editor by setting the EDITOR environment variable (to Emacs, perhaps) but uses vi if he has not done so. Note that the edit command is run in the foreground, so the shell will wait until the editor process terminates before running any more commands from the shell function. When this occurs, the modified pathvar will be reconstructed by the line starting with eval. If you read the description of delpath given above, you'll know how this line works.
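The behaviour of the ${EDITOR:-vi} expansion is easy to check at the prompt. The sketch below covers the three cases; the variable names unset_case, empty_case and set_case are invented for this illustration, and the snippet changes EDITOR in the current shell, so run it in a scratch session.

```shell
unset EDITOR
unset_case=${EDITOR:-vi}     # EDITOR unset: falls back to vi

EDITOR=
empty_case=${EDITOR:-vi}     # EDITOR set but empty: still falls back to vi

EDITOR=emacs
set_case=${EDITOR:-vi}       # EDITOR non-empty: its value is used

echo "$unset_case $empty_case $set_case"    # prints: vi vi emacs
```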
Lines 2 and 3 of the code are a safety net. They store the initial value of the pathvar to be edited in a new environment variable. If the user is editing PATH, for example, then the code creates a variable called OLDPATH. If the user makes unwanted modifications to her PATH, she can simply type:
$ PATH=$OLDPATH
and all will be well.
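The safety net can be exercised on its own. In this sketch, MYPATH is a hypothetical variable standing in for whichever pathvar is being edited, and the "unwanted edit" is simulated with a plain assignment in place of an editor session.

```shell
pathvar=MYPATH                  # hypothetical variable name for this demo
MYPATH=/usr/local/bin:/usr/bin

VAR=\$$pathvar                  # VAR now holds the literal text $MYPATH
eval export OLD$pathvar=$VAR    # eval runs: export OLDMYPATH=$MYPATH

MYPATH=/oops                    # simulate an unwanted edit
MYPATH=$OLDMYPATH               # restore from the saved copy
echo "$MYPATH"                  # prints /usr/local/bin:/usr/bin
```

Because the saved value passes through eval unquoted, this works cleanly only when the path contains no spaces or shell metacharacters.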
Comments
Source Code Availability - pathfunc.tgz
Great article! Can the code be made available?? Email will do.
Thanks,
Rick
Where's the source code?
The ftp link at the end is password protected. Shouldn't it be open to the public?