Bash: Redirecting Input from Multiple Files
October 14th, 2008 by Mitch Frazier
Recently I needed to create a script that processed two input files. By processed I mean that the script needed to get a line from one file, then get a line from the second file, and then do something with them. Sounds easy enough, but it's not that easy unless you know about some of bash's extended redirection capabilities.
For the sake of this example, let's say that we want to implement a simple version of the paste command as a bash script. The paste command reads a line from each of its input files and then pastes them together and writes the combined result to stdout as a single line. Our example version will only do this for two input files. Plus it won't do any error checking and it will assume that the files contain the same number of lines.
Our input files, file1 and file2, are:
$ cat file1
f1 1
f1 2
f1 3
f1 4
$ cat file2
f2 1
f2 2
f2 3
f2 4
Your first thought might be something like this:
#!/bin/bash
while read f1 <"$1"
do
    read f2 <"$2"
    echo "$f1 $f2"
done
That doesn't work; it prints the first line of each file over and over:
$ bash paste-bad.sh file1 file2
f1 1 f2 1
f1 1 f2 1
f1 1 f2 1
f1 1 f2 1
f1 1 f2 1
f1 1 f2 1
f1 1 f2 1
...
Ctrl-C
That's because each redirection here starts anew: it reopens the file and reads the first line again, so you get an endless loop.
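The reopening is easy to see in isolation. In this minimal sketch (the function name first_line_twice and the temporary file are only for illustration), two reads that each carry their own redirection both return the first line:

```shell
#!/bin/bash
# Each "<file" opens the file afresh, so every read starts at line 1.
first_line_twice() {
    local a b
    read -r a <"$1"   # opens the file, reads line 1, closes it
    read -r b <"$1"   # opens it again: line 1 once more
    echo "$a / $b"
}

tmp=$(mktemp)
printf 'f1 1\nf1 2\n' >"$tmp"
first_line_twice "$tmp"   # prints: f1 1 / f1 1
rm -f "$tmp"
```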
Your next thought might be to read the files in one by one and then take the buffered data and paste it together afterwards:
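#!/bin/bash
# Buffer the first file in array f1.
i=0
while read line
do
    f1[$i]="$line"
    let i++
done <"$1"
# Buffer the second file in array f2.
i=0
while read line
do
    f2[$i]="$line"
    let i++
done <"$2"
# Paste the buffered lines together.
i=0
while [[ "${f1[$i]}" ]]
do
    echo "${f1[$i]} ${f2[$i]}"
    let i++
done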
$ bash paste-ok.sh file1 file2
f1 1 f2 1
f1 2 f2 2
f1 3 f2 3
f1 4 f2 4
But if you're trying to do something more complicated than pasting lines together, that approach might not be feasible, and in any case it's cumbersome.
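If you do go the buffering route, newer versions of bash can shorten it. The following is only a sketch: it assumes bash 4 or later (for the mapfile builtin) and files with the same number of lines, and the function name paste_buffered is made up for illustration. It loops over the array length rather than testing for a non-empty element, so an empty line does not end the loop early.

```shell
#!/bin/bash
# Buffer both files into arrays, then paste them line by line.
paste_buffered() {
    local f1 f2 i
    mapfile -t f1 <"$1"   # -t strips the trailing newline from each line
    mapfile -t f2 <"$2"
    for ((i = 0; i < ${#f1[@]}; i++)); do
        echo "${f1[i]} ${f2[i]}"
    done
}

a=$(mktemp); b=$(mktemp)
printf 'f1 1\nf1 2\n' >"$a"
printf 'f2 1\nf2 2\n' >"$b"
paste_buffered "$a" "$b"
rm -f "$a" "$b"
```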
The other solution is to use some more advanced redirection:
#!/bin/bash
# Open both files once, on descriptors 7 and 8, for the whole loop.
while read f1 <&7
do
    read f2 <&8
    echo "$f1 $f2"
done \
    7<"$1" \
    8<"$2"
In this version, at the end of the loop we specify multiple input redirections using the general form of bash's input redirection: [n]<word. If no leading [n] is specified, the default is 0, which is the normal stdin redirection. By putting a small integer in front of a redirection, however, we can open several input files for a single command. In this case the command is the while loop:
...
done \
    7<"$1" \
    8<"$2"
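This causes the while loop to execute with file descriptor 7 open for reading on the first input file and file descriptor 8 open for reading on the second input file. Normally, you should use a number larger than 2, as 0-2 are used for stdin, stdout, and stderr. The bash man page also advises care with descriptors greater than 9, since they may conflict with descriptors the shell uses internally.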
To make the read commands work we need another form of bash's redirection: its ability to duplicate a file descriptor (like the C library function dup2()). File descriptor duplication allows two file descriptors to refer to the same open file. Since read normally reads from stdin and not from file descriptor 7 or 8, we need a way to duplicate file descriptor 7 (or 8) onto stdin. Bash's file descriptor duplication, written <&7 (or <&8), does just that:
while read f1 <&7
...
read f2 <&8
...
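To see the duplication at work, here is a small sketch (the name two_lines and the choice of descriptor 3 are arbitrary). Because descriptor 3 is opened only once, for the whole brace group, each read that duplicates it onto stdin continues where the previous one stopped:

```shell
#!/bin/bash
# Descriptor 3 is opened once; both reads share its file position.
two_lines() {
    local a b
    {
        read -r a <&3   # line 1
        read -r b <&3   # line 2, not line 1 again
    } 3<"$1"
    echo "$a / $b"
}

tmp=$(mktemp)
printf 'f1 1\nf1 2\n' >"$tmp"
two_lines "$tmp"   # prints: f1 1 / f1 2
rm -f "$tmp"
```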
Note that read also includes a -u option for specifying the file descriptor to read from, if you prefer.
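As a sketch of that variant (wrapped in a function named paste_u purely for illustration, and still assuming files of equal length), the loop could be written with -u in place of the <& duplication:

```shell
#!/bin/bash
# Same loop as before, but read is told which descriptor to use.
paste_u() {
    local f1 f2
    while read -r -u 7 f1
    do
        read -r -u 8 f2
        echo "$f1 $f2"
    done 7<"$1" 8<"$2"
}

a=$(mktemp); b=$(mktemp)
printf 'f1 1\nf1 2\n' >"$a"
printf 'f2 1\nf2 2\n' >"$b"
paste_u "$a" "$b"
rm -f "$a" "$b"
```

The -u option is specific to bash's read, so this form will not work in a plain POSIX sh.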
Bash contains similar forms of redirection for output files as well. See the bash man page for more information.
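For instance, the output forms [n]>word and >&n can be combined in the same way. The sketch below does roughly the reverse of paste: it sends alternate input lines to two output files opened on descriptors 3 and 4. The function name split_alternate and the descriptor numbers are illustrative choices, not anything from the man page.

```shell
#!/bin/bash
# Write odd-numbered input lines to one file, even-numbered to another.
split_alternate() {
    local line n=0
    while read -r line
    do
        if (( n++ % 2 == 0 )); then
            echo "$line" >&3   # lines 1, 3, 5, ...
        else
            echo "$line" >&4   # lines 2, 4, 6, ...
        fi
    done <"$1" 3>"$2" 4>"$3"
}

dir=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\n' >"$dir/in"
split_alternate "$dir/in" "$dir/odd" "$dir/even"
cat "$dir/odd"    # one, three
cat "$dir/even"   # two, four
rm -rf "$dir"
```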
__________________________
Mitch Frazier is an Associate Editor at Linux Journal.
Call me ignorant but...
On October 16th, 2008 Anonymous (not verified) says:
Why not just use:
join file1 file2.... fileN
?
Quoting from the article....
On October 16th, 2008 Boscorama (not verified) says:
"For the sake of this example, let's say that we want to implement a simple version of the paste command as a bash script."
'nuff said. :-)
exec is your friend
On October 15th, 2008 Boscorama (not verified) says:
Nice article. People don't use input redirect nearly as often
as they should.
You can also do it using exec if you wish the newly opened
descriptors to last for the life of the shell script. Also,
the 'while' condition below will correctly handle a 'short'
second file.
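A minimal sketch of that approach follows. It is not necessarily the code originally posted; it reuses descriptors 7 and 8 from the article, and wraps the logic in a function called paste_exec only to make it easy to try out:

```shell
#!/bin/bash
# Open the descriptors with exec so they stay open until closed explicitly.
paste_exec() {
    local f1 f2
    exec 7<"$1" 8<"$2"
    # The loop ends as soon as either read fails, so a short
    # second file simply stops the output early.
    while read -r f1 <&7 && read -r f2 <&8
    do
        echo "$f1 $f2"
    done
    exec 7<&- 8<&-   # close both descriptors again
}

a=$(mktemp); b=$(mktemp)
printf 'f1 1\nf1 2\nf1 3\n' >"$a"
printf 'f2 1\nf2 2\n' >"$b"
paste_exec "$a" "$b"   # prints two lines; the third f1 line has no partner
rm -f "$a" "$b"
```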
The exec command can also be used to 'dup' descriptors or swap
them. Say you want to preserve the current stdout but use another
destination for regular output in sub-commands without all that
pesky redirection:
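A minimal sketch of that idea, assuming descriptor 3 is free (the function name log_demo and the messages are made up for illustration):

```shell
#!/bin/bash
# Save stdout, point it at a file, then restore it.
log_demo() {
    exec 3>&1        # keep the current stdout on descriptor 3
    exec >"$1"       # from here on, plain output goes to the file
    echo "to the file"
    echo "to the original stdout" >&3
    exec >&3 3>&-    # restore stdout and close the spare descriptor
}

tmp=$(mktemp)
log_demo "$tmp"   # prints: to the original stdout
cat "$tmp"        # prints: to the file
rm -f "$tmp"
```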
The same can be done with stdin & stderr (or any other open
file descriptor). :-)
Have fun.
while read && read
On October 17th, 2008 Anonymous (not verified) says:
+10
"while read && read" is a good tip, since many forget that "while" (and "if") can be followed by basically anything bash accepts as a command. (Many get fooled by the syntactic sugar of the '[' command.)
For many years I wondered what
On October 15th, 2008 cbm (not verified) says:
For many years I wondered what the <& metacharacter was used for!
Now I see that I could have used it many times instead of awk to merge files.
Very interesting and clear explanation.
Interesting ... but easy alternatives?
On October 15th, 2008 asturbcn (not verified) says:
Always interesting, but maybe it would be easier to use another tool like 'gawk'?
In the middle example, the
On October 14th, 2008 Russell2 (not verified) says:
In the middle example, the syntax "while [[ "${f1[$i]}" ]]" will break out of the loop if there is a blank line in file1.
Useful things to know. :-) Nice post. TU.
True
On October 15th, 2008 Mitch Frazier says:
It will, but again this is just an example, not an attempt to be robust.