Bash: Redirecting Input from Multiple Files
Recently I needed to create a script that processed two input files. By processed I mean that the script needed to get a line from one file, then get a line from the second file, and then do something with them. Sounds easy enough, but it's not that easy unless you know about some of bash's extended redirection capabilities.
For the sake of this example, let's say that we want to implement a simple version of the paste command as a bash script. The paste command reads a line from each of its input files and then pastes them together and writes the combined result to stdout as a single line. Our example version will only do this for two input files. Plus it won't do any error checking and it will assume that the files contain the same number of lines.
Our input files, file1 and file2, are:
$ cat file1
f1 1
f1 2
f1 3
f1 4
$ cat file2
f2 1
f2 2
f2 3
f2 4
Your first thought might be something like this:
#!/bin/bash
while read f1 <"$1"
do
    read f2 <"$2"
    echo "$f1 $f2"
done
$ bash paste-bad.sh file1 file2
f1 1 f2 1
f1 1 f2 1
f1 1 f2 1
f1 1 f2 1
f1 1 f2 1
f1 1 f2 1
f1 1 f2 1
...
Ctrl-C
The output never advances past the first line. That's because each redirection starts anew every time its command runs: it reopens the file, read gets the first line again, and you end up in an endless loop.
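A quick way to see the reopening for yourself, as a minimal sketch: two separate read commands, each with its own redirection, both return the first line. The scratch file and the variable names a and b are made up for this illustration.

```bash
#!/bin/bash
# Hypothetical scratch file, created only for this demonstration.
tmp=$(mktemp)
printf 'line 1\nline 2\n' >"$tmp"

# Each redirection opens the file again, so each read starts at the top.
read -r a <"$tmp"
read -r b <"$tmp"
echo "a=$a b=$b"

rm -f "$tmp"
```

Both variables end up holding "line 1", which is exactly what happens on every pass through the loop above.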
Your next thought might be to read the files in one by one and then take the buffered data and paste it together afterwards:
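#!/bin/bash
# Read the first file into array f1, one line per element.
i=0
while read line
do
    f1[$i]="$line"
    let i++
done <"$1"
# Read the second file into array f2 the same way.
i=0
while read line
do
    f2[$i]="$line"
    let i++
done <"$2"
# Paste the buffered lines together.
# The test stops at the first empty element, so a blank line in the first file ends the output early.
i=0
while [[ "${f1[$i]}" ]]
do
    echo "${f1[$i]} ${f2[$i]}"
    let i++
done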
$ bash paste-ok.sh file1 file2
f1 1 f2 1
f1 2 f2 2
f1 3 f2 3
f1 4 f2 4
But if you're trying to do something more complicated than pasting lines together, that approach might not be feasible, and in any case it's cumbersome.
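If you do want to buffer the files, bash 4 and later can fill an array in one step with the mapfile builtin. What follows is only a sketch under that version assumption, and paste_buffered is a made-up name, not part of the scripts in this article.

```bash
#!/bin/bash
# Sketch of the buffering approach using mapfile (needs bash 4 or later).
# paste_buffered is a hypothetical name chosen for this example.
paste_buffered() {
    local -a f1 f2
    local i
    mapfile -t f1 <"$1"    # one array element per line, trailing newline removed
    mapfile -t f2 <"$2"
    # Loop over the array length so a blank line does not end the output early.
    for (( i = 0; i < ${#f1[@]}; i++ )); do
        printf '%s %s\n' "${f1[i]}" "${f2[i]}"
    done
}
```

Calling it as paste_buffered file1 file2 should give the same four lines as the script above. It still holds both files in memory, so the cumbersome part remains for anything more involved.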
The other solution is to use some more advanced redirection:
#!/bin/bash
while read f1 <&7
do
    read f2 <&8
    echo "$f1 $f2"
done \
    7<"$1" \
    8<"$2"
In this version, at the end of the loop we specify multiple input redirections using the general form of bash's input redirection: [n]<word. If no leading [n] is specified, the default is 0, which is the normal stdin redirection. By putting a small integer in front of a redirection, however, we can redirect multiple input files to a single command, which in this case is the while loop:
...
done \
    7<"$1" \
    8<"$2"
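This causes the while loop to execute with file descriptor 7 open for reading on the first input file and file descriptor 8 open for reading on the second input file. Normally, you should use a number larger than 2, as 0-2 are used for stdin, stdout, and stderr. The bash manual also advises care with descriptors greater than 9, since they may conflict with descriptors the shell uses internally.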
To make the read commands work we need to use another form of bash's redirection: its ability to duplicate a file descriptor (like the C library function dup2()). File descriptor duplication allows two file descriptors to refer to the same open file. Since read normally reads from stdin and not from file descriptor 7 or 8, we need a way to duplicate file descriptor 7 (or 8) onto stdin. Bash's file descriptor duplication does just that:
while read f1 <&7
...
read f2 <&8
...
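Because the duplicate refers to the same open file, successive reads continue where the previous one stopped instead of starting over. A minimal sketch of that behavior, using a made-up scratch file and a brace group in place of the while loop:

```bash
#!/bin/bash
# Hypothetical scratch file, created only for this demonstration.
tmp=$(mktemp)
printf 'line 1\nline 2\n' >"$tmp"

# The braces group both reads into one command; fd 7 is opened once for the group.
{
    read -r a <&7    # stdin is a duplicate of fd 7 for this read only
    read -r b <&7    # same open file, so reading continues at the second line
} 7<"$tmp"
echo "a=$a b=$b"

rm -f "$tmp"
```

Here a gets "line 1" and b gets "line 2", because the file is opened only once and both reads share its position.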
Note that read also includes a -u option for specifying the file descriptor to read from, if you prefer.
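For comparison, here is roughly how the loop might look with -u instead of descriptor duplication. This is a sketch wrapped in a function with the made-up name paste_u; it also adds -r so that read leaves backslashes alone, which the scripts above do not do.

```bash
#!/bin/bash
# paste_u is a hypothetical name for this variant of the script.
paste_u() {
    local f1 f2
    while read -r -u 7 f1        # read a line directly from fd 7
    do
        read -r -u 8 f2          # read the matching line from fd 8
        printf '%s %s\n' "$f1" "$f2"
    done 7<"$1" 8<"$2"
}
```

Calling paste_u file1 file2 should print the same four pasted lines as the earlier versions.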
Bash contains similar forms of redirection for output files as well. See the bash man page for more information.
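As one illustration of the output forms, the sketch below does the reverse of paste: it sends odd-numbered input lines to one file and even-numbered lines to another, using [n]>word to open the files and >&n to write to them. The function name split_alternate and its argument order are assumptions made for this example.

```bash
#!/bin/bash
# split_alternate is a hypothetical helper: odd-numbered lines of the input
# file go to the first output file, even-numbered lines to the second.
# Usage: split_alternate INPUT ODD_OUTPUT EVEN_OUTPUT
split_alternate() {
    local line n=0
    while read -r line
    do
        n=$(( n + 1 ))
        if (( n % 2 )); then
            printf '%s\n' "$line" >&3    # stdout is a duplicate of fd 3 here
        else
            printf '%s\n' "$line" >&4    # stdout is a duplicate of fd 4 here
        fi
    done <"$1" 3>"$2" 4>"$3"
}
```

Both output files are opened once for the whole loop, so each write appends to what the loop has already written rather than truncating the file again.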
Mitch Frazier is an Associate Editor for Linux Journal.
Comments
Helpful
People are getting held up by the example used. Forget the example, that wasn't the point of the exercise. I googled "multiple text input files bash" for an entirely different purpose (Renaming a directory full of 1000+ files with new names with several modifications in text files).
This sample was perfect for my needs. I simply modified the example to what I was doing. Thanks to the author!
Call me ignorant but...
Why not just use:
join file1 file2.... fileN
?
Quoting from the article....
For the sake of this example, let's say that we want to implement a simple version of the paste command as a bash script.
'nuff said. :-)
exec is your friend
Nice article. People don't use input redirect nearly as often
as they should.
You can also do it using exec if you wish the newly opened
descriptors to last for the life of the shell script. Also,
the 'while' condition below will correctly handle a 'short'
second file.
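A minimal sketch of what this comment describes, not the commenter's original code. The function name paste_exec is made up, and descriptors 7 and 8 are borrowed from the article.

```bash
#!/bin/bash
# paste_exec is a hypothetical name. exec with only redirections opens the
# descriptors in the current shell, where they stay open until closed.
paste_exec() {
    local f1 f2
    exec 7<"$1" 8<"$2"
    # The loop ends as soon as either file runs out of lines.
    while read -r f1 <&7 && read -r f2 <&8
    do
        printf '%s %s\n' "$f1" "$f2"
    done
    exec 7<&- 8<&-    # close both descriptors again
}
```

With a three-line first file and a two-line second file, this should print two pasted lines and then stop.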
The exec command can also be used to 'dup' descriptors or swap
them. Say you want to preserve the current stdout but use another
destination for regular output in sub-commands without all that
pesky redirection:
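A rough sketch of that idea, again not the commenter's original code. The log file and the choice of descriptor 3 are assumptions made for this example.

```bash
#!/bin/bash
log=$(mktemp)       # hypothetical destination for the regular output

exec 3>&1           # save the current stdout as fd 3
exec 1>"$log"       # from here on, plain stdout goes to the log file
echo "this line lands in the log"
echo "this line reaches the original stdout" >&3
exec 1>&3 3>&-      # restore stdout, then close the spare descriptor

logged=$(cat "$log")
echo "log contains: $logged"
rm -f "$log"
```

Commands run between the two exec lines need no redirection of their own; only output meant for the original stdout has to name fd 3.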
The same can be done with stdin & stderr (or any other open
file descriptor). :-)
Have fun.
while read && read
+10
"while read && read" is a good tip, since many forget that "while" (and "if") can be followed by basically anything bash accepts as a command. (Many get fooled by the syntactic sugar of the '[' command.)
For many years I wondered what
For many years I wondered what the <& metacharacter was used for!
Now I see that I could have used it many times instead of awk to merge files.
Very interesting and clear explanation
Interesting ... but easy alternatives?
Always interesting, but maybe it would be easier to use another tool like 'gawk'?
In the middle example, the
In the middle example, the syntax "while [[ "${f1[$i]}" ]]" will break out of the loop if there is a blank line in file1.
Useful things to know. :-) Nice post. TU.
True
It will, but again this is just an example, not an attempt to be robust.
Mitch Frazier is an Associate Editor for Linux Journal.