Reading Multiple Files with Bash


Reading files is no big deal with bash: you redirect the script's input from a file, pipe the output of another command into the script, or, if the file names are predetermined, do the redirection inside the script. You can also use process substitution to pass in open files (really command pipelines) from the command line. Another option, the one I describe here, is to just open the files and read (or write) them as you like, as you would in other programming languages.

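The mechanism used here takes advantage of bash's ability to redirect input (or output) using a specific file descriptor with the following syntax:

n<file          # open file for reading on descriptor n
n>file          # open file for writing on descriptor n
n>>file         # open file for appending on descriptor n
n<>file         # open file for reading and writing on descriptor n
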
The "n" here is a small integer that specifies the file descriptor to use to open the named file. If no "n" is specified then the following defaults apply:

<file           # same as 0<file
>file           # same as 1>file
>>file          # same as 1>>file
<>file          # same as 0<>file

This is of course the standard redirection stuff that is used all the time.
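
To make the defaults concrete, here is a minimal sketch showing that the implicit and explicit forms behave the same way. The scratch file and its contents are invented purely for illustration.

```bash
# Hypothetical scratch file, created only for this demonstration.
tmp=$(mktemp)
printf 'hello\n' > "$tmp"

# Implicit descriptor: read takes its standard input (fd 0) from the file.
read -r implicit < "$tmp"

# Explicit descriptor: the same redirection with the 0 spelled out.
read -r explicit 0< "$tmp"

# Explicit output descriptor: 1>> appends through standard output.
echo 'world' 1>> "$tmp"

# Count the lines; the arithmetic expansion strips any padding from wc.
lines=$(($(wc -l < "$tmp")))

echo "implicit=$implicit explicit=$explicit lines=$lines"
rm -f "$tmp"
```

Both read commands see the same first line, and the file ends up with two lines after the append.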

So, given that the "n" is there, it would seem that one could easily open files as needed and process them as needed. How to actually do it is less than obvious, but it turns out to be quite simple:

exec 7<file1
exec 8<file2

This opens file1 on file descriptor 7 for input, and file2 on file descriptor 8. Now we can read them easily with:

read data1 <&7
read data2 <&8

Notice the input redirection to read uses another special form that includes the ampersand (&) to specify that what follows is a file descriptor and not a file name.
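
As a minimal, self-contained sketch of the two steps above, the following creates two throwaway files (their names and contents are assumptions made for the demonstration), opens them on descriptors 7 and 8, and reads from them in alternation. It also shows that each descriptor remembers its own position in its file.

```bash
# Hypothetical input files, created only for this demonstration.
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/file1"
printf 'b1\nb2\n' > "$dir/file2"

# Open both files for reading in the current shell.
exec 7< "$dir/file1"
exec 8< "$dir/file2"

# Each read picks up where the previous read on that descriptor stopped.
read -r first1 <&7
read -r first2 <&8
read -r second1 <&7

echo "$first1 $first2 $second1"   # prints: a1 b1 a2

# Close the descriptors and clean up.
exec 7<&- 8<&-
rm -rf "$dir"
```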

Use file descriptors in the range 3-9. File descriptors below 3 are used for standard input, output, and error; those above 9 may be used by the shell internally.

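Re-using a file descriptor closes the file it referred to before opening the new file, and there is also an explicit syntax for closing a descriptor: n<&- (or n>&-). (08/21/2009: the original version of this article said there was no explicit close syntax. That was incorrect, see the comments below --Mitch)

Alternatively, you can release the files by re-pointing the descriptors at /dev/null. Note that this leaves descriptors 7 and 8 open on /dev/null rather than closing them: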

exec 7</dev/null
exec 8</dev/null
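
The difference between re-pointing a descriptor at /dev/null and closing it outright (using the n<&- syntax described in the comments at the end of this article) can be seen in a small sketch. The scratch file and the status variable names are made up for the illustration.

```bash
# Hypothetical scratch file, created only for this demonstration.
tmp=$(mktemp)
printf 'one\ntwo\n' > "$tmp"

exec 7< "$tmp"
read -r line <&7          # succeeds: line is "one"

# Re-pointing at /dev/null releases the file, but fd 7 is still open,
# so read simply hits end-of-file.
exec 7< /dev/null
if read -r line <&7; then devnull_status=read; else devnull_status=eof; fi

# Explicit close: fd 7 no longer exists, so the redirection itself
# fails (the shell's "Bad file descriptor" message is discarded here).
exec 7<&-
if { read -r line <&7; } 2>/dev/null; then closed_status=read; else closed_status=error; fi

echo "after /dev/null: $devnull_status, after close: $closed_status"
rm -f "$tmp"
```

Either approach releases the original file; the explicit close additionally frees the descriptor number.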

The reason for the exec is so that the redirection takes effect in the current shell and stays in effect. Without exec, the redirection applies only to the (empty) command it is attached to, and the file descriptor is closed again as soon as that command completes. It may surprise you that n<file by itself is not a syntax error, but it's not.
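
A quick way to convince yourself of this is to try the redirection with and without exec and then check whether the descriptor is usable. This is only a sketch; it assumes descriptor 7 was not already open when the script started, and the scratch file is invented for the test.

```bash
# Hypothetical scratch file, created only for this demonstration.
tmp=$(mktemp)
printf 'data\n' > "$tmp"

# Redirection with no command: valid syntax, but fd 7 is gone afterwards.
7< "$tmp"
if { read -r x <&7; } 2>/dev/null; then without_exec=open; else without_exec=closed; fi

# With exec, the redirection persists in the current shell.
exec 7< "$tmp"
if { read -r y <&7; } 2>/dev/null; then with_exec=open; else with_exec=closed; fi

exec 7<&-
rm -f "$tmp"
echo "without exec: $without_exec, with exec: $with_exec"
```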

An example of doing all this follows:


function readfiles()
{
	local FD1=7
	local FD2=8
	local file1=$1
	local file2=$2
	local count1=0
	local count2=0
	local eof1=0
	local eof2=0
	local data1
	local data2

	# Open files.
	# ***** 08/22/2009: See comments below for a way to avoid    *****
	# *****             hardcoding the file descriptors -- Mitch *****
	exec 7<"$file1"
	exec 8<"$file2"

	# Alternate between the two files until both have reached end-of-file.
	while [[ $eof1 -eq 0  ||  $eof2 -eq 0 ]]
	do
		if read data1 <&$FD1; then
			let count1++
			printf "%s, line %d: %s\n" "$file1" $count1 "$data1"
		else
			eof1=1
		fi
		if read data2 <&$FD2; then
			let count2++
			printf "%s, line %d: %s\n" "$file2" $count2 "$data2"
		else
			eof2=1
		fi
	done
}

echo "Reading file1 and file2"
readfiles file1 file2

echo "Reading file3 and file4"
readfiles file3 file4

# vim: tabstop=4: shiftwidth=4: noexpandtab:
# kate: tab-width 4; indent-width 4; replace-tabs false;

The function at the top reads the files; the main code processes two files, then processes two different files. Running the command produces:

$ bash
Reading file1 and file2
file1, line 1: f1 line 1
file2, line 1: f2 line 1
file1, line 2: f1 line 2
file2, line 2: f2 line 2
file1, line 3: f1 line 3
file2, line 3: f2 line 3
file1, line 4: f1 line 4
file2, line 4: f2 line 4
file1, line 5: f1 line 5
file2, line 5: f2 line 5
file1, line 6: f1 line 6
Reading file3 and file4
file3, line 1: f3 line 1
file4, line 1: f4 line 1
file3, line 2: f3 line 2
file4, line 2: f4 line 2
file3, line 3: f3 line 3

A similar process can be used for writing multiple output files using the n>file or n>>file syntax. A possible time saver if you're writing a lot of data to the same file in many different places in your script.
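
As a rough sketch of the output side, the following opens two files once and then writes to them from different places in the script. The file names (errors.log, summary.log) and the messages are hypothetical.

```bash
# Hypothetical output directory, created only for this demonstration.
dir=$(mktemp -d)

# Open two output files once: fd 3 truncates, fd 4 appends.
exec 3> "$dir/errors.log"
exec 4>> "$dir/summary.log"

for n in 1 2 3; do
    echo "item $n processed" >&4
    if [ "$n" -eq 2 ]; then
        echo "item $n had a problem" >&3
    fi
done

# Close both descriptors so the data is released to the files.
exec 3>&- 4>&-

summary_lines=$(($(wc -l < "$dir/summary.log")))
error_text=$(cat "$dir/errors.log")
echo "summary lines: $summary_lines, errors: $error_text"
rm -rf "$dir"
```

Each echo names only a descriptor, so the file names appear in exactly one place in the script.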

readmult.tgz (634 bytes)

Mitch Frazier is an Associate Editor for Linux Journal.


No mention in the manual? Huh?

Anonymous's picture

You are aware that the documentation for Bash is in Texinfo?

And even then, the manual page says:

Duplicating File Descriptors
The redirection operator

[n]<&word

is used to duplicate input file descriptors. If word expands to one or
more digits, the file descriptor denoted by n is made to be a copy of
that file descriptor. If the digits in word do not specify a file
descriptor open for input, a redirection error occurs. If word
evaluates to -, file descriptor n is closed.
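
A small sketch of what that passage describes: duplicating an input descriptor and then closing the original with a word of "-". The scratch file and variable names are assumptions made for the example.

```bash
# Hypothetical scratch file, created only for this demonstration.
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

exec 7< "$tmp"
exec 8<&7        # fd 8 becomes a copy of fd 7

read -r a <&7    # "one"
read -r b <&8    # "two": the copy shares fd 7's position in the file

exec 7<&-        # word is "-", so fd 7 is closed
read -r c <&8    # "three": fd 8 is still open

exec 8<&-
rm -f "$tmp"
echo "$a $b $c"  # prints: one two three
```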


Mitch Frazier's picture

Guess I missed that part. Now that I check the man page a bit closer, I see that it is in there. It doesn't explicitly state that "[n]>&-" closes "n", which is what the comment below referred to, but that does appear to work also.

From the Man Page

Mitch Frazier's picture

As a reference, the man page describes what exec is doing:

exec [-cl] [-a name] [command [arguments]]
[...removed...] If command is not specified, any redirections take effect in the current shell, and the return status is 0. If there is a redirection error, the return status is 1.

exec syntax

augmentedfourth's picture

Could you have written:

exec 7<$file1

as

exec $FD1<$file1


Why would you hard-code the file descriptor value in the exec line when it was already defined in a variable? Does the exec command not like that for some reason?

It Doesn't Like It

Mitch Frazier's picture

Unfortunately that doesn't work. You get:

Reading file1 and file2
line 17: exec: 7: not found

The shell recognizes redirection operators before it expands variables, so $FD1<$file1 is parsed as the word $FD1 followed by a redirection of standard input. After expansion, exec is handed 7 as the command to run, and the message is saying that the command 7 is not found.

Just As I Wrote That...

Mitch Frazier's picture

Just as I hit submit I realized how to make that work:

    # Open files.
    eval exec "$FD1<$file1"
    eval exec "$FD2<$file2"
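
An alternative that avoids both eval and hard-coded numbers, not mentioned in the original discussion, is the {varname} redirection form. This sketch assumes bash 4.1 or later, where the shell picks a free descriptor and stores its number in the named variable; the variable name fd1 and the scratch file are made up for the example.

```bash
#!/bin/bash
# Assumes bash 4.1 or later for the {varname} redirection form.

# Hypothetical scratch file, created only for this demonstration.
tmp=$(mktemp)
printf 'first line\n' > "$tmp"

# bash allocates a free descriptor (10 or higher) and stores it in fd1.
exec {fd1}< "$tmp"

read -r data <&"$fd1"
echo "fd $fd1: $data"

# Close the descriptor again using the same variable form.
exec {fd1}<&-
rm -f "$tmp"
```

Because the shell chooses the number, this also sidesteps the question of which descriptors in the 3-9 range are already in use.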


Anonymous's picture

Comparing the use of file descriptors to standard redirection:

Closing file descriptors

Anonymous's picture

There is an explicit syntax for closing a file descriptor. If you want to close descriptor 7:

exec 7>&-

I recommend avoiding this with descriptors 0-2, since many programs will behave erratically if run with these descriptors closed.


Mitch Frazier's picture

Thanks. That does appear to do what you describe. You can test it with the script:


exec 7>junk
echo JUNK >&7
lsof -p $$ | grep -v mem
exec 7>&-
lsof -p $$ | grep -v mem

Which should produce something like:

sh      27781 mitch  cwd    DIR    9,0    20480 12517486 /home/mitch/tmp
sh      27781 mitch  rtd    DIR    9,0     4096        2 /
sh      27781 mitch  txt    REG    9,0   725048  5316638 /bin/bash
sh      27781 mitch    0u   CHR  136,1      0t0        3 /dev/pts/1
sh      27781 mitch    1u   CHR  136,1      0t0        3 /dev/pts/1
sh      27781 mitch    2u   CHR  136,1      0t0        3 /dev/pts/1
sh      27781 mitch    7w   REG    9,0        5 12517390 /home/mitch/tmp/junk
sh      27781 mitch  255r   REG    9,0       99 12519090 /home/mitch/tmp/

sh      27781 mitch  cwd    DIR    9,0    20480 12517486 /home/mitch/tmp
sh      27781 mitch  rtd    DIR    9,0     4096        2 /
sh      27781 mitch  txt    REG    9,0   725048  5316638 /bin/bash
sh      27781 mitch    0u   CHR  136,1      0t0        3 /dev/pts/1
sh      27781 mitch    1u   CHR  136,1      0t0        3 /dev/pts/1
sh      27781 mitch    2u   CHR  136,1      0t0        3 /dev/pts/1
sh      27781 mitch  255r   REG    9,0       99 12519090 /home/mitch/tmp/

As you can see, in the second output from lsof the file on file descriptor 7 is now closed.

Most interesting about this is that it doesn't appear to be in the man page anywhere. The closest thing I see is:

Similarly, the redirection operator

[n]>&digit-

moves the file descriptor digit to file descriptor n, or the standard output (file descriptor 1) if n is not specified.

So I guess it's a special form of that. If you check back, leave a note as to where you found that documented. Thanks again.

It's also more efficient

Anonymous's picture

Just an additional note... Using file descriptors is also more efficient (reducing processing time by about 5x). This is because the file is not opened/closed implicitly between operations.
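
The comment gives no details of the measurement, so the following is only a guess at the two patterns being compared: reopening a file for every write versus opening it once on a descriptor. It makes no claim about the size of the speedup; wrapping each loop in the shell's time keyword would let you measure it on your own system. The file names and the loop count of 1000 are arbitrary.

```bash
# Hypothetical output directory, created only for this demonstration.
dir=$(mktemp -d)

# Pattern 1: the file is opened and closed again for every write.
i=0
while [ "$i" -lt 1000 ]; do
    echo "line $i" >> "$dir/reopen.txt"
    i=$((i + 1))
done

# Pattern 2: the file is opened once, written many times, closed once.
exec 7> "$dir/once.txt"
i=0
while [ "$i" -lt 1000 ]; do
    echo "line $i" >&7
    i=$((i + 1))
done
exec 7>&-

# Both files end up identical; only the number of open/close calls differs.
if cmp -s "$dir/reopen.txt" "$dir/once.txt"; then same=yes; else same=no; fi
echo "identical output: $same"
rm -rf "$dir"
```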
