Treating Compressed and Uncompressed Data Sources the Same
on December 19, 2008
Occasionally, you need to process a number of files, some of which have been compressed and some of which have not (think log files). Rather than running two variations of a command, one for compressed files and one for uncompressed, wrap the difference in a bash function:
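function data_source ()
{
    local F=$1
    # strip the .gz suffix if it's there
    F=${F%.gz}
    if [[ -f $F ]] ; then
        # an uncompressed copy exists: pass it through unchanged
        cat "$F"
    elif [[ -f $F.gz ]] ; then
        # only the compressed copy exists: decompress it to stdout
        nice gunzip -c "$F.gz"
    fi
}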
which nicely allows:
for file in * ; do data_source "$file" | ... ; done
Whether you're dealing with gzip'd files or uncompressed ones, you no longer have to treat them differently. With a little more effort, bzip2 files could also be detected and handled.
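As a rough sketch of that last idea, the function might be extended along these lines. This is one possible approach rather than a tested drop-in: the .bz2 suffix, the use of bzip2 -dc, and the error message for a missing file are assumptions that go beyond the original function.

```shell
# Sketch: data_source extended to handle bzip2 files as well (assumed variant).
data_source ()
{
    local F=$1
    # strip a single trailing .gz or .bz2 suffix if present
    case $F in
        *.gz)  F=${F%.gz} ;;
        *.bz2) F=${F%.bz2} ;;
    esac
    if [ -f "$F" ] ; then
        # uncompressed copy exists: pass it through unchanged
        cat "$F"
    elif [ -f "$F.gz" ] ; then
        nice gunzip -c "$F.gz"
    elif [ -f "$F.bz2" ] ; then
        # assumes the bzip2 tool is installed
        nice bzip2 -dc "$F.bz2"
    else
        # assumption: report a missing file instead of failing silently
        echo "data_source: no such file: $1" >&2
        return 1
    fi
}
```

As with the gzip-only version, an uncompressed copy takes precedence when both forms of a file exist, and the caller can pass the name with or without its suffix.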
