The Many Paths to a Solution

That's why, if you're going to do anything with sed, it's critical to know its -n flag, which suppresses its desire to output every line it reads. Now here's a working command:


$ sed -n '12,14p' wonderland.txt

Alice was beginning to get very tired of sitting by her
sister on the bank, and of having nothing to do: once
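
If you'd like to see the difference -n makes without a copy of wonderland.txt handy, here's a small sketch. The five-line sample file and the variable names are my own inventions for illustration, not part of the script being built:

```shell
#!/bin/sh
# Build a five-line sample file so the example doesn't depend on wonderland.txt.
sample=$(mktemp)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$sample"

# Without -n, sed echoes every line it reads, and the p command
# prints lines 2 and 3 a second time.
without_n=$(sed '2,3p' "$sample")

# With -n, only the lines explicitly selected by p are printed.
with_n=$(sed -n '2,3p' "$sample")

echo "=== without -n ==="
echo "$without_n"
echo "=== with -n ==="
echo "$with_n"

rm -f "$sample"
```

The first listing shows lines 2 and 3 doubled up among all the others; the second shows just the two requested lines.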

Can you see how to chain these together? It can all be done in a simple for loop (particularly if you ignore error checking for now). But again, there's another small step required: you need to calculate the line numbers that fall a given number of lines before and after the matching line. That's easy math:


before=$(( $match - $context ))
after=$(( $match + $context ))

Here context specifies whether you want 1, 2, 3 or more lines of context above and below the matching line.
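
One piece of the error checking being skipped is worth a quick look: if the match is on line 1, the subtraction produces 0 or a negative number, and sed has no such line to start from. Here's a rough sketch of one way to guard against that. The function name context_range is hypothetical, something I made up for this example:

```shell
#!/bin/sh
# context_range MATCH CONTEXT  (hypothetical helper, not part of wegrep)
# Prints the first and last line numbers of the context window,
# never letting the window start before line 1.
context_range() {
  match=$1
  context=$2
  before=$(( $match - $context ))
  after=$(( $match + $context ))
  # sed addresses start at 1, so clamp anything lower.
  if [ "$before" -lt 1 ] ; then
    before=1
  fi
  echo "$before $after"
}

context_range 13 1   # a match in the middle of the file
context_range 1 2    # a match on the very first line
```

There's no need for a matching check at the other end: if after runs past the last line of the file, sed simply stops printing when the input does.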

Let's give this a whirl:


#!/bin/sh
# wegrep - grep with context and regular expressions
grep=/usr/bin/grep
sed=/usr/bin/sed
if [ $# -ne 2 ] ; then
  echo "Usage: wegrep [pattern] filename" ; exit 1
fi
for match in $($grep -n -E "$1" "$2" | cut -d: -f1)
do
  before=$(( $match - $context ))
  after=$(( $match + $context ))
  $sed -n '${before},${after}p' "$2"
done
exit 0

Except it turns out that there are two critical bugs in the above code, as is immediately apparent when you run your first test:


$ sh wegrep '^Alice' wonderland.txt

wegrep: line 14: 13:Alice -  : syntax error in expression (error token is ":Alice -  ")

Can you see the first bug? Line 14 is the calculation for the variable before.

So what's wrong? The script never initializes context with a value, so it expands to nothing and the mathematical expression is essentially:


13 -

Which is correctly flagged as an error. Easily fixed.
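
A minimal sketch of the fix might look like the following. Setting context to a fixed 1 would do; the ${context:-1} form, which lets a value already in the environment override the default, is my own assumption about what might be handy rather than anything the script requires. The hard-coded match value just stands in for a line number from grep:

```shell
#!/bin/sh
# Default to one line of context unless the environment already supplies a value.
context=${context:-1}

# Stand-in for a line number reported by grep -n.
match=13

before=$(( $match - $context ))
after=$(( $match + $context ))
echo "show lines $before through $after"
```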

The second bug is more subtle, but here's the clue you get when you run the script with context defined as 1 near the top of the script:


$ sh wegrep '^Alice' wonderland.txt
sed: 1: "${before},${after}p": unexpected EOF (pending }'s)
sed: 1: "${before},${after}p": unexpected EOF (pending }'s)

That's definitely odd. It's sed that's complaining, but what's wrong with the line that invokes sed?

Let's have another look at that line:


$sed -n '${before},${after}p' "$2"

Now can you see the error? It's a subtle and common problem in shell scripts: I'm using the wrong quotation marks. Remember, in a shell script, single quotation marks prevent the interpretation of variables. Switch it to double quotation marks, and everything now works great:


$ sh wegrep '^Alice' wonderland.txt

Alice was beginning to get very tired of sitting by her
sister on the bank, and of having nothing to do: once
There was nothing so very remarkable in that; nor did
Alice think it so very much out of the way to hear the
Rabbit say to itself, 'Oh dear! Oh dear! I shall be
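
To see exactly what sed was being handed in each case, it can help to echo the two strings side by side. This is just an illustrative sketch; the variables single and double are names I've picked for the demonstration:

```shell
#!/bin/sh
before=12
after=14

# Single quotes: the shell passes the text through untouched, dollar signs and all.
single='${before},${after}p'

# Double quotes: the shell expands the variables before sed ever sees the string.
double="${before},${after}p"

echo "single-quoted: $single"
echo "double-quoted: $double"
```

The single-quoted version is the literal text that sed choked on earlier, while the double-quoted version is a perfectly ordinary line range.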

Now another problem rears its head: how do you differentiate between blocks that have matched? Easy: add ----- before and after each match by adding a few echo statements to the for loop:


for match in $($grep -n -E "$1" "$2" | cut -d: -f1)
do
  before=$(( $match - $context ))
  after=$(( $match + $context ))
  echo "-----"
  $sed -n "${before},${after}p" "$2"
  echo "-----"
done

This works, but it's a bit clunky as output goes, although it pretty closely matches what modern grep does with the -C flag:


$ sh wegrep '^Alice' wonderland.txt
-----

Alice was beginning to get very tired of sitting by her
sister on the bank, and of having nothing to do: once
-----
-----
There was nothing so very remarkable in that; nor did
Alice think it so very much out of the way to hear the
Rabbit say to itself, 'Oh dear! Oh dear! I shall be
-----
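
For comparison, here's a small sketch of grep's own -C flag at work on a made-up sample file. The flag is supported by GNU and BSD grep, though POSIX doesn't require it, so treat its availability on your system as an assumption:

```shell
#!/bin/sh
# A tiny sample file with two matches far enough apart to form separate blocks.
sample=$(mktemp)
printf 'a\nAlice one\nb\nc\nd\nAlice two\ne\n' > "$sample"

# -C 1 asks for one line of context on either side of each match.
with_context=$(grep -C 1 -E '^Alice' "$sample")
echo "$with_context"

rm -f "$sample"
```

GNU grep, for one, puts a single separator line between blocks that aren't adjacent, rather than wrapping each block in a pair of dashed lines.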

As a purist, I'd much rather have one dashed line between output blocks, one before the first match and one after the last, with no doubling of lines.

That's not hard to do, and there's a second task of adding back line numbers and ideally denoting which line has the match to the regular expression. But I'm out of room, so those tasks will have to wait until another day.
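
In the meantime, here's a rough sketch of one way the single-separator output might be approached: print a dashed line ahead of every block, then one more after the loop finishes. The function name show_blocks, its optional third argument and the found flag are all hypothetical, and this isn't necessarily how the finished script will do it:

```shell
#!/bin/sh
# show_blocks PATTERN FILE [CONTEXT]  (hypothetical helper for illustration)
show_blocks() {
  context=${3:-1}
  found=0
  for match in $(grep -n -E "$1" "$2" | cut -d: -f1)
  do
    before=$(( $match - $context ))
    after=$(( $match + $context ))
    if [ "$before" -lt 1 ] ; then
      before=1
    fi
    echo "-----"                       # one separator ahead of every block
    sed -n "${before},${after}p" "$2"
    found=1
  done
  # Close off the final block, but only if something actually matched.
  if [ "$found" -eq 1 ] ; then
    echo "-----"
  fi
  return 0
}

# Try it on a small made-up sample file.
sample=$(mktemp)
printf 'a\nAlice one\nb\nc\nd\nAlice two\ne\n' > "$sample"
show_blocks '^Alice' "$sample" 1
rm -f "$sample"
```

Note that this sketch still prints overlapping lines twice when two matches sit close together, so there's more to do before it's ready for prime time.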

______________________

Dave Taylor has been hacking shell scripts for over thirty years. Really. He's the author of the popular "Wicked Cool Shell Scripts" and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.