For this purpose, the special metacharacter \ is used to escape other metacharacters. An escaped metacharacter is interpreted literally, not as a component of an expression. Therefore, \[ matches any line with a [ in it, while an unescaped [ is rejected as an incomplete expression:
$ grep '[' CREDITS
produces an error message:
grep: Unmatched [ or [^
$ grep '\[' CREDITS
produces two lines:
E: email@example.com [My uucp-fed Linux box at home]
D: The XFree86[tm] Project
If you need to search for a \ character, escape it just like any other metacharacter: \\
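As a small sketch of both escapes, the commands below build a throwaway file with mktemp and search it. The file contents and variable names are invented for illustration and are not taken from CREDITS.

```shell
# Hypothetical sample file; its three lines are made up for this example.
sample=$(mktemp)
printf '%s\n' 'plain text' 'XFree86[tm] Project' 'C:\DOS\path' > "$sample"

# \[ matches a literal left bracket.
bracket_lines=$(grep '\[' "$sample")
printf '%s\n' "$bracket_lines"      # XFree86[tm] Project

# \\ matches a single literal backslash.
backslash_lines=$(grep '\\' "$sample")
printf '%s\n' "$backslash_lines"    # C:\DOS\path

rm -f "$sample"
```

The single quotes matter here: they keep the shell from interpreting the backslashes before grep sees them.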
As you can see, with just its support of regular expression syntax, grep provides us with some very powerful capabilities. Its command line options add even more power.
Sometimes you are looking for a string, but don't know whether it is upper, lower, or mixed case. For this situation grep offers the -i switch. With this option, case is completely ignored:
$ grep -i lINuS CREDITS
Linus
N: Linus Torvalds
E: Linus.Torvalds@Helsinki.FI
D: Personal information about Linus
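To try this without a kernel source tree, here is a minimal sketch against a made-up three-line file (the contents and the address in it are placeholders of my own):

```shell
# Hypothetical sample file standing in for CREDITS.
sample=$(mktemp)
printf '%s\n' 'N: Linus Torvalds' 'D: linux kernel' 'E: LINUS@example.org' > "$sample"

# Without -i, the oddly capitalized pattern matches nothing.
sensitive=$(grep 'lINuS' "$sample")

# With -i, every capitalization of "linus" matches.
insensitive=$(grep -i 'lINuS' "$sample")
printf '%s\n' "$insensitive"

rm -f "$sample"
```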
The -v option causes grep to print all lines that do not contain the specified regular expression:
$ grep -v '^#' /etc/syslog.conf | grep -v '^$'
prints all the lines from /etc/syslog.conf that are neither commented (starting with #) nor empty (^$). This prints six lines on my system, although my syslog.conf file really has 21 lines.
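The same filtering can probably be done in a single grep by giving -v several patterns with -e; a line is then printed only if it matches none of them. The sketch below compares the two forms on an invented configuration file (the rules in it are placeholders, not a real syslog.conf):

```shell
# Hypothetical config file: two comments, one blank line, two rules.
conf=$(mktemp)
printf '%s\n' '# kernel messages' 'kern.* /var/log/kern.log' '' \
    '# mail' 'mail.* /var/log/mail.log' > "$conf"

# Two greps in a pipeline: drop comments, then drop empty lines.
piped=$(grep -v '^#' "$conf" | grep -v '^$')

# One grep: -v inverts the combined match of both -e patterns.
single=$(grep -v -e '^#' -e '^$' "$conf")
printf '%s\n' "$single"

rm -f "$conf"
```

Both variables end up holding the two rule lines, so the choice between them is mostly a matter of taste.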
If you need to know how many lines match, pass grep the -c option. This will output the number of matching lines (not the number of matches; two matches in one line count as one) without printing the lines that match:
$ grep -c Linux CREDITS
33
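If you do want the number of matches rather than the number of matching lines, one common workaround is to print each match on its own line with -o and count those with wc. Note that -o is an assumption about your grep: it is not required by POSIX, although GNU and BSD grep both provide it. The sample file is invented.

```shell
# Hypothetical sample: "Linux" appears three times on two lines.
sample=$(mktemp)
printf '%s\n' 'Linux on Linux' 'no match here' 'Linux again' > "$sample"

# -c counts matching lines.
line_count=$(grep -c 'Linux' "$sample")

# -o prints every match on a separate line; wc -l counts them.
# tr strips the padding that some wc implementations add.
match_count=$(grep -o 'Linux' "$sample" | wc -l | tr -d ' ')

printf 'lines: %s, matches: %s\n' "$line_count" "$match_count"

rm -f "$sample"
```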
If you are searching for filenames that contain a given string, instead of the actual lines that contain it, use grep's -l switch:
$ grep -l Linux *
CREDITS
README
README.modules
grep also notifies us, for each subdirectory, that it can't search through a directory. This is normal and will happen whenever you use a wildcard that happens to include directory names as well as file names.
The opposite of -l is -L. This option will cause grep to return the names of files that do not contain the specified pattern.
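Here is a small sketch of -l and -L side by side, run in a scratch directory created with mktemp -d. The file names and contents are made up. Redirecting standard error to /dev/null is simply one way of hiding the directory notice; it also hides any other error message, so treat it as a convenience rather than a recommendation.

```shell
# Hypothetical scratch directory with two files and one subdirectory.
workdir=$(mktemp -d)
mkdir "$workdir/subdir"
printf 'Linux rocks\n' > "$workdir/README"
printf 'nothing here\n' > "$workdir/NOTES"

# -l: names of files that contain the pattern. The wildcard also
# expands to subdir, so the resulting complaint is discarded.
with_match=$(cd "$workdir" && grep -l Linux * 2>/dev/null)

# -L: names of files that do NOT contain the pattern.
without_match=$(cd "$workdir" && grep -L Linux README NOTES)

printf 'contains: %s\nlacks: %s\n' "$with_match" "$without_match"

rm -rf "$workdir"
```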
If you are searching for a word and want to suppress matches that are partial words, use the -w option. Without the -w option,
$ grep -c a README
tells us that it matched 146 lines, but
$ grep -wc a README
returns only 35 since we matched only the word a, not every line with the character a.
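The difference is easier to see on a file small enough to count by hand. This sketch uses four invented lines, three of which contain the letter a but only two of which contain the word a:

```shell
# Hypothetical sample file.
sample=$(mktemp)
printf '%s\n' 'a penguin' 'another line' 'the end' 'plan a release' > "$sample"

# Every line with the character a: lines 1, 2 and 4.
char_count=$(grep -c a "$sample")

# Only lines where a stands alone as a word: lines 1 and 4.
word_count=$(grep -wc a "$sample")

printf 'character: %s, word: %s\n' "$char_count" "$word_count"

rm -f "$sample"
```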
Two more useful options:
$ grep -b Linus CREDITS
301: Linus
17446:N: Linus Torvalds
17464:E: Linus.Torvalds@Helsinki.FI
20561:D: Personal information about Linus
$ grep -n Linus CREDITS
7: Linus
793:N: Linus Torvalds
794:E: Linus.Torvalds@Helsinki.FI
924:D: Personal information about Linus
The -b option causes grep to print a byte offset before each line of output (with GNU grep, how many bytes into the file the matching line begins). The -n switch gives the line number.
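A tiny invented file makes the two numbers easy to check by hand. In this sketch the first line, "first line", is 10 characters plus a newline, so the second line starts at byte 11 and is line number 2:

```shell
# Hypothetical three-line sample file.
sample=$(mktemp)
printf '%s\n' 'first line' 'Linus here' 'third' > "$sample"

# -b prefixes the byte offset, counted from 0.
byte_out=$(grep -b Linus "$sample")

# -n prefixes the line number, counted from 1.
line_out=$(grep -n Linus "$sample")

printf '%s\n%s\n' "$byte_out" "$line_out"

rm -f "$sample"
```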
GNU also provides egrep (extended grep). The regular expression syntax supported by GNU egrep adds a few other metacharacters:
? Like *, except that it matches zero or one instances instead of zero or more.
+ Like *, except that it matches one or more instances instead of zero or more.
| Separates regular expressions, ORing them together.
$ egrep -i 'linux|linus' CREDITS
outputs any line that contains linus or linux.
To allow for legibility, parentheses “(” and “)” can be used in conjunction with “|” to separate and group expressions.
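As a sketch of grouping and of ?, the commands below use grep -E, which is the POSIX spelling of egrep and accepts the same extended syntax (recent GNU grep releases warn that the egrep command itself is obsolescent). The sample lines are invented.

```shell
# Hypothetical sample file.
sample=$(mktemp)
printf '%s\n' 'Linux kernel' 'Linus Torvalds' 'Linum seed' \
    'color and colour' 'colr' > "$sample"

# Parentheses limit the alternation to the final letter.
grouped=$(grep -E 'Linu(x|s)' "$sample")

# ? makes the preceding u optional, so both spellings match.
optional=$(grep -E 'colou?r' "$sample")

printf '%s\n%s\n' "$grouped" "$optional"

rm -f "$sample"
```

Without the parentheses, 'Linux|s' would mean "Linux, or any s", which matches far more than intended.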
This covers many of the features provided by grep. If you look at the manual page, which I strongly recommend, you will see that I did leave out some command-line options, such as different ways to format grep's output and a method for searching for strings without employing regular expressions.
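For the last of those, a brief sketch: the -F option (the behavior of the older fgrep command) treats the pattern as a plain string, so metacharacters such as . lose their special meaning. The sample lines are invented.

```shell
# Hypothetical sample file.
sample=$(mktemp)
printf '%s\n' 'a.b' 'axb' > "$sample"

# As a regular expression, . matches any character: both lines match.
regex_count=$(grep -c 'a.b' "$sample")

# With -F the dot is just a dot: only the first line matches.
fixed=$(grep -F 'a.b' "$sample")

printf 'regex lines: %s, fixed match: %s\n' "$regex_count" "$fixed"

rm -f "$sample"
```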
Learning how to use these powerful tools provides Linux users with two very valuable advantages. The first (and most immediate) of these is a time-saving way to process files and output from other commands.
The second is familiarity with regular expressions. Regular expressions are used throughout the Unix world in tools such as find and sed and languages such as awk, perl and Tcl. Learning this syntax prepares you to use some of the most powerful computing tools available.
Eric Goedelbecker is a systems analyst for Reuters America, Inc. He supports clients (mostly financial institutions) who use market data retrieval and manipulation APIs in trading rooms and back office operations. In his spare time (about 15 minutes a week...), he reads about philosophy and hacks around with Linux. He can be reached via e-mail at firstname.lastname@example.org.