Using grep

New Linux users unfamiliar with this standard Unix tool may not realize how useful it is. In this tutorial for the novice user, Eric demonstrates grep techniques.
Matching Metacharacters

To match a metacharacter itself, use the special metacharacter \ to escape it. Escaped metacharacters are interpreted literally, not as components of an expression. Therefore \[ matches any line with a [ in it, whereas an unescaped [ is treated as the start of a bracket expression:

$ grep '[' CREDITS

produces an error message:

grep: Unmatched [ or [^

but

$ grep '\[' CREDITS

produces two lines:

E: hennus@sky.ow.nl [My uucp-fed Linux box at home]
D: The XFree86[tm] Project

If you need to search for a \ character, escape it just like any other metacharacter: \\
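As a small sketch of the escaping rules above, the following builds a throwaway file (the file name and the second and third lines are made up for illustration) and searches it for a literal [ and a literal \. The last command uses -F, which tells grep to treat the pattern as a fixed string rather than a regular expression, so nothing needs escaping.

```shell
# Create a scratch file: one line with a bracket, one with backslashes, one plain.
dir=$(mktemp -d)
printf '%s\n' 'D: The XFree86[tm] Project' 'C:\DOS\RUN' 'plain text' > "$dir/sample.txt"

# Escaped bracket: matches the literal [ character.
grep '\[' "$dir/sample.txt"

# Escaped backslash: matches the literal \ character.
grep '\\' "$dir/sample.txt"

# -F disables regular expression syntax, so [ is just a character.
grep -F '[' "$dir/sample.txt"
```

The single quotes matter here: they stop the shell from interpreting the backslashes before grep sees them.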

Options

As you can see, with just its support of regular expression syntax, grep provides us with some very powerful capabilities. Its command line options add even more power.

Sometimes you are looking for a string, but don't know whether it is upper, lower, or mixed case. For this situation grep offers the -i switch. With this option, case is completely ignored:

$ grep -i lINuS CREDITS
                        Linus
N: Linus Torvalds
E: Linus.Torvalds@Helsinki.FI
D: Personal information about Linus
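
To see the difference -i makes on a file you control, here is a short sketch using a made-up three-line sample file:

```shell
# Scratch file with the name in two different capitalizations.
dir=$(mktemp -d)
printf '%s\n' 'N: Linus Torvalds' 'linus wrote this' 'LINUX kernel' > "$dir/sample.txt"

# Case-sensitive: only the all-lowercase line matches.
grep linus "$dir/sample.txt"

# Case-insensitive: both the "Linus" and "linus" lines match.
grep -i linus "$dir/sample.txt"
```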

The -v option causes grep to print all lines that do not contain the specified regular expression:

$ grep -v '^#' /etc/syslog.conf | grep -v '^$'

prints all the lines from /etc/syslog.conf that are neither commented (starting with #) nor empty (^$). This prints six lines on my system, although my syslog.conf file really has 21 lines.
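
The same filtering can also be done with one grep process, since the -e option may be given more than once and -v then drops any line matching either pattern. The sketch below assumes a small made-up configuration file rather than a real syslog.conf:

```shell
# Scratch config file with comments, a blank line, and two real entries.
dir=$(mktemp -d)
printf '%s\n' '# log mail messages' 'mail.* /var/log/mail' '' '# kernel' 'kern.* /dev/console' > "$dir/sample.conf"

# Two greps in a pipeline: drop comments, then drop empty lines.
grep -v '^#' "$dir/sample.conf" | grep -v '^$'

# One grep: -v applies to the union of both -e patterns.
grep -v -e '^#' -e '^$' "$dir/sample.conf"
```

Both commands print the same two lines, the mail and kern entries.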

If you need to know how many lines match, pass grep the -c option. This will output the number of matching lines (not the number of matches; two matches in one line count as one) without printing the lines that match:

$ grep -c Linux CREDITS
33
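
If you do want the number of individual matches rather than matching lines, one approach is to print each match on its own line with -o and count those lines with wc. This is a sketch that assumes a grep supporting -o (GNU and BSD grep do, but POSIX does not require it); the sample file is made up.

```shell
# Scratch file: the first line contains the word twice.
dir=$(mktemp -d)
printf '%s\n' 'Linux and more Linux' 'no match here' 'Linux again' > "$dir/sample.txt"

# Counts matching lines: prints 2.
grep -c Linux "$dir/sample.txt"

# Counts individual matches: -o emits one line per match, wc -l counts them (3).
grep -o Linux "$dir/sample.txt" | wc -l
```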

If you are searching for filenames that contain a given string, instead of the actual lines that contain it, use grep's -l switch:

$ grep -l Linux *
CREDITS
README
README.modules

grep also notifies us, for each subdirectory, that it can't search through a directory. This is normal and will happen whenever you use a wildcard that happens to include directory names as well as file names.

The opposite of -l is -L. This option will cause grep to return the names of files that do not contain the specified pattern.
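
A quick way to compare the two switches is to run them against the same set of files. The sketch below uses a scratch directory with three small made-up files; it assumes a grep that supports -L (GNU and BSD grep do).

```shell
# Scratch directory: two files mention Linux, one does not.
dir=$(mktemp -d)
printf 'Linux kernel\n' > "$dir/CREDITS"
printf 'Read me first\n' > "$dir/README"
printf 'Linux modules\n' > "$dir/README.modules"

# Files that contain the pattern: CREDITS and README.modules.
( cd "$dir" && grep -l Linux * )

# Files that do not contain the pattern: README.
( cd "$dir" && grep -L Linux * )
```

The parentheses run each command in a subshell, so the cd does not change your current directory.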

If you are searching for a word and want to suppress matches that are only part of a longer word, use the -w option. Without the -w option,

$ grep -c a README

tells us that it matched 146 lines, but

$ grep -wc a README

returns only 35 since we matched only the word a, not every line with the character a.
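
The same effect can be reproduced on a tiny file where the counts are easy to check by eye. The four sample lines here are made up:

```shell
# Every line contains the letter a, but only two contain the word "a".
dir=$(mktemp -d)
printf '%s\n' 'a kernel' 'an operating system' 'what a day' 'banana' > "$dir/sample.txt"

# Any line with the character a: prints 4.
grep -c a "$dir/sample.txt"

# Only lines where a stands alone as a word: prints 2.
grep -wc a "$dir/sample.txt"
```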

Two more useful options:

$ grep -b Linus CREDITS
301:                    Linus
17446:N: Linus Torvalds
17464:E: Linus.Torvalds@Helsinki.FI
20561:D: Personal information about Linus
$ grep -n Linus CREDITS
7:                      Linus
793:N: Linus Torvalds
794:E: Linus.Torvalds@Helsinki.FI
924:D: Personal information about Linus

The -b option causes grep to print a byte offset before each line of output; with GNU grep this is how many bytes the start of the matching line is from the beginning of the file. The -n switch gives the line number.
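
The offsets are easier to follow on a file small enough to count by hand. This sketch assumes GNU grep, where -b reports the 0-based offset of the start of the matching line and combining it with -o reports the offset of the match itself; other implementations may differ. The two-line file is made up.

```shell
# "alpha" plus its newline occupies bytes 0-5, so the second line starts at byte 6.
dir=$(mktemp -d)
printf '%s\n' 'alpha' 'beta Linus' > "$dir/sample.txt"

# Line number of the match: prints 2:beta Linus
grep -n Linus "$dir/sample.txt"

# Byte offset of the matching line: prints 6:beta Linus
grep -b Linus "$dir/sample.txt"

# Byte offset of the match itself: prints 11:Linus
grep -bo Linus "$dir/sample.txt"
```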

Another grep

GNU also provides egrep (extended grep). The regular expression syntax supported by GNU egrep adds a few other metacharacters:

  • ? Like *, except that it matches zero or one instances instead of zero or more.

  • + the preceding character is matched one or more times.

  • | separates regular expressions by ORing them together.

$ egrep -i 'linux|linus' CREDITS

outputs any line that contains linus or linux.

To allow for legibility, parentheses “(” and “)” can be used in conjunction with “|” to separate and group expressions.
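
Here is a short sketch of the three extended metacharacters and of grouping with parentheses, run against a made-up sample file. It uses grep -E, which POSIX specifies as the way to request the same extended syntax that egrep accepts; recent GNU grep releases warn that the egrep command itself is obsolescent.

```shell
# Scratch file with a few near-miss spellings.
dir=$(mktemp -d)
printf '%s\n' 'Linus Torvalds' 'Linux kernel' 'Line printer' 'color' 'colour' 'colouur' > "$dir/sample.txt"

# Grouping with |: matches "Linus" and "Linux" but not "Line".
grep -E -i 'lin(ux|us)' "$dir/sample.txt"

# ? makes the u optional: matches "color" and "colour".
grep -E 'colou?r' "$dir/sample.txt"

# + requires at least one u: matches "colour" and "colouur".
grep -E 'colou+r' "$dir/sample.txt"
```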

More Than Just grep

This covers many of the features provided by grep. If you look at the manual page, which I strongly recommend, you will see that I did leave out some command-line options, such as different ways to format grep's output and a method for searching for strings without employing regular expressions.

Learning how to use these powerful tools provides Linux users with two very valuable advantages. The first (and most immediate) of these is a time-saving way to process files and output from other commands.

The second is familiarity with regular expressions. Regular expressions are used throughout the Unix world in tools such as find and sed and languages such as awk, perl and Tcl. Learning this syntax prepares you to use some of the most powerful computing tools available.

Eric Goedelbecker is a systems analyst for Reuters America, Inc. He supports clients (mostly financial institutions) who use market data retrieval and manipulation APIs in trading rooms and back office operations. In his spare time (about 15 minutes a week...), he reads about philosophy and hacks around with Linux. He can be reached via e-mail at eric@nymt.reuter.com.

______________________

Comments

Some confusion while searching using grep

Vinod Semwal

I have a directory containing files like this:

-rw-r--r-- 1 vs users 1390254 Jun 2 01:00 abcdefg.1006010100.dat
-rw-r--r-- 1 vs users 1388800 Jun 3 01:00 abcdefg.1006020100.dat
-rw-r--r-- 1 vs users 1388555 Jun 4 01:00 abcdefg.1006030100.dat
-rw-r--r-- 1 vs users 1392184 Jun 5 01:00 abcdefg.1006040100.dat
-rw-r--r-- 1 vs users 1391747 Jun 6 01:00 abcdefg.1006050100.dat
-rw-r--r-- 1 vs users 1392099 Jun 7 01:00 abcdefg.1006060100.dat
-rw-r--r-- 1 vs users 1389362 Jun 8 01:00 abcdefg.1006070100.dat
-rw-r--r-- 1 vs users 1392676 Jun 9 01:00 abcdefg.1006080100.dat
-rw-r--r-- 1 vs users 1436696 Jun 10 01:00 abcdefg.1006090100.dat
-rw-r--r-- 1 vs users 1060539 Jun 10 18:39 abcdefg.1006100100.dat

Please check the output when I use the grep command in the following ways:

vs@vodalksmvs2 /var/opt/nokia/smvs/tmp > ls -ltr | grep abc*
vs@vodalksmvs2 /var/opt/nokia/smvs/tmp > ls -ltr | grep "abc*"| tail
-rw-r--r-- 1 vs users 1390254 Jun 2 01:00 abcdefg.1006010100.dat
-rw-r--r-- 1 vs users 1388800 Jun 3 01:00 abcdefg.1006020100.dat
-rw-r--r-- 1 vs users 1388555 Jun 4 01:00 abcdefg.1006030100.dat
-rw-r--r-- 1 vs users 1392184 Jun 5 01:00 abcdefg.1006040100.dat
-rw-r--r-- 1 vs users 1391747 Jun 6 01:00 abcdefg.1006050100.dat
-rw-r--r-- 1 vs users 1392099 Jun 7 01:00 abcdefg.1006060100.dat
-rw-r--r-- 1 vs users 1389362 Jun 8 01:00 abcdefg.1006070100.dat
-rw-r--r-- 1 vs users 1392676 Jun 9 01:00 abcdefg.1006080100.dat
-rw-r--r-- 1 vs users 1436696 Jun 10 01:00 abcdefg.1006090100.dat
-rw-r--r-- 1 vs users 1059029 Jun 10 18:37 abcdefg.1006100100.dat
vs@vodalksmvs2 /var/opt/nokia/smvs/tmp > ls -ltr | grep abc| tail
-rw-r--r-- 1 vs users 1390254 Jun 2 01:00 abcdefg.1006010100.dat
-rw-r--r-- 1 vs users 1388800 Jun 3 01:00 abcdefg.1006020100.dat
-rw-r--r-- 1 vs users 1388555 Jun 4 01:00 abcdefg.1006030100.dat
-rw-r--r-- 1 vs users 1392184 Jun 5 01:00 abcdefg.1006040100.dat
-rw-r--r-- 1 vs users 1391747 Jun 6 01:00 abcdefg.1006050100.dat
-rw-r--r-- 1 vs users 1392099 Jun 7 01:00 abcdefg.1006060100.dat
-rw-r--r-- 1 vs users 1389362 Jun 8 01:00 abcdefg.1006070100.dat
-rw-r--r-- 1 vs users 1392676 Jun 9 01:00 abcdefg.1006080100.dat
-rw-r--r-- 1 vs users 1436696 Jun 10 01:00 abcdefg.1006090100.dat
-rw-r--r-- 1 vs users 1060539 Jun 10 18:39 abcdefg.1006100100.dat
vs@vodalksmvs2 /var/opt/nokia/smvs/tmp > ls -ltr | grep abc.*| tail -2
-rw-r--r-- 1 vs users 1436696 Jun 10 01:00 abcdefg.1006090100.dat
-rw-r--r-- 1 vs users 1060539 Jun 10 18:39 abcdefg.1006100100.dat
vs@vodalksmvs2 /var/opt/nokia/smvs/tmp > ls -ltr | grep *abc| tail -2
vs@vodalksmvs2 /var/opt/nokia/smvs/tmp > ls -ltr | grep *.abc| tail -2
vs@vodalksmvs2 /var/opt/nokia/smvs/tmp >

I want to know how grep runs and why it gives this output for the commands above.

grep (-A|-B|-C)

Anonymous

I use Linux as my development environment. Grep with the -A, -B and -C options provides me with context for what I'm searching. These options print out lines either above or below the target line.

For example I search for a function call and get grep to display the lines above and/or below the said line.
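
A minimal sketch of that use, assuming a grep that supports these options (GNU and BSD grep do; POSIX does not require them). The C fragment and the function name run_job are made up for illustration.

```shell
# Scratch source file with a function call in the middle.
dir=$(mktemp -d)
printf '%s\n' 'int main(void)' '{' '    setup();' '    run_job();' '    cleanup();' '}' > "$dir/sample.c"

# -B 1: the matching line plus one line before it.
grep -B 1 'run_job' "$dir/sample.c"

# -A 1: the matching line plus one line after it.
grep -A 1 'run_job' "$dir/sample.c"

# -C 1: one line of context on each side.
grep -C 1 'run_job' "$dir/sample.c"
```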

grep -p

iisdjp

I come from the AIX world. We have many scripts that use grep -p to get the paragraph containing a search string. I cannot find anything comparable to grep -p in Red Hat Linux. Any ideas?

grep -p

iisdjp

Never mind, I found this code. (Shoulda done more googling first! :-) )

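#!/bin/sh

# usage: pargrep filename pattern
# Prints every paragraph (lines separated from the rest by blank lines)
# in the file that contains the search string.

inFile="$1"
searchString="$2"

# RS="" puts awk in paragraph mode; FS="\n" makes each line one field.
awk '
BEGIN {
FS="\n"
RS=""
}
/'"$searchString"'/ { print }
' "${inFile}"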

non-standard args - could be made less confusing

Anonymous

That's good as far as it goes, but you should reverse $1 and $2. NORMAL grep is:

grep

That pargrep is

grep

At least you should minimize the differences. Also, if you go with the "standard" way, a very small change to that script could be made to search across multiple files, just like grep would.

[sigh] stupid html. Ok, that

Anonymous

[sigh] stupid html.

Ok, that should have read:

...
grep pattern filename

that pargrep is:

grep filename pattern
...
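
A minimal sketch of that suggestion, not the original poster's script: a shell function that takes the pattern first and then any number of files, as grep does. The function and variable names are kept from the thread where possible, and the two sample files are made up. One assumption to be aware of: the pattern is passed with awk's -v option, which processes backslash escapes, so a pattern containing backslashes may need them doubled.

```shell
# pargrep PATTERN FILE...
# Print each blank-line-separated paragraph that matches PATTERN.
pargrep() {
    pattern=$1
    shift
    # RS="" selects paragraph mode; an empty line is printed after each hit
    # so that paragraphs stay separated in the output.
    awk -v pat="$pattern" 'BEGIN { RS = "" } $0 ~ pat { print; print "" }' "$@"
}

# Two scratch files: the second paragraph of a.txt and all of b.txt mention pears.
dir=$(mktemp -d)
printf '%s\n' 'first paragraph' 'mentions apples' '' 'second paragraph' 'mentions pears' > "$dir/a.txt"
printf '%s\n' 'third paragraph' 'mentions pears too' > "$dir/b.txt"

pargrep pears "$dir/a.txt" "$dir/b.txt"
```

Because the remaining arguments are handed to awk as "$@", file names containing spaces are passed through intact.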

Question

Anonymous

Dear Sir,

How can I search my Linux server for a word in the source code of any file located in the vhosts folder? For example:

I need to search for the word "iframe" throughout the "vhosts" folder on my server. This word appears in the source code of the files for my websites.

Can you help me with this, please?
I would be very grateful.
Regards,
Samer

The fastest method is...

Anonymous

The following, replacing "*.php" by your source files' extension:

$ find . -name "*.php" -print0 | xargs -0 grep iframe

Search

Mitch Frazier

There are a number of ways of doing that, try:

  $ find /path/to/vhosts -type f -exec grep --with-filename iframe {} \;

or try:

  $ grep -r iframe /path/to/vhosts/*
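
To try the recursive search safely before pointing it at a real server, here is a sketch against a scratch directory tree. The directory layout and file contents are made up, and the --include option is a GNU grep feature, so treat the second command as an assumption about your grep.

```shell
# Scratch tree standing in for a vhosts folder.
dir=$(mktemp -d)
mkdir -p "$dir/vhosts/site1" "$dir/vhosts/site2"
printf 'page with an iframe in it\n' > "$dir/vhosts/site1/index.php"
printf 'clean page\n' > "$dir/vhosts/site2/index.php"
printf 'iframe notes\n' > "$dir/vhosts/site2/notes.txt"

# Names of all files under vhosts that contain the string.
grep -rl iframe "$dir/vhosts" | sort

# Same search restricted to PHP files (GNU grep's --include).
grep -rl --include='*.php' iframe "$dir/vhosts"
```

Passing the directory itself to -r, rather than a trailing /*, also lets grep descend into hidden files that the shell wildcard would skip.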

Mitch Frazier is an Associate Editor for Linux Journal.
