Book Excerpt: A Practical Guide to Linux Commands, Editors, and Shell Programming
Arithmetic Operators
The gawk arithmetic operators listed in Table 12-4 are from the C programming language.
Table 12-4 Arithmetic operators
| Operator | Meaning |
| ** | Raises the expression preceding the operator to the power of the expression following it |
| * | Multiplies the expression preceding the operator by the expression following it |
| / | Divides the expression preceding the operator by the expression following it |
| % | Takes the remainder after dividing the expression preceding the operator by the expression following it |
| + | Adds the expression preceding the operator to the expression following it |
| - | Subtracts the expression following the operator from the expression preceding it |
| = | Assigns the value of the expression following the operator to the variable preceding it |
| ++ | Increments the variable preceding the operator |
| -- | Decrements the variable preceding the operator |
| += | Adds the expression following the operator to the variable preceding it and assigns the result to the variable preceding the operator |
| -= | Subtracts the expression following the operator from the variable preceding it and assigns the result to the variable preceding the operator |
| *= | Multiplies the variable preceding the operator by the expression following it and assigns the result to the variable preceding the operator |
| /= | Divides the variable preceding the operator by the expression following it and assigns the result to the variable preceding the operator |
| %= | Assigns the remainder, after dividing the variable preceding the operator by the expression following it, to the variable preceding the operator |
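The following sketch exercises several of the operators from Table 12-4 in a BEGIN action. The function name arith_demo and the AWK variable are illustrative assumptions, not part of the book; the AWK line simply selects gawk when it is installed and falls back to awk otherwise.

```shell
# Use gawk when available; otherwise fall back to awk (portability assumption).
AWK=$(command -v gawk || command -v awk)

# arith_demo is a hypothetical helper name used only for this illustration.
arith_demo() {
    "$AWK" 'BEGIN {
        n = 17
        print n % 5    # remainder after dividing 17 by 5
        n += 3         # n is now 20
        n /= 4         # n is now 5
        n++            # n is now 6
        n *= 2         # n is now 12
        n -= 2         # n is now 10
        print n
    }'
}
arith_demo
```

The first line of output is the remainder (2) and the second is the final value of n (10).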
Associative Arrays
The associative array is one of gawk’s most powerful features. These arrays use strings as indexes. Using an associative array, you can mimic a traditional array by using numeric strings as indexes. In Perl, an associative array is called a hash.
You assign a value to an element of an associative array using the following syntax:
array[string] = value
where array is the name of the array, string is the index of the element of the array you are assigning a value to, and value is the value you are assigning to that element.
Using the following syntax, you can use a for structure with an associative array:
for (elem in array) action
where elem is a variable that takes on the index of each element of the array as the for structure loops through them, array is the name of the array, and action is the action that gawk takes for each element in the array. You can use the elem variable in this action; array[elem] gives the value stored at that index.
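As a minimal sketch of the syntax above, the next program assigns values to elements of two arrays, one indexed by names and one by numbers, and then counts the elements of the first with a for structure. The array names price and year and the function name array_demo are made up for illustration; the AWK line falls back to awk if gawk is not installed.

```shell
# Use gawk when available; otherwise fall back to awk (portability assumption).
AWK=$(command -v gawk || command -v awk)

# array_demo, price, and year are hypothetical names for this illustration.
array_demo() {
    "$AWK" 'BEGIN {
        price["fury"] = 2500      # string index
        price["malibu"] = 3000
        year[1] = 1970            # numeric index, stored as the string "1"
        year[2] = 1999
        print price["malibu"], year["2"]
        n = 0
        for (model in price) n++  # model takes on each index in turn
        print n
    }'
}
array_demo
```

Because indexes are strings, year[2] and year["2"] refer to the same element, which is how an associative array can stand in for a traditional one.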
printf
You can use the printf command in place of print to control the format of the output gawk generates. The gawk version of printf is similar to that found in the C language. A printf command has the following syntax:
printf "control-string", arg1, arg2, ..., argn
The control-string determines how printf formats arg1, arg2, ..., argn. These arguments can be variables or other expressions. Within the control-string you can use \n to indicate a NEWLINE and \t to indicate a TAB. The control-string contains conversion specifications, one for each argument. A conversion specification has the following syntax:
%[-][x[.y]]conv
where - causes printf to left-justify the argument, x is the minimum field width, and .y is the number of places to the right of a decimal point in a number. The conv indicates the type of numeric conversion and can be selected from the letters in Table 12-5.
Table 12-5 Numeric conversion
| conv | Type of conversion |
| d | Decimal |
| e | Exponential notation |
| f | Floating-point number |
| g | Use f or e, whichever is shorter |
| o | Unsigned octal |
| s | String of characters |
| x | Unsigned hexadecimal |
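The sketch below combines several conversion specifications in one control-string: a left-justified string, a floating-point number with two decimal places, a right-justified decimal, and a hexadecimal value. The function name fmt_demo and the sample arguments are assumptions chosen for illustration; the AWK line falls back to awk if gawk is not installed.

```shell
# Use gawk when available; otherwise fall back to awk (portability assumption).
AWK=$(command -v gawk || command -v awk)

# fmt_demo is a hypothetical helper name; the arguments are sample values.
fmt_demo() {
    "$AWK" 'BEGIN {
        # %-8s  string, left-justified in a field 8 characters wide
        # %6.2f floating-point, width 6, two places after the decimal point
        # %5d   decimal, right-justified in a field 5 characters wide
        # %x    unsigned hexadecimal
        printf "%-8s|%6.2f|%5d|%x\n", "ford", 3.14159, 42, 255
    }'
}
fmt_demo
```

The vertical bars in the control-string are ordinary characters; they are there only to make the field widths visible in the output.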
Control Structures
Control (flow) statements alter the order of execution of commands within a gawk program. This section details the if...else, while, and for control structures. In addition, the break and continue statements work in conjunction with the control structures to alter the order of execution of commands. See page 398 for more information on control structures. You do not need to use braces around commands when you specify a single, simple command.
if...else
The if...else control structure tests the status returned by the condition and transfers control based on this status. The syntax of an if...else structure is shown below. The else part is optional.
if (condition)
{commands}
[else
{commands}]
The simple if statement shown here does not use braces:
if ($5 <= 5000) print $0
Next is a gawk program that uses a simple if...else structure. Again, there are no braces.
$ cat if1
BEGIN {
    nam="sam"
    if (nam == "max")
        print "nam is max"
    else
        print "nam is not max, it is", nam
}
$ gawk -f if1
nam is not max, it is sam
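When a branch holds more than one command, braces are required. The following sketch shows an if...else structure whose first branch groups two commands in braces while the else branch holds a single command without them. The two-column sample input, the cheap variable, and the function name price_class are assumptions made for this illustration; the AWK line falls back to awk if gawk is not installed.

```shell
# Use gawk when available; otherwise fall back to awk (portability assumption).
AWK=$(command -v gawk || command -v awk)

# price_class is a hypothetical helper; the input is made-up "model price" data.
price_class() {
    printf 'fury 2500\nmustang 10000\n' |
    "$AWK" '{
        if ($2 <= 5000) {       # braces group the two commands in this branch
            cheap++
            print $1, "is at or under 5000"
        } else                  # a single command needs no braces
            print $1, "is over 5000"
    }
    END { print cheap, "at or under 5000" }'
}
price_class
```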
while
The while structure loops through and executes the commands as long as the condition is true. The syntax of a while structure is
while (condition)
{commands}
The next gawk program uses a simple while structure to display powers of 2. This example uses braces because the while loop contains more than one statement. This program does not accept input; all processing takes place when gawk executes the statements associated with the BEGIN pattern.
$ cat while1
BEGIN {
    n = 1
    while (n <= 5)
        {
        print "2^" n, 2**n
        n++
        }
}
$ gawk -f while1
2^1 2
2^2 4
2^3 8
2^4 16
2^5 32
for
The syntax of a for control structure is
for (init; condition; increment)
{commands}
A for structure starts by executing the init statement, which usually sets a counter to 0 or 1. It then loops through the commands as long as the condition remains true. After each loop it executes the increment statement. The for1 gawk program does the same thing as the preceding while1 program except that it uses a for statement, which makes the program simpler:
$ cat for1
BEGIN {
    for (n=1; n <= 5; n++)
        print "2^" n, 2**n
}
$ gawk -f for1
2^1 2
2^2 4
2^3 8
2^4 16
2^5 32
The gawk utility supports an alternative for syntax for working with associative arrays:
for (var in array)
{commands}
This for structure loops through elements of the associative array named array, assigning the value of the index of each element of array to var each time through the loop. The following line of code demonstrates a for structure:
END {for (name in manuf) print name, manuf[name]}
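The manuf array is not built anywhere in this excerpt, so the following sketch assumes one plausible use: counting how many input lines name each manufacturer. The sample input and the function name manuf_demo are made up for illustration, and the AWK line falls back to awk if gawk is not installed. Because the order in which a for structure visits the elements of an associative array is not specified, the output is piped through sort.

```shell
# Use gawk when available; otherwise fall back to awk (portability assumption).
AWK=$(command -v gawk || command -v awk)

# manuf_demo is a hypothetical helper; the input is made-up "make model" data.
manuf_demo() {
    printf 'chevy malibu\nford mustang\nford taurus\nchevy impala\nford explor\n' |
    "$AWK" '{ manuf[$1]++ }     # assumed meaning: count lines for each make
        END { for (name in manuf) print name, manuf[name] }' |
    sort                        # make the unspecified loop order predictable
}
manuf_demo
```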
break
The break statement transfers control out of a for or while loop, terminating execution of the innermost loop it appears in.
continue
The continue statement transfers control to the end of a for or while loop, causing execution of the innermost loop it appears in to continue with the next iteration.
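A short sketch can show both statements in a single for loop: continue skips the rest of the loop body for even values of n, and break ends the loop the first time an odd value exceeds 7. The function name loop_demo and the limits are arbitrary choices for illustration; the AWK line falls back to awk if gawk is not installed.

```shell
# Use gawk when available; otherwise fall back to awk (portability assumption).
AWK=$(command -v gawk || command -v awk)

# loop_demo is a hypothetical helper name used only for this illustration.
loop_demo() {
    "$AWK" 'BEGIN {
        for (n = 1; n <= 10; n++) {
            if (n % 2 == 0) continue   # even: go straight to the next iteration
            if (n > 7) break           # odd and greater than 7: leave the loop
            print n
        }
    }'
}
loop_demo
```

The program displays 1, 3, 5, and 7, one per line; the loop ends when n reaches 9.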
Examples
cars data file
Many of the examples in this section work with the cars data file. From left to right, the columns in the file contain each car’s make, model, year of manufacture, mileage in thousands of miles, and price. All whitespace in this file is composed of single TABs (the file does not contain any SPACEs).
$ cat cars
plym fury 1970 73 2500
chevy malibu 1999 60 3000
ford mustang 1965 45 10000
volvo s80 1998 102 9850
ford thundbd 2003 15 10500
chevy malibu 2000 50 3500
bmw 325i 1985 115 450
honda accord 2001 30 6000
ford taurus 2004 10 17000
toyota rav4 2002 180 750
chevy impala 1985 85 1550
ford explor 2003 25 9500
Missing pattern
A simple gawk program is
{ print }
This program consists of one program line that is an action. Because the pattern is missing, gawk selects all lines of input. When used without any arguments the print command displays each selected line in its entirety. This program copies the input to standard output.
$ gawk '{ print }' cars
plym fury 1970 73 2500
chevy malibu 1999 60 3000
ford mustang 1965 45 10000
volvo s80 1998 102 9850
...
Missing action
The next program has a pattern but no explicit action. The slashes indicate that chevy is a regular expression.
/chevy/
In this case gawk selects from the input just those lines that contain the string chevy. When you do not specify an action, gawk assumes the action is print. The following example copies to standard output all lines from the input that contain the string chevy:
$ gawk '/chevy/' cars
chevy malibu 1999 60 3000
chevy malibu 2000 50 3500
chevy impala 1985 85 1550
Single quotation marks
Although neither gawk nor shell syntax requires single quotation marks on the command line, it is still a good idea to use them because they can prevent problems. If the gawk program you create on the command line includes SPACEs or characters that are special to the shell, you must quote them. Always enclosing the program in single quotation marks is the easiest way to make sure you have quoted any characters that need to be quoted.
Fields
The next example selects all lines from the file (it has no pattern). The braces enclose the action; you must always use braces to delimit the action so gawk can distinguish it from the pattern. This example displays the third field ($3), a SPACE (the output field separator, indicated by the comma), and the first field ($1) of each selected line:
$ gawk '{print $3, $1}' cars
1970 plym
1999 chevy
1965 ford
1998 volvo
...
The next example, which includes both a pattern and an action, selects all lines that contain the string chevy and displays the third and first fields from the selected lines:
$ gawk '/chevy/ {print $3, $1}' cars
1999 chevy
2000 chevy
1985 chevy
In the following example, gawk selects lines that contain a match for the regular expression h. Because there is no explicit action, gawk displays all the lines it selects.
$ gawk '/h/' cars
chevy malibu 1999 60 3000
ford thundbd 2003 15 10500
chevy malibu 2000 50 3500
honda accord 2001 30 6000
chevy impala 1985 85 1550
~ (matches operator)
The next pattern uses the matches operator (~) to select all lines that contain the letter h in the first field:
$ gawk '$1 ~ /h/' cars
chevy malibu 1999 60 3000
chevy malibu 2000 50 3500
honda accord 2001 30 6000
chevy impala 1985 85 1550
The caret (^) in a regular expression forces a match at the beginning of the line or, in this case, at the beginning of the first field:
$ gawk '$1 ~ /^h/' cars
honda accord 2001 30 6000
Brackets surround a character class definition. In the next example, gawk selects lines that have a second field that begins with t or m and displays the third and second fields, a dollar sign, and the fifth field. Because there is no comma between the "$" and the $5, gawk does not put a SPACE between them in the output.
$ gawk '$2 ~ /^[tm]/ {print $3, $2, "$" $5}' cars
1999 malibu $3000
1965 mustang $10000
2003 thundbd $10500
2000 malibu $3500
2004 taurus $17000
Dollar signs
The next example shows three roles a dollar sign can play in a gawk program. First, a dollar sign followed by a number names a field. Second, within a regular expression a dollar sign forces a match at the end of a line or field (5$). Third, within a string a dollar sign represents itself.
$ gawk '$3 ~ /5$/ {print $3, $1, "$" $5}' cars
1965 ford $10000
1985 bmw $450
1985 chevy $1550
In the next example, the equal-to relational operator (==) causes gawk to perform a numeric comparison between the third field in each line and the number 1985. The gawk command takes the default action, print, on each line where the comparison is true.
$ gawk '$3 == 1985' cars
bmw 325i 1985 115 450
chevy impala 1985 85 1550
The next example finds all cars priced at or less than $3,000.
$ gawk '$5 <= 3000' cars
plym fury 1970 73 2500
chevy malibu 1999 60 3000
bmw 325i 1985 115 450
toyota rav4 2002 180 750
chevy impala 1985 85 1550
Textual comparisons
When you use double quotation marks, gawk performs textual comparisons by using the ASCII (or other local) collating sequence as the basis of the comparison. In the following example, gawk shows that the strings 450 and 750 fall in the range that lies between the strings 2000 and 9000, which is probably not the intended result.
$ gawk '"2000" <= $5 && $5 < "9000"' cars
plym fury 1970 73 2500
chevy malibu 1999 60 3000
chevy malibu 2000 50 3500
bmw 325i 1985 115 450
honda accord 2001 30 6000
toyota rav4 2002 180 750
When you need to perform a numeric comparison, do not use quotation marks. The next example gives the intended result. It is the same as the previous example except it omits the double quotation marks.
$ gawk '2000 <= $5 && $5 < 9000' cars
plym fury 1970 73 2500
chevy malibu 1999 60 3000
chevy malibu 2000 50 3500
honda accord 2001 30 6000
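To see the two kinds of comparison side by side, the following sketch tests each input value against 2000 twice, once as a quoted string and once as a number, and reports both results. The three sample values and the function name compare_demo are assumptions for illustration; the AWK line falls back to awk if gawk is not installed.

```shell
# Use gawk when available; otherwise fall back to awk (portability assumption).
AWK=$(command -v gawk || command -v awk)

# compare_demo is a hypothetical helper; the input values are samples.
compare_demo() {
    printf '450\n2500\n10000\n' |
    "$AWK" '{
        text = ("2000" <= $1) ? "yes" : "no"   # quoted: textual comparison
        num  = (2000 <= $1)   ? "yes" : "no"   # unquoted: numeric comparison
        print $1, text, num
    }'
}
compare_demo
```

The second column of each output line is the textual result and the third is the numeric result. The string 450 sorts after the string 2000 because 4 follows 2, while the string 10000 sorts before it because 1 precedes 2; only the numeric comparison agrees with the arithmetic.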
, (range operator)
The range operator ( , ) selects a group of lines. The first line it selects is the one specified by the pattern before the comma. The last line is the one selected by the pattern after the comma. If no line matches the pattern after the comma, gawk selects every line through the end of the input. The next example selects all lines, starting with the line that contains volvo and ending with the line that contains bmw.
$ gawk '/volvo/ , /bmw/' cars
volvo s80 1998 102 9850
ford thundbd 2003 15 10500
chevy malibu 2000 50 3500
bmw 325i 1985 115 450
After the range operator finds its first group of lines, it begins the process again, looking for a line that matches the pattern before the comma. In the following example, gawk finds three groups of lines that fall between chevy and ford. Although the fifth line of input contains ford, gawk does not select it because at the time it is processing the fifth line, it is searching for chevy.
$ gawk '/chevy/ , /ford/' cars
chevy malibu 1999 60 3000
ford mustang 1965 45 10000
chevy malibu 2000 50 3500
bmw 325i 1985 115 450
honda accord 2001 30 6000
ford taurus 2004 10 17000
chevy impala 1985 85 1550
ford explor 2003 25 9500
When you are writing a longer gawk program, it is convenient to put the program in a file and reference the file on the command line. Use the -f (--file) option followed by the name of the file containing the gawk program.
The following gawk program, which is stored in a file named pr_header, has two actions and uses the BEGIN pattern. The gawk utility performs the action associated with BEGIN before processing any lines of the data file: It displays a header. The second action, {print}, has no pattern part and displays all lines from the input.
$ cat pr_header
BEGIN {print "Make Model Year Miles Price"}
{print}
$ gawk -f pr_header cars
Make Model Year Miles Price
plym fury 1970 73 2500
chevy malibu 1999 60 3000
ford mustang 1965 45 10000
volvo s80 1998 102 9850
...
The next example expands the action associated with the BEGIN pattern. In the previous and the following examples, the whitespace in the headers is composed of single TABs, so the titles line up with the columns of data.
$ cat pr_header2
BEGIN {
print "Make Model Year Miles Price"
print "----------------------------------------"
}
{print}
$ gawk -f pr_header2 cars
Make Model Year Miles Price
----------------------------------------
plym fury 1970 73 2500
chevy malibu 1999 60 3000
ford mustang 1965 45 10000
volvo s80 1998 102 9850
...
length function
When you call the length function without an argument, it returns the number of characters in the current line, including field separators. The $0 variable always contains the value of the current line. In the next example, gawk prepends the line length to each line and then a pipe sends the output from gawk to sort (the -n option specifies a numeric sort). As a result, the lines of the cars file appear in order of line length.
$ gawk '{print length, $0}' cars | sort -n
21 bmw 325i 1985 115 450
22 plym fury 1970 73 2500
23 volvo s80 1998 102 9850
24 ford explor 2003 25 9500
24 toyota rav4 2002 180 750
25 chevy impala 1985 85 1550
25 chevy malibu 1999 60 3000
25 chevy malibu 2000 50 3500
25 ford taurus 2004 10 17000
25 honda accord 2001 30 6000
26 ford mustang 1965 45 10000
26 ford thundbd 2003 15 10500
The formatting of this report depends on TABs for horizontal alignment. The three extra characters at the beginning of each line throw off the format of several lines; a remedy for this situation is covered shortly.
NR (record number)
The NR variable contains the record (line) number of the current line. The following pattern selects all lines that contain more than 24 characters. The action displays the line number of each of the selected lines.
$ gawk 'length > 24 {print NR}' cars
2
3
5
6
8
9
11
You can combine the range operator ( , ) and the NR variable to display a group of lines of a file based on their line numbers. The next example displays lines 2 through 4:
$ gawk 'NR == 2 , NR == 4' cars
chevy malibu 1999 60 3000
ford mustang 1965 45 10000
volvo s80 1998 102 9850
END
The END pattern works in a manner similar to the BEGIN pattern, except gawk takes the actions associated with this pattern after processing the last line of input. The following report displays information only after it has processed all the input. The NR variable retains its value after gawk finishes processing the data file, so an action associated with an END pattern can use it.
$ gawk 'END {print NR, "cars for sale." }' cars
12 cars for sale.
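An END action is also a natural place to report totals that an earlier action accumulates. The following sketch adds up the fifth field (price) of three lines taken from the cars file and uses NR in the END action to compute the average. The function name avg_price and the total variable are illustrative assumptions; the AWK line falls back to awk if gawk is not installed.

```shell
# Use gawk when available; otherwise fall back to awk (portability assumption).
AWK=$(command -v gawk || command -v awk)

# avg_price is a hypothetical helper; the input is three lines from cars.
avg_price() {
    printf 'plym fury 1970 73 2500\nchevy malibu 1999 60 3000\nbmw 325i 1985 115 450\n' |
    "$AWK" '{ total += $5 }     # accumulate the price from every line
        END {
            if (NR > 0)         # avoid dividing by zero on empty input
                printf "%d cars, average price %.2f\n", NR, total / NR
        }'
}
avg_price
```

With these three lines the total is 5950, so the program reports 3 cars and an average price of 1983.33.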