Filters: Doing It Your Way
Next, we'll think about doublespacing a text file. We can do this using sed's substitute command by replacing $ (the regexp for the end of a line) with a newline character (which we have to quote with a backslash):
sed 's/$/\
/' foo
Note that in this example, there isn't a g before the second quote, unlike all the earlier examples. The g is used to tell sed that the substitution applies to all matches on each line, not just the first match on each line, which is the default behaviour. In this case, since each line only has one end, we don't need the g.
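To see what the g flag changes, it helps to try both forms on a line with more than one match. This is just a small sketch; the sample text a-b-c is made up for the illustration.

```shell
# Without g, only the first match on the line is replaced.
printf 'a-b-c\n' | sed 's/-/+/'     # prints a+b-c

# With g, every match on the line is replaced.
printf 'a-b-c\n' | sed 's/-/+/g'    # prints a+b+c
```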
Another way of doublespacing the file in sed would be:
sed G foo
If you look at the man page for sed, it says that G “appends a newline character followed by the contents of the hold space to the pattern space”. The pattern space is the sed term for the line currently being read, and we don't need to worry about the hold space for now (trust me, it will be empty), so this command does exactly what we want.
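A quick way to check this is to feed sed a few lines on standard input instead of a file. The three-line sample here is only an assumption for the sake of the example.

```shell
# Three sample lines, piped in rather than read from foo.
# G appends a newline plus the (empty) hold space to each line.
printf 'one\ntwo\nthree\n' | sed G
```

Each of the three lines comes out followed by an empty line, including the last one.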
It's quite easy to doublespace in awk, using the print statement we saw earlier:
awk '{print $0; print ""}' foo
Here, the pattern is empty again, matching every line, and the action is to print the entire line, $0, then to print nothing, "". Each print statement starts a new line, so the combined effect of the two commands is to doublespace the file.
Awk actions can (and often do) involve more than one command in this way, but it isn't strictly necessary here. Awk provides a formatted print statement that gives more control over the output than the basic print statement. So we could get the same result with:
awk '{printf("%s\n\n",$0)}' foo
The first argument to the printf statement is the format, a description of how the output should appear. The format can contain characters to be printed literally (none in this example), escape sequences (such as \n for a newline), and specifications. A specification is a sequence of characters beginning with a % that controls how the rest of the arguments are printed. For each of the second and subsequent arguments, there must be a specification. In this example, there is one specification, %s, which prints a character string. The value associated with that specification is $0, the entire line. Unlike print, printf doesn't automatically start a new line, so two \n's are needed: one to end the original line and one to insert a blank line.
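A format with two specifications and some literal characters may make the pairing of specifications and arguments clearer. The following is a sketch on made-up input, not something the doublespacing task needs: %d is matched with NR and %s with $0, while the colon and space are printed literally.

```shell
# %d takes the second argument (NR), %s takes the third ($0).
printf 'alpha\nbeta\n' | awk '{printf("%d: %s\n", NR, $0)}'
# prints:
# 1: alpha
# 2: beta
```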
For this seemingly simple example—doublespacing a file—we came up with four different solutions. There is always more than one way of solving a problem, and it normally doesn't matter which one you take. The point is that you usually write an awk or sed program to do a particular task as the need arises, then discard it. You don't necessarily want the “best” solution (whatever that means), you just want something that works, and you want it quickly.
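If you want to convince yourself that the four commands really are interchangeable, a short check along these lines should do it. It is only a sketch: the sample input and the variable names a to d are invented for the illustration, and because $(...) strips trailing newlines, the comparison ignores blank lines at the very end of the output.

```shell
# Made-up sample input for the comparison.
input='one
two
three'

# Capture the output of each of the four doublespacing commands.
a=$(printf '%s\n' "$input" | sed 's/$/\
/')
b=$(printf '%s\n' "$input" | sed G)
c=$(printf '%s\n' "$input" | awk '{print $0; print ""}')
d=$(printf '%s\n' "$input" | awk '{printf("%s\n\n",$0)}')

# Report success only if all four results are the same.
[ "$a" = "$b" ] && [ "$b" = "$c" ] && [ "$c" = "$d" ] && echo "all four agree"
```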
Another quite common task is to select just part of the input. Suppose we want the fifth line of the file foo. In awk, this would be
awk 'NR==5' foo
which prints the line when NR, the number of lines read so far, equals 5. The sed equivalent is
sed -n 5p foo
By default, sed prints every line of input after all commands have been applied. The -n option suppresses this behaviour, so we only get the line we specifically ask for with the p command. In this case, we asked for the fifth line, but we could just as easily have specified a range of lines, say the third to the fifth, with:
sed -n 3,5p foo
or, in awk
awk 'NR>=3 && NR<=5' foo
In the awk version, the && means “and”, so we want the lines where NR>=3 and NR<=5, that is, the third through the fifth lines.
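Both range commands can be tried on numbered sample input. The last command in this sketch goes a little beyond the text above: it uses awk's -v option to pass the line numbers in as variables, and the names first and last are invented for the example.

```shell
# Seven numbered lines as sample input; each command prints 3, 4 and 5.
printf '%s\n' 1 2 3 4 5 6 7 | sed -n 3,5p
printf '%s\n' 1 2 3 4 5 6 7 | awk 'NR>=3 && NR<=5'

# Same selection, with the bounds supplied from outside the program.
printf '%s\n' 1 2 3 4 5 6 7 | awk -v first=3 -v last=5 'NR>=first && NR<=last'
```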
Yet another approach would be to combine head and tail
head -n 5 foo | tail -n 3
which uses the head program to get the first 5 lines of the file, and the tail program to only pass the last three lines through.
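The count given to tail is simply the last line number minus the first, plus one. As a rough sketch, that arithmetic can be wrapped in a small shell function; lines_between is a hypothetical name chosen for this example, not a standard command.

```shell
# Hypothetical helper: print lines $1 through $2 of standard input.
lines_between() {
    head -n "$2" | tail -n "$(( $2 - $1 + 1 ))"
}

# Prints 3, 4 and 5, one per line.
printf '%s\n' 1 2 3 4 5 6 7 | lines_between 3 5
```

One caveat with this approach: if the input has fewer lines than the upper bound, tail still passes through the last three lines it receives, so the result starts earlier than intended. The sed and awk versions do not have that problem.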
Yet another common problem is removing only the first line. Remember how the $ character means the end of the line when it is used in a regular expression? Well, when you use it to specify a line number, it means the last line:
sed -n '2,$p' foo
In awk, you can use != or > to get the same result from either of these commands:
awk 'NR>1' foo
awk 'NR!=1' foo
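In keeping with the idea that there is more than one way to do it, here is a sketch that runs these commands on made-up input alongside two alternatives not covered above: sed's d (delete) command and tail's -n +2 form, which starts output at the second line.

```shell
# Sample input with a header line we want to drop.
# Every command below prints row1 and row2.
printf 'header\nrow1\nrow2\n' | sed -n '2,$p'
printf 'header\nrow1\nrow2\n' | awk 'NR>1'

# Alternative: delete line 1 and let sed print the rest by default.
printf 'header\nrow1\nrow2\n' | sed 1d

# Alternative: tell tail to start at line 2.
printf 'header\nrow1\nrow2\n' | tail -n +2
```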