Text Manipulation with sed

Replace text on the fly, without even starting an editor, using this classic tool.

Three Great Regular Expressions to Know

For those who have never used regular expressions, here are three regular expressions that are very useful when combined with sed:

  1. To match the start of a line, use the ^ character.

  2. To match the end of a line, use the $ character.

  3. To match any number of characters in a regular expression, use the characters .*. The . matches any single character, and the * matches zero or more repetitions of whatever precedes it, so together they match any run of characters (including none at all).
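As a rough illustration of the three constructs above, the following sketch runs each one against a small, made-up two-line sample (the text and the variable names are assumptions for demonstration only, not part of the original article):

```shell
# Hypothetical two-line sample used for all three demonstrations.
sample='alpha beta
gamma delta'

# ^ anchors the match at the start of each line: insert "> " there.
start_out=$(printf '%s\n' "$sample" | sed -e 's/^/> /')

# $ anchors the match at the end of each line: append " <" there.
end_out=$(printf '%s\n' "$sample" | sed -e 's/$/ </')

# .* matches any run of characters: delete from the first space onward.
any_out=$(printf '%s\n' "$sample" | sed -e 's/ .*//')

printf '%s\n' "$start_out" "$end_out" "$any_out"
```

Because ^ and $ match positions rather than characters, substituting on them inserts text without removing anything from the line.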

Practical Examples

Filter out empty lines from a file:


sed -e '/^$/d' your_file.txt
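Note that /^$/ only matches lines with nothing at all between the start and the end. If a file may also contain lines made up solely of spaces or tabs, one possible variation (a sketch using the POSIX [[:space:]] character class on made-up input) is:

```shell
# Hypothetical input: "one", an empty line, a line of three spaces, "two".
strict=$(printf 'one\n\n   \ntwo\n' | sed -e '/^$/d')

# Same input, but also delete lines that contain only whitespace.
loose=$(printf 'one\n\n   \ntwo\n' | sed -e '/^[[:space:]]*$/d')

printf 'strict form keeps %s lines\n' "$(printf '%s\n' "$strict" | wc -l | tr -d ' ')"
printf 'loose form keeps %s lines\n' "$(printf '%s\n' "$loose" | wc -l | tr -d ' ')"
```

The strict form keeps the whitespace-only line, while the loose form drops it along with the truly empty one.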

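Add the computer named mycomputer to the end of every line in /etc/exports. Redirecting sed's output straight back to /etc/exports would not work, because the shell truncates that file before sed reads it, so write to a temporary file (the name exports.new is arbitrary) and then move it into place:


sed -e 's/$/ mycomputer/' /etc/exports > /tmp/exports.new && \
mv /tmp/exports.new /etc/exports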

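Add the computer named comp2 only to the directories beginning with /data/ in /etc/exports, writing to a temporary file first so that /etc/exports is not truncated while sed is still reading it:


sed -e '/^\/data\//s/$/ comp2/' /etc/exports > /tmp/exports.new && \
mv /tmp/exports.new /etc/exports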

See how the forward slashes used in the directory name have to be escaped using backslashes? Without the backslashes, sed interprets the forward slashes in the directory specifier as the delimiters in the sed command itself. However, the backslashes can make the sed command difficult to read and follow.
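One way to reduce the backslash clutter is to pick a different delimiter. The sketch below uses | instead of /, working on a made-up exports-style sample rather than the real /etc/exports (the host names and the /srv/data path are assumptions for illustration):

```shell
# Hypothetical exports-style sample; not read from the real /etc/exports.
exports='/data/projects host1
/home host1
/data/archive host1'

# Escaped-slash address, as in the article.
escaped=$(printf '%s\n' "$exports" | sed -e '/^\/data\//s/$/ comp2/')

# Same edit with | as the address delimiter. The leading backslash tells
# sed that the next character is a custom delimiter for the address.
readable=$(printf '%s\n' "$exports" | sed -e '\|^/data/|s/$/ comp2/')

# The s command accepts a custom delimiter directly, with no leading backslash.
moved=$(printf '%s\n' "$exports" | sed -e 's|^/data/|/srv/data/|')

printf '%s\n' "$readable" "$moved"
```

The first two commands should produce identical output; only the readability of the script changes.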

Remove the first word on each line (including any leading spaces and the trailing space):


cat test3.txt | sed -e 's/^ *[^ ]* //'

More regular expression matching is used in this example. Here's what it is doing.

The initial ^ * is used to match any number of spaces at the beginning of the line. The [^ ]* then matches any number of characters that are not spaces (the ^ inside the brackets negates the match on the space), so it matches a single word. The trailing space at the end matches the space found at the end of the first word. The empty replacement pattern removes the matched text.
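To see the pieces described above at work, here is a small sketch that feeds the command three made-up lines instead of test3.txt (the sample text is an assumption for illustration):

```shell
# Hypothetical sample lines standing in for the contents of test3.txt.
first_removed=$(printf '%s\n' '  hello big world' 'single' 'one two' |
    sed -e 's/^ *[^ ]* //')

printf '%s\n' "$first_removed"
# Expected output:
#   big world
#   single
#   two
```

The line containing only one word is left alone, because the pattern requires a space after the first word and that line has none.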

Remove the last word on each line:


cat test3.txt | sed -e 's/^\(.*\) .*/\1/'

This command introduces the concept of hold buffers, more commonly called capture groups or back-references (they are not the same thing as sed's separate hold space). Hold buffers are used to keep parts of the matched text and to insert that text into the result. The text matched by the pattern between the parentheses is recalled in the substitution pattern by the \1. If an additional set of parentheses were in the match pattern, it would be addressed in the substitution pattern as \2, and so on, for more sets of parentheses. Up to nine hold buffers can be specified. In this example, the pattern contained within the parentheses matches from the start of the line up to the last space (the space after the parentheses).
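As a brief sketch of how \1 and \2 behave, the following runs the last-word removal on made-up input and then uses two sets of parentheses to swap a pair of words (the sample text and variable names are assumptions for illustration):

```shell
# One hold buffer: keep everything before the last space.
last_removed=$(printf '%s\n' 'the quick brown fox' 'solo' |
    sed -e 's/^\(.*\) .*/\1/')

# Two hold buffers: \1 is the first word, \2 the second; emit them swapped.
swapped=$(printf '%s\n' 'hello world' |
    sed -e 's/^\([^ ]*\) \([^ ]*\)$/\2 \1/')

printf '%s\n' "$last_removed" "$swapped"
# Expected output:
#   the quick brown
#   solo
#   world hello
```

Because .* is greedy, the first hold buffer grabs as much as it can, which is why the match stops at the last space rather than the first.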

To remove everything up to and including the leading {, along with the trailing } (or },), from each line:


sed -e 's/^.*{\(.*\)},*/\1/' table.txt

I'll leave it to the reader to dig into this regular expression to see how it operates. Keep this in mind: the more comfortable you are with regular expressions and hold buffers, the more powerful the sed command becomes.
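One way to check your reading of this expression is to run it on a couple of sample lines. The input below is made up and only guesses at what a file like table.txt might contain:

```shell
# Hypothetical lines resembling rows of a C-style initializer table.
braces_removed=$(printf '%s\n' 'row1 = {1, 2, 3},' '{4, 5, 6}' |
    sed -e 's/^.*{\(.*\)},*/\1/')

printf '%s\n' "$braces_removed"
# Expected output:
#   1, 2, 3
#   4, 5, 6
```

Note that ^.*{ also consumes any text that precedes the opening brace, and that ,* lets the same command handle lines with or without a trailing comma.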

Conclusion

sed recognizes many other commands. However, even with these basic commands, you can successfully manipulate text files from within your own shell scripts or right from the command line.

Larry Richardson develops meteorological workstation software for 3SI. He has developed software for UNIX and Windows using C and C++ for more than 13 years. Now living in Georgia with his wife and son, he enjoys playing bass in his spare time.
