Text Manipulation with sed

Replace text on the fly, without even starting an editor, using this classic tool.
Three Great Regular Expressions to Know

For those who have never used regular expressions, here are three regular expressions that are very useful when combined with sed:

  1. To match the start of a line, use the ^ character.

  2. To match the end of a line, use the $ character.

  3. To match any number of characters in a regular expression, use the characters .*. The . matches any single character, and the * matches any number of characters (including none at all).

Practical Examples

Filter out empty lines from a file:


sed -e '/^$/d' your_file.txt

Add the computer named mycomputer to the end of every line in /etc/exports:


cat /etc/exports |  \
sed -e 's/$/ mycomputer/' > /etc/exports

Add the computer named comp2 only to the directories beginning with /data/ in /etc/exports:


cat /etc/exports | \
sed -e '/^\/data\//s/$/ comp2/' > /etc/exports

See how the forward slashes used in the directory name have to be escaped using back slashes? Without the back slashes, sed interprets the forward slashes in the directory specifier as the delimiters in the sed command itself. However, the back slashes can make the sed command difficult to read and follow.

Remove the first word on each line (including any leading spaces and the trailing space):


cat test3.txt | sed -e 's/^ *[^ ]* //'

More regular expression matching is used in this example. Here's what it is doing.

The initial ^ * is used to match any number of spaces at the beginning of the line. The [^ ]* then matches any number of characters that are not spaces (the ^ inside the brace reverses the match on the space), so it matches a single word. The trailing space at the end matches the space found at the end of the first word. The empty replace pattern removes the text.

Remove the last word on each line:


cat test3.txt | sed -e 's/^\(.*\) .*/\1/'

This command introduces the concept of hold buffers. Hold buffers are used to keep parts of the matched text and to insert that text into the result. The pattern that matches the text between the parentheses is recalled in the substitution pattern by the \1. If an additional set of parentheses were in the match pattern, they would be addressed in the substitution pattern as \2, and so on, for more sets of parentheses. Up to nine hold buffers can be specified. In this example, the pattern contained within the parentheses matches from the start of the line up to the last space (the space after the parentheses).

To remove leading { and trailing }, or a } from each line:


sed -e 's/^.*{\(.*\)},*/\1/' table.txt

I'll leave it to the reader to dig in to this regular expression to see how it operates. Keep this in mind—the more comfortable you are with regular expressions and hold buffers, the more powerful the sed command becomes.

Conclusion

sed recognizes many other commands. However, even with these basic commands, you can successfully manipulate text files from within your own shell scripts or right from the command line.

Larry Richardson develops meteorological workstation software for 3SI. He has developed software for UNIX and Windows using C and C++ for more than 13 years. Now living in Georgia with his wife and son, he enjoys playing bass in his spare time.

______________________

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix