Ten Commands Every Linux Developer Should Know

A few simple utilities can make it easier to figure out and maintain other people's code.

3. fuser

The name is a mnemonic for file user, and the command tells you which processes have a given file open. It also can send a signal to all those processes for you. Suppose you want to delete a file but can't because some program has it open and won't close it. Instead of rebooting, type fuser -k myfile. By default, this sends a SIGKILL to every process that has myfile open. To send a gentler SIGTERM instead, type fuser -k -TERM myfile.

Perhaps you need to kill a process that forked itself all over the place, intentionally or otherwise. An unenlightened programmer might type something like ps | grep myprogram, inevitably followed by several cut-and-paste operations with the mouse. An easier way is to type fuser -k ./myprogram, where ./myprogram is the pathname of the executable. fuser typically is located in /sbin, which generally is reserved for system administrative tools. If your shell cannot find it, add /usr/sbin and /sbin to the end of your $PATH.
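
The following is a minimal sketch of that workflow, not a definitive recipe. It assumes the psmisc version of fuser, which sends SIGKILL unless a signal such as -TERM is named. The scratch file and the background sleep are made up for the example, and the script falls back to a plain kill if fuser is missing or reports nothing.

```shell
#!/bin/sh
# Sketch: find and signal whatever holds a file open.
# "scratch" is a throwaway file made up for this example.
scratch=$(mktemp)

# Start a background process that keeps the file open on descriptor 3.
sleep 30 3<"$scratch" &
holder=$!
sleep 1   # give the child a moment to open the file

# List the PIDs using the file (fuser prints PIDs on stdout, names on stderr).
pids=$(fuser "$scratch" 2>/dev/null)
echo "holders:${pids:- none reported}"

# -k signals every holder; -TERM selects SIGTERM instead of the default.
if ! fuser -k -TERM "$scratch" >/dev/null 2>&1; then
    # fuser missing or nothing found: fall back to signalling the PID directly.
    kill -TERM "$holder"
fi

wait "$holder"
status=$?   # 128 + 15 (SIGTERM) = 143 when the signal ended the process
rm -f "$scratch"
echo "holder exit status: $status"
```

Listing the holders before using -k is a sensible habit, because -k signals every process that has the file open, not only the one you had in mind.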

4. ps

ps is used to find process status, but many people don't realize it also can be a powerful debugging tool. To get at these features, use the -o option, which lets you access many details of your processes, including CPU usage, virtual memory usage, current state and much more. Many of these options are defined in the POSIX standard, so they work across platforms.

To look at your running commands by pid and process state, type ps -e -o pid,state,cmd. The output looks like this:

4576 S /opt/OpenOffice.org1.1.0/program/soffice.bin -writer
4618 D dd if=/dev/cdrom of=/dev/null
4619 S bash
4645 R ps -e -o pid,state,cmd

Here you can see my dd command is in an uninterruptible sleep (state D). Basically, it is blocking while waiting for /dev/cdrom. My OpenOffice.org writer is sleeping (state S) while I type my example, and my ps command is running (state R).
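
To check the state of a single process from a script, something like the following sketch may help. It assumes the procps version of ps found on most Linux systems, where state is a valid -o keyword. The background sleep is only a stand-in for your own program.

```shell
#!/bin/sh
# Stand-in for the program being examined.
sleep 30 &
pid=$!
sleep 1   # let it settle into its sleep

# "state=" prints the one-letter state with no header line (procps keyword).
state=$(ps -o state= -p "$pid" | tr -d ' ')
echo "pid $pid state: $state"

# Translate the letters used in the article's example.
case $state in
    D) echo "uninterruptible sleep, usually waiting on I/O" ;;
    S) echo "interruptible sleep, waiting for an event" ;;
    R) echo "running or runnable" ;;
    *) echo "other state; see the ps man page" ;;
esac

kill "$pid"
```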

For an idea of how a running program is performing, type:

ps -o start,time,etime -p mypid

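This shows when the process started, how much CPU time it has used and how long it has been running. It is much like the basic output from the time command, discussed later, except you don't have to wait until your program is finished.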

Most of the information that ps produces is available from the /proc filesystem, but if you are writing a script, using ps is more portable. You never know when a minor kernel rev will break all of your scripts that are mining the /proc filesystem. Use ps instead.
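
As a rough illustration of the portable approach, the sketch below sticks to -o keywords that POSIX defines (pid, etime, time, vsz and comm). The function name proc_summary is made up for this example, and the script inspects its own shell only so that it has a process to look at.

```shell
#!/bin/sh
# proc_summary PID: print pid, elapsed time, CPU time, virtual size and
# command name using only POSIX-defined ps keywords. A trailing "=" on a
# keyword suppresses its column header.
proc_summary() {
    ps -o pid= -o etime= -o time= -o vsz= -o comm= -p "$1"
}

# Inspect this shell itself.
summary=$(proc_summary "$$")

# Label the columns; vsz is reported in 1024-byte units.
printf '%s\n' "$summary" | awk '{
    printf "pid=%s elapsed=%s cpu=%s vsz_kib=%s comm=%s\n", $1, $2, $3, $4, $5
}'
```

Asking for one keyword per -o option is a matter of taste; a single comma-separated list works too, as in the examples above.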

5. time

The time command is useful for understanding your code's performance. The most basic output consists of real, user and system time. Intuitively, real time is the amount of time between when the code started and when it exited. User time and system time are the amount of time spent executing application code versus kernel code, respectively.

Two flavors of the time command are available. The shell has a built-in version that tells you only scheduler information. A version in /usr/bin includes more information and allows you to format the output. You easily can override the built-in time command by preceding it with a backslash, as in the examples that follow.

A basic knowledge of the Linux scheduler is helpful in interpreting the output, but this tool also is helpful for learning how the scheduler works. For example, the real time of a process typically is larger than the sum of the user and system time. Time spent blocking in a system call does not count against the process, because the scheduler is free to schedule other processes during this time. The following sleep command takes one second to execute but takes no measurable system or user time:

\time -p sleep 1
real 1.03
user 0.00
sys 0.00
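
The gap between real time and CPU time can be worked out in a script. This sketch runs bash's built-in time explicitly, so it does not depend on /usr/bin/time being installed, and it parses the -p format shown above. The variable names are made up for the example.

```shell
#!/bin/sh
# Run the timing through bash so its built-in "time" is used; the report
# goes to stderr, so fold it into stdout to capture it.
report=$(LC_ALL=C bash -c 'time -p sleep 1' 2>&1)

# Pull the three figures out of the "real/user/sys" lines.
real=$(printf '%s\n' "$report" | awk '$1 == "real" {print $2}')
user=$(printf '%s\n' "$report" | awk '$1 == "user" {print $2}')
sys=$(printf '%s\n' "$report" | awk '$1 == "sys" {print $2}')

# Time not spent on the CPU: blocked in sleep or waiting to be scheduled.
blocked=$(awk -v r="$real" -v u="$user" -v s="$sys" \
    'BEGIN { printf "%.2f", r - (u + s) }')

echo "real=$real user=$user sys=$sys off-cpu=$blocked"
```

For sleep, nearly all of the real time should show up as off-CPU time, because the process spends it blocked in a system call.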

The next example shows how a task can spend all of its time in user space. Here, Perl calls the log() function in a loop, which requires nothing from the kernel:

\time -p perl -e 'log(2.0) foreach(0..0x100000)'
real 0.40
user 0.20
sys 0.00

This example shows a process using a lot of memory:

\time perl -e '$x = "a" x 0x1000000'

0.06user 0.12system 0:00.22elapsed 81%CPU
(0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (309major+8235minor)pagefaults
0swaps

The useful information here is listed as pagefaults. Although the GNU time command advertises a lot of information, the 2.4 series of the Linux kernel stores only major and minor page-fault information. A major page fault is one that requires I/O; a minor page fault does not.
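
If the page-fault counts are all you want, GNU time's -f option can print them on their own. In the sketch below, %F is the number of major page faults and %R the number of minor ones. It assumes GNU time lives in /usr/bin/time and that perl is installed, and it skips the measurement if either is missing.

```shell
#!/bin/sh
gnu_time=/usr/bin/time   # assumed location of GNU time
major=
minor=

if command -v perl >/dev/null 2>&1 &&
   "$gnu_time" -f '%F %R' true >/dev/null 2>&1; then
    # GNU time reports on stderr; keep only its last line in case the
    # command itself prints warnings.
    faults=$("$gnu_time" -f '%F %R' perl -e '$x = "a" x 0x1000000' 2>&1 | tail -n 1)
    major=${faults%% *}   # text before the first space
    minor=${faults##* }   # text after the last space
    echo "major faults: $major, minor faults: $minor"
else
    echo "GNU time or perl not available; skipping"
fi
```

The counts vary from run to run and from machine to machine, so treat them as a rough indication of memory behavior, not as exact figures.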

6. nm

This command allows you to retrieve information on symbol names inside an object file or executable file. By default, the output gives you a symbol name and its virtual address. What good is that? Suppose you are compiling code and the compiler complains that you have an unresolved symbol _foo. You search all of your source code and cannot find any place where this symbol is used. Perhaps it got pulled in from some template or a macro buried in one of the dozens of include files that compiled along with your code. The command:

nm -guA *.o | grep foo

______________________

Comments

CRASH

flostre's picture

In its current version, this page crashes Konqueror 3.5.5.

great for interviews

ak_boy's picture

absolutely fantastico quick summary for interviews ... especially when people ask you to name your favorite unix/linux commands ... believe me, "they" do ...

Great article

Anonymous's picture

This is the stuff I look for in LJ Magazine.
Kudos to Mr Fusco!

a few more

undefined's picture

many of your commands are more beneficial to system administrators than programmers, imho (as i wear both hats), so i add the following:

lsof
netstat
ethereal

lsof is the opposite of fuser: instead of what processes have open a file, it tells what files a process has open.

netstat tells what ports are bound to, and will even list the specific processes if you own them.

for debugging network applications, nothing beats ethereal. it's the strace of networking.

i really enjoyed the article and learned some new things (fuser, od, xxd). it's general articles like these (applicable to any serious linux user) that keeps me subscribed to the dead-tree lj.

what is equivalent of

Anonymous's picture

what is equivalent of "pstack" to view a core file in Linux.
