Monitoring Virtual Memory with vmstat

Just using a lot of swap space doesn't necessarily mean that you need more memory. Here's how to tell when Linux is happy with the available memory and when it needs more.

Linux novices often find virtual memory mysterious, but with a grasp of the fundamental concepts, it's easy to understand. With this knowledge, you can monitor your system's memory utilization using vmstat and detect problems that can adversely affect system performance.

How Virtual Memory Works

Physical memory—the actual RAM installed—is a finite resource on any system. The Linux memory handler manages the allocation of that limited resource by freeing portions of physical memory when possible.

All processes use memory, of course, but each process doesn't need all its allocated memory all the time. Taking advantage of this fact, the kernel frees up physical memory by writing some or all of a process' memory to disk until it's needed again.

The kernel uses paging and swapping to perform this memory management. Paging refers to writing portions, termed pages, of a process' memory to disk. Swapping, strictly speaking, refers to writing the entire process, not just part, to disk. In Linux, true swapping is exceedingly rare, but the terms paging and swapping often are used interchangeably.

When pages are written to disk, the event is called a page-out, and when pages are returned to physical memory, the event is called a page-in. A page fault occurs when the kernel needs a page, finds it doesn't exist in physical memory because it has been paged-out, and re-reads it in from disk.

Page-ins are common, normal and are not a cause for concern. For example, when an application first starts up, its executable image and data are paged-in. This is normal behavior.

Page-outs, however, can be a sign of trouble. When the kernel detects that memory is running low, it attempts to free up memory by paging out. Though this may happen briefly from time to time, if page-outs are plentiful and constant, the kernel can reach a point where it's actually spending more time managing paging activity than running the applications, and system performance suffers. This woeful state is referred to as thrashing.

Using swap space is not inherently bad. Rather, it's intense paging activity that's problematic. For instance, if your most-memory-intensive application is idle, it's fine for portions of it to be set aside when another large job is active. Memory pages belonging to an idle application are better set aside so the kernel can use physical memory for disk buffering.

Using vmstat

vmstat, as its name suggests, reports virtual memory statistics. It shows how much virtual memory there is, how much is free and paging activity. Most important, you can observe page-ins and page-outs as they happen. This is extremely useful.

To monitor the virtual memory activity on your system, it's best to use vmstat with a delay. A delay is the number of seconds between updates. If you don't supply a delay, vmstat reports the averages since the last boot and quit. Five seconds is the recommended delay interval.

To run vmstat with a five-second delay, type:

vmstat 5

You also can specify a count, which indicates how many updates you want to see before vmstat quits. If you don't specify a count, the count defaults to infinity, but you can stop output with Ctrl-C.

To run vmstat with ten updates, five seconds apart, type:

vmstat 5 10

Here's an example of a system free of paging activity:

  procs                     memory    swap        io     system cpu
r  b  w   swpd   free  buff  cache  si so   bi  bo   in    cs us sy  id
0  0  0  29232 116972  4524 244900   0  0    0   0    0     0 0   0   0
0  0  0  29232 116972  4524 244900   0  0    0   0 2560     6 0   1  99
0  0  0  29232 116972  4524 244900   0  0    0   0 2574    10 0   2  98

All fields are explained in the vmstat man page, but the most important columns for this article are free, si and so. The free column shows the amount of free memory, si shows page-ins and so shows page-outs. In this example, the so column is zero consistently, indicating there are no page-outs.

The abbreviations so and si are used instead of the more accurate po and pi for historical reasons.

Here's an example of a system with paging activity:

  procs                      memory    swap          io     system cpu
r  b  w   swpd   free  buff cache  si  so   bi   bo   in    cs us  sy  id
. . .
1  0  0  13344   1444  1308 19692   0 168  129   42 1505   713 20  11  69
1  0  0  13856   1640  1308 18524  64 516  379  129 4341   646 24  34  42
3  0  0  13856   1084  1308 18316  56  64   14    0  320  1022 84   9   8

Notice the nonzero so values indicating there is not enough physical memory and the kernel is paging out. You can use top and ps to identify the processes that are using the most memory.

You also can use top to show memory and swap statistics. Here is an example of the uppermost portion of a typical top report:

14:23:19 up 348 days,  3:02,  1 user,  load average: 0.00, 0.00, 0.00
55 processes: 54 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:   0.0% user,   2.4% system,   0.0% nice,  97.6% idle
Mem:    481076K total,   367508K used,   113568K free,     4712K buffers
Swap:  1004052K total,    29852K used,   974200K free,   244396K cached

For more information about top, see the top man page.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

"A page fault occurs when

Anonymous's picture

"A page fault occurs when the kernel needs a page, finds it doesn't exist in physical memory because it has been paged-out, and re-reads it in from disk"

I'm wondering if that's supposed to read

"A page in occurs when the kernel needs a page, finds it doesn't exist in physical memory because it has been paged-out, and re-reads it in from disk"

Since there is no other mention of page fault in the article and this one instance of page fault is surrounded by talk about page in.

Page Fault/In

Mitch Frazier's picture

The reaction to a "page fault" is a "page in".

The jist of what's written is correct, but there are a couple of minor errors. Page faults occur when any application (and not just the kernel) needs a page that is not currently in memory. A page fault is also not a "kernel" thing, it's a hardware thing. The reaction to the fault, often a "page in" is handled by the kernel. Also, the setup required to support virtual memory is handled by the kernel, but the fault itself is all hardware.

There is also a slight inaccuracy in stating that the fault occurs because the page was previously paged out. The page that causes the fault does not have to have been previously paged out. If the page referred to program code it doesn't get paged out, it just gets paged in when the fault occurs since the code exists in the executable file and doesn't need to be paged out. A page fault can also occur when trying to access memory beyond the end of the stack, thus signalling the O/S that more stack space needs to be allocated, but this again is memory that was never paged out. Here the page is just used as a guard or sentry to tell the O/S when the application needs more stack space.

So I would reword that sentence thusly:

A page fault occurs when the CPU needs a page which does not exist in physical memory. In response to the fault the kernel allocates a free page for the missing memory and if the page was previously paged out, the kernel pages in the previously paged out data. If the kernel is unable to find an unused page to allocate for the missing memory then an existing page will be freed and if need be paged out.

Mitch Frazier is an Associate Editor for Linux Journal.

Hi, If you need to draw

Anonymous's picture

Hi,

If you need to draw charts from your vmstat log files,
this may help you :
http://www.michenux.net/blog/?p=1

;)
Bye

This would be great if you

Ian's picture

This would be great if you explained each column of these displays.

Thanks for the great article

LC's picture

Thanks for the great article Brian. This is really helpful. Virtual memory was a mystery to me, but now I understand it a lot better.

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState