Linux and the Alpha
The Alpha chips, like most other modern CPUs, provide a variety of performance counters. These allow measuring various event counts or rates, such as the number of cache misses, instruction issue rate, branch mispredicts or instruction frequency. Unfortunately, I am not aware of any Linux API that would provide access to these counters. This is particularly unfortunate since both the Pentium and the Pentium Pro chips provide similar counters. Digital UNIX gives access to these counters via the uprofile and kprofile programs, and an ioctl-based interface documented in the pfm(7) man page. Hopefully, something similar (but more general) will eventually become available for Linux. With the proper tools, these counters can provide a wealth of information.
Most readers are probably familiar with the original gprof (see Reference 3). It's a handy tool to determine the primary performance bottlenecks at the function level. However, with the help of gcc, GNU gprof can also look inside a function. We illustrate this with a truly trivial function that computes the factorial. Assume we've typed up the factorial function and a simple test program in file fact.c. We can then compile that program like this (assuming GNU libc version 2.0 or later is installed):
gcc -g -O -a fact.c -lc
Invoking the resulting a.out binary once produces a gmon.out file that contains the execution counts for each basic block in the program. We can look at these counts by invoking gprof, specifying the -l and --annotate options. This command generates a source-code listing that shows how many times a basic block in each line of source code has been executed.
Our factorial example results in the listing shown in Listing 1. The basic block starting at the printf line in function main() was executed once, so it has been annotated with a 1. For the factorial function, the function prologue and epilogue were executed 20 times each, so the first and last line of function fact are annotated with 20. Of these 20 calls, 19 resulted in a recursive call to fact, and the remaining call simply returned 1. Correspondingly, the then branch of the if statement has been annotated with 19, whereas the else statement has an annotation of 1. It's that simple.
There certainly are no surprises in the behavior of function fact(), but in realistic, more complicated functions or in code that was written by somebody else, this knowledge can be very helpful to avoid wasting time optimizing rarely executed code.
Next month, we'll look into the techniques that can greatly improve the performance of a given piece of code. Most of them are not novel. Some of them have been around for so long that it would be difficult, if not impossible, to give proper credit. Others are “obvious” (once you know them). The key point is that it is the characteristics of modern, particularly Alpha-based, systems which make these techniques so important and worthwhile.
The author would like to thank Richard Henderson of Texas A&M University and Erik Troan of Red Hat Software for reviewing this paper on short notice. Their feedback greatly improved its quality. Errors and omissions are the sole responsibility of the author.
David is a graduate student in the Ph.D. program of the Computer Science department at the University of Arizona. He plans on graduating in August 1997 and to finally get a “Real Job”. After a short intermezzo with Linux involving Reed-Solomon codes and the floppy-tape driver, he forgot about Linux until the need arose for a low-cost, high-performance system based on Digital's Alpha processor. As a result, he got involved in the Linux/Alpha port and has been sticking around in the free software community ever since. When not playing with computers, he enjoys the outdoors with his lovely wife. He can be reached via e-mail at David.Mosberger@acm.org.
Getting Started with DevOps - Including New Data on IT Performance from Puppet Labs 2015 State of DevOps Report
August 27, 2015
12:00 PM CDT
DevOps represents a profound change from the way most IT departments have traditionally worked: from siloed teams and high-anxiety releases to everyone collaborating on uneventful and more frequent releases of higher-quality code. It doesn't matter how large or small an organization is, or even whether it's historically slow moving or risk averse — there are ways to adopt DevOps sanely, and get measurable results in just weeks.
Free to Linux Journal readers.Register Now!
- Django Models and Migrations
- Hacking a Safe with Bash
- Secure Server Deployments in Hostile Territory, Part II
- Huge Package Overhaul for Debian and Ubuntu
- The Controversy Behind Canonical's Intellectual Property Policy
- Home Automation with Raspberry Pi
- Shashlik - a Tasty New Android Simulator
- Embed Linux in Monitoring and Control Systems
- KDE Reveals Plasma Mobile
- diff -u: What's New in Kernel Development