Coverage Measurement and Profiling
Here is the output from running this simple Fibonacci program:
[zfrey@warthog prof]$ time ./fib fibonnaci(0) = 0 fibonnaci(1) = 1 fibonnaci(2) = 1 fibonnaci(3) = 2 fibonnaci(4) = 3 ... fibonnaci(40) = 102334155 fibonnaci(41) = 165580141 fibonnaci(42) = 267914296 real 3m12.580s user 3m10.566s sys 0m0.127s [zfrey@warthog prof]$
Now, we can look at the profiling data:
[zfrey@warthog prof]$ gprof -b ./fib gmon.out
Flat profile:
Each sample counts as 0.00195312 seconds.
% cumulative self self total
time seconds seconds calls ms/call ms/call name
95.13 16.34 16.34 43 380.02 380.02 fibonacci
4.87 17.18 0.84 main
Call graph
index % time self children called name
[1] 100.0 0.84 16.34 main [1]
16.34 0.00 43/43 fibonacci [2]
-----------------------------------------------
2269806252 fibonacci [2]
16.34 0.00 43/43 main [1]
[2] 95.1 16.34 0.00 43+2269806252 fibonacci [2]
2269806252 fibonacci [2]
-----------------------------------------------
Index by function name
[2] fibonacci [1] main
It's obvious where this program's bottleneck is—the fibonnaci() function itself. That fact should be evident from reading the source. What our profiling tells us that is not obvious from the source is the number of recursion calls that our calculation takes. Although fibonacci() is called directly from main() only 43 times, we are making approximately 2.3 billion calls as part of the recursion.
Fortunately, there's another way. Checking our reference on Fibonacci numbers, we find that there is a formula to calculate the nth Fibonacci number directly from n:
F(n) = round(Phin / sqrt(5))
I am not going to explain the mathematics behind the derivation of this formula, for the simple reason that I don't understand them myself. With this improved algorithm, however, once again I can rebuild, run and profile (Listing 4).
Listing 4. New Version of the Fibonacci Function
#define PHI 1.6180339887498948
int fibonacci(int n)
{
return (int) rint(pow(PHI, n) / sqrt(5));
}
Let's see what gprof tells us this time:
Call graph
granularity: each sample hit covers 4 byte(s) no time propagated
index % time self children called name
0.00 0.00 43/43 main [8]
[1] 0.0 0.00 0.00 43 fibonacci [1]
-----------------------------------------------
Index by function name
[1] fibonacci
Our program is now so much faster that the sampling process failed to register any time spent. Although this is a good result in a way, it's not so good if we want to see if any more performance improvements can be made.
A standard technique from the gprof manual is to run the program many times and summarize the results, like so:
[zfrey@warthog prof]$ mkdir runfib2; cd runfib2 [zfrey@warthog runfib2]$ for i in `seq 1 1000` ; \ do ../fib2 > /dev/null; mv gmon.out gmon.out.$i; done [zfrey@warthog runfib2]$ gprof -s ../fib2 gmon.out.*
The results, though, still show the time spent as 0.00. Clearly, a different approach is called for. I now create fib3.c, with two changes. First, I enclose the 0–42 loop with an additional 0–1000 loop, to increase our runtime. Second, I add a local variable to the fibonacci() function and break the calculation apart to one operation per line for line-by-line profiling. This gives us enough runtime that gprof has something to report:
Listing 5. fib3.c
#include <stdio.h>
#include <math.h>
int fibonacci(int n);
int main (int argc, char **argv)
{
int fib;
int n;
int i;
for (i = 0; i < 1000; i++) {
for (n = 0; n <= 42; n++) {
fib = fibonacci(n);
printf("fibonnaci(%d) = %d\n", n, fib);
}
}
return 0;
}
#define PHI 1.6180339887498948
int fibonacci(int n)
{
double temp;
temp = pow(PHI,n);
temp = temp / sqrt(5);
return (int) rint(temp);
}
Before doing a line-by-line report, though, I create a multirun summary, as before.
Now, let's look at the line-by-line report (trimmed to fit):
[zfrey@warthog runfib3]$ gprof -b -l -p ../fib3 gmon.sum Flat profile: Each sample counts as 0.00195312 seconds. % cumulative self self total time seconds seconds calls ps/call ps/call name 38.25 1.12 1.12 fibonacci (fib3.c:31) 27.97 1.94 0.82 fibonacci (fib3.c:30) 14.49 2.36 0.42 fibonacci (fib3.c:29) 5.91 2.53 0.17 main (fib3.c:16) 4.64 2.67 0.14 main (fib3.c:15) 3.14 2.76 0.09 fibonacci (fib3.c:32) 1.87 2.82 0.05 main (fib3.c:14) 1.54 2.86 0.04 main (fib3.c:14) 0.83 2.89 0.02 43000000 567.77 567.77 fibonacci (fib3.c:26) 0.77 2.91 0.02 main (fib3.c:21) 0.53 2.92 0.02 main (fib3.c:13) 0.07 2.93 0.00 main (fib3.c:20)
The highest percentage of time is spent on line 31, followed by line 30. There is nothing we can do about line 31, because the printf() call and the function return are essential operations. However, on line 30, notice that we are calculating the square root of 5 every time. Because this result never changes, the next optimization should be to replace it with a constant. This cuts the execution time from 136 milliseconds to 22 milliseconds.
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Sponsored by AMD
If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.
Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.
Sponsored by ActiveState
| Non-Linux FOSS: libnotify, OS X Style | Jun 18, 2013 |
| Containers—Not Virtual Machines—Are the Future Cloud | Jun 17, 2013 |
| Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer | Jun 12, 2013 |
| Weechat, Irssi's Little Brother | Jun 11, 2013 |
| One Tail Just Isn't Enough | Jun 07, 2013 |
| Introduction to MapReduce with Hadoop on Linux | Jun 05, 2013 |
- Containers—Not Virtual Machines—Are the Future Cloud
- Non-Linux FOSS: libnotify, OS X Style
- Linux Systems Administrator
- Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer
- Validate an E-Mail Address with PHP, the Right Way
- RSS Feeds
- Introduction to MapReduce with Hadoop on Linux
- Weechat, Irssi's Little Brother
- Senior Perl Developer
- Technical Support Rep
- Reply to comment | Linux Journal
2 hours 59 min ago - Reply to comment | Linux Journal
3 hours 44 min ago - Didn't read
3 hours 54 min ago - Reply to comment | Linux Journal
3 hours 59 min ago - Poul-Henning Kamp: welcome to
6 hours 9 min ago - This has already been done
6 hours 10 min ago - Reply to comment | Linux Journal
6 hours 56 min ago - Welcome to 1998
7 hours 44 min ago - notifier shortcomings
8 hours 8 min ago - heroku?
9 hours 45 min ago
Featured Jobs
| Linux Systems Administrator | Houston and Austin, Texas | Host Gator |
| Senior Perl Developer | Austin, Texas | Host Gator |
| Technical Support Rep | Houston and Austin, Texas | Host Gator |
| UX Designer | Austin, Texas | Host Gator |
| Web & UI Developer (JavaScript & j Query) | Austin, Texas | Host Gator |
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?



Comments
Is there any coverage tool
Is there any coverage tool available that only works on source files and not on executables. I mean I dont want to compile the source files even once still I want to find out the coverage in the code.
Is it possible?
Please help. I'll really appreciate your effort if you get me any information regarding the matter.
Thanks!
help needed in installing ggcov
We have tried installing ggcov
we had followed following steps.
./configure (it satisfies our pc configuration)
make
make install
It is not getting installed properly...
How can we check it is properly installed or not....?
Is there any other steps to covered...
please kindly help
Regards
savita
hai, i understood your
hai,
i understood your problem.....
me also struck with that problem but what i did was,
i just installed ggcov-06 instead of ggcov-08.....
in ggcov ,it is asking someother file which we could not get fro internet...
so better to go to install ggcov-06
thanks and regards,
Rajasekhar...
ggcov
when i try to run ggcov in SUSE LINUX 10.1 using the command line ggcov ,i do not get any error message.but the results are not displayed.
the installation had no problems.
If anybody who is familiar with ggcov can give me a detailed idea about how to use this tool,please reply.
ggcov
anybody kindly post a reply regarding how to implement ggcov
I could run ggcov successfully on the command prompt but I can't view any GUI.So please tell me the syntax/procedures of using ggcov.My linux version is SUSE 9.0
error on running kprof
when i try to make the kprof i am getting this errors..
Kindly help me to correct and run the application
parseprofile_pose.h:30: error: expected `)' before â&â token
parseprofile_pose.cpp:29: error: prototype for âCParseProfile_pose::CParseProfile_pose(QTextStream&, QPtrVector&)â does not match any in class âCParseProfile_poseâ
parseprofile_pose.h:28: error: candidates are: CParseProfile_pose::CParseProfile_pose(const CParseProfile_pose&)
parseprofile_pose.h:36: error: CParseProfile_pose::CParseProfile_pose()
need help for GCOV
I have a problem:
I'm not able to gcc -fprofile-arcs -ftest-coverage file.c
Could you let me know how I can set environemt CFLAGS variable?
I have problem when I gcc on glib files i.e garray.c
please hel--
valgrind coverage
I thought I'd add a comment to say that some people are using Valgrind to do coverage testing using the 'cachegrind' tool.
Great article, a nice clear introduction to the topic.
Thanks
Thanks very much for this informative example. I was surprised that 50% is about average for code coverage!
Re: Coverage Measurement and Profiling
Very nice article, I enjoyed it and learned new (for me) methods
thanks a lot
b.
Don't forget lcov!
lcov is a nice addition to gcof; like ggcov, it shows summaries
for multiple files. It goes beyond ggcov, I think, in that it
handles entire directory hierarchies. It can handle the Linux
kernel and Wine, for instance. It outputs HTML rather than
presenting a GUI, but that's kind of a plus in my book.
It's at ltp.sf.net/coverage/lcov.php
Re: Don't forget lcov!
Lcov does look very nice. My intent wasn't to do a comprehensive survey of tools, but to introduce the basics of gcov and gprof and give an example of a representative "add-on" for each.