PracTcl Programming Tips
Analyzing the profile data is a bit tricky. The time attributed to a particular procedure consists of not only time spent in that procedure, but the time spent in all of its children (“called” procedures) plus the time taken by the profiling code that times the child procedures.
We can approximate the time spent in the profiling code by running a procedure with—then without—the profiling code and computing the difference. The procedure profile_calibrate does just that.
proc profile_calibrate {}
global Profile_fd
proc profile_dummy {args} {return $args}
set Profile_fd [open /tmp/[pid] w]
time profile_dummy 10
set before [lindex [time profile_dummy 10] 0]
profile_proc profile_dummy<\n>
set after [lindex [time profile_dummy 10] 0]
close $Profile_fd
rename profile_dummy {}
rename profile_dummy:OLD {}
return [expr ($after - $before)]
}
A dummy procedure profile_dummy is created and timed. Then profile_proc is called to add the profiling code, and profile_dummy is timed again. The result is an approximation of the timing overhead.
proc Incr {name {value 1}} {
upvar $name var
if {[info exists var]} {
set var [expr $var + $value]
} else {
set var $value
}
}
The profile_summarize procedure uses a souped-up version of the Tcl incr command (called Incr that permits incring a variable before it is set by automatically initializing it to zero before incrementing it.
proc profile_summarize {file} {
puts [format "%35s calls ms ms/call %%" \
name]
set fd [open $file r]
set total 0
set overhead [profile_calibrate]
# read in the data, and accumulate the values
while {[gets $fd line] > 0} {
set name [lindex $line 0]
set time [lindex $line 1]
set parent [lindex $line 2]
Incr count($name)
Incr sum($name) $time
if {$parent != "TopLevel"} {
Incr sum($parent) \
"- ($time + $overhead)"
} else {
Incr total $time
}
}
close $fd
# sort and print the results
foreach name [lsort [array names count]] {
if {$count($name) == 0} continue
set ms [expr $sum($name)/1000]
puts [format "%35s %4d %7d %6d %4.1f%%" \
$name $count($name) $ms \
[expr $ms / $count($name)] \
[expr $sum($name) * 100.0 / $total]]
}
}
After a bit of initialization, Profile_summarize works in two stages, reading and tabulating the timing information, then sorting and printing the results.
Each line of timing information is read into three variables: the procedure name, the elapsed time in µs, and the parent, or calling, procedure name. Two arrays, count and sum, both indexed by procedure name, are used to keep track of the number of times each procedure is called and the accumulated time for each procedure. Next, the time attributed to each procedure is subtracted from the time credited to its parent, along with the timing overhead. Only procedures that are called from the top level have their times included in the total time counter. Otherwise, the time would be counted twice, once for the procedure and again for its caller.
Once all of the data is tabulated, it is sorted by procedure name. The results are printed in the form of procedure name, number of times the procedure was called, the total elapsed time spent in the procedure, and the percentage of the total program execution time spent in the procedure. The time values are divided by 1000 to convert them from µs into ms. Note that after about 35 minutes, the number of µs overflows a 32 bit integer, so profiling runs shouldn't be much longer than a half hour.
Listing 1 at the bottom of this article shows a subset of the results of a profiling run for an HTML library package. For this test case, it's clear that HMrender accounts for a proportionally large share of the total running time of the program. Even if it could be recoded to run infinitely fast, the overall application would be sped up by, at most, 27%. The negative time attributed to HMtag_a is probably due to the variability of the profiling overhead calculation.
The data in the logging file looks like:
... HMrender 152083 HMparse_html HMmap_esc 553 HMrender HMzap_white 629 HMrender HMx_font 3056 HMcurrent_tags HMset_font 2022 HMcurrent_tags HMcurrent_tags 14424 HMrender ...
Even though there can be considerable variability in the elapsed times of procedures, depending upon the current load on the processor, these simple profiling routines can quickly point out those parts of an application that are consuming most of the running time and would be good candidates for optimization.
Stephen Uhler is a researcher at Sun Microsystems Laboratories, where he works with John Ousterhout improving Tcl and Tk. Stephen is the author of the MGR window system and of PhoneStation, a TCL-based personal telephony environment. He may be reached via e-mail at Stephen.Uhler@Eng.Sun.COM.
Listing 1: Sample Run
% source sample.tcl Starting sample HTML viewer... % source ~/column/profile/test % profile_start [Run the application for a bit] % profile_stop
HMextract_param
name | calls | ms | ms/call | % |
HMcurrent_tags | 465 | 1478 | 3 | 3.2% |
HMdo_map | 12 | 3 | 0 | 0.0% |
186 | 773 | 4 | 1.7% | |
HMgot_image | 12 | 1199 | 99 | 2.6% |
HMlink_setup | 25 | 59 | 2 | 0.1% |
HMmap_esc | 467 | 307 | 0 | 0.7% |
HMparse_html | 3 | 2970 | 990 | 6.4% |
HMrender | 453 | 12544 | 27 | 27.0% |
HMreset_win | 3 | 220 | 73 | 0.5% |
HMset_font | 465 | 6292 | 13 | 13.5% |
HMset_image | 12 | 1401 | 116 | 3.0% |
HMset_state | 3 | 5 | 1 | 0.0% |
HMstack | 306 | 295 | 0 | 0.6% |
HMtag_/a | 33 | 124 | 3 | 0.3% |
HMtag_a | 33 | -13 | -1 | -0.0% |
... |
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Designing Electronics with Linux | May 22, 2013 |
| Dynamic DNS—an Object Lesson in Problem Solving | May 21, 2013 |
| Using Salt Stack and Vagrant for Drupal Development | May 20, 2013 |
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi

It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Featured Jobs
| Linux Systems Administrator | Houston and Austin, Texas | Host Gator |
| Senior Perl Developer | Austin, Texas | Host Gator |
| Technical Support Rep | Houston and Austin, Texas | Host Gator |
| UX Designer | Austin, Texas | Host Gator |
| Web & UI Developer (JavaScript & j Query) | Austin, Texas | Host Gator |
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?




33 min 2 sec ago
6 hours 19 min ago
6 hours 36 min ago
8 hours 29 min ago
10 hours 22 min ago
17 hours 16 min ago
17 hours 33 min ago
19 hours 24 min ago
1 day 1 hour ago
1 day 5 hours ago