PracTcl Programming Tips

This month, find out how to test the speed of your Tcl programs and zero in on the slow parts with the time command.
Analyzing the results

Analyzing the profile data is a bit tricky. The time attributed to a particular procedure consists not only of the time spent in that procedure, but also of the time spent in all of its children (“called” procedures), plus the time taken by the profiling code that times the child procedures.

We can approximate the time spent in the profiling code by timing a procedure without, then with, the profiling code and computing the difference. The procedure profile_calibrate does just that.

proc profile_calibrate {} {
    global Profile_fd
    proc profile_dummy {args} {return $args}
    set Profile_fd [open /tmp/[pid] w]
    # warm up the dummy procedure, then time it without profiling
    time profile_dummy 10
    set before [lindex [time profile_dummy 10] 0]
    # add the profiling code and time it again
    profile_proc profile_dummy
    set after [lindex [time profile_dummy 10] 0]
    close $Profile_fd
    rename profile_dummy {}
    rename profile_dummy:OLD {}
    return [expr ($after - $before)]
}

A dummy procedure profile_dummy is created and timed. Then profile_proc is called to add the profiling code, and profile_dummy is timed again. The result is an approximation of the timing overhead.
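A quick check on a lightly loaded machine might look something like this (the figure is illustrative only; the per-call overhead varies with processor speed and load):

% profile_calibrate
87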

proc Incr {name {value 1}} {
    upvar $name var
    if {[info exists var]}  {
        set var [expr $var + $value]
    } else {
        set var $value
    }
}

The profile_summarize procedure uses a souped-up version of the Tcl incr command (called Incr) that permits incrementing a variable before it is set, by automatically initializing it to zero first.
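
For example, Incr works even when the variable does not yet exist:

% Incr hits        ;# hits is unset, so it is initialized
1
% Incr hits 4
5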

proc profile_summarize {file} {
    puts [format "%35s calls   ms   ms/call %%" \
          name]
    set fd [open $file r]
    set total 0
    set overhead [profile_calibrate]
    # read in the data, and accumulate the values
    while {[gets $fd line] > 0} {
        set name [lindex $line 0]
        set time [lindex $line 1]
        set parent [lindex $line 2]
        Incr count($name)
        Incr sum($name) $time
        if {$parent != "TopLevel"} {
            Incr sum($parent) \
                 "- ($time + $overhead)"
        } else {
            Incr total $time
        }
    }
    close $fd
    # sort and print the results
    foreach name [lsort [array names count]] {
        if {$count($name) == 0} continue
        set ms [expr $sum($name)/1000]
        puts [format "%35s %4d %7d %6d %4.1f%%" \
            $name $count($name)  $ms \
            [expr $ms / $count($name)] \
            [expr $sum($name) * 100.0 / $total]]
    }
}

After a bit of initialization, profile_summarize works in two stages: first it reads and tabulates the timing information, then it sorts and prints the results.

Each line of timing information is read into three variables: the procedure name, the elapsed time in µs, and the parent, or calling, procedure name. Two arrays, count and sum, both indexed by procedure name, are used to keep track of the number of times each procedure is called and the accumulated time for each procedure. Next, the time attributed to each procedure is subtracted from the time credited to its parent, along with the timing overhead. Only procedures that are called from the top level have their times included in the total time counter. Otherwise, the time would be counted twice, once for the procedure and again for its caller.
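
The subtraction itself is a small trick: since Incr evaluates its argument with expr, passing a quoted expression that begins with a minus sign turns the increment into a decrement. Given the log line HMmap_esc 553 HMrender and an assumed overhead of 40 µs, the updates work out like this (a sketch; the numbers are illustrative):

Incr count(HMmap_esc)              ;# one more call to HMmap_esc
Incr sum(HMmap_esc) 553            ;# credit HMmap_esc with 553 µs
Incr sum(HMrender) "- (553 + 40)"  ;# debit the parent; expr sees
                                   ;#   $sum(HMrender) + - (553 + 40)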

Once all of the data is tabulated, it is sorted by procedure name. The results are printed in the form of procedure name, number of times the procedure was called, the total elapsed time spent in the procedure, the average time per call, and the percentage of the total program execution time spent in the procedure. The time values are divided by 1000 to convert them from µs into ms. Note that after about 35 minutes, the number of µs overflows a 32-bit integer (2^31 µs is roughly 2,147 seconds, or a little under 36 minutes), so profiling runs shouldn't be much longer than a half hour.
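
On a platform where expr does its integer arithmetic in a 32-bit C long, the wraparound looks like this:

% expr 2147483647 + 1
-2147483648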

Listing 1 at the bottom of this article shows a subset of the results of a profiling run for an HTML library package. For this test case, it's clear that HMrender accounts for a proportionally large share of the total running time of the program. Even if HMrender could be recoded to run infinitely fast, only the 27% of the running time spent there would be eliminated, so the overall application would be sped up by, at most, 27%. The negative time attributed to HMtag_a is probably due to the variability of the profiling overhead calculation.

The data in the logging file looks like:

...
HMrender 152083 HMparse_html
HMmap_esc 553 HMrender
HMzap_white 629 HMrender
HMx_font 3056 HMcurrent_tags
HMset_font 2022 HMcurrent_tags
HMcurrent_tags 14424 HMrender
...
Conclusions

Even though there can be considerable variability in the elapsed times of procedures, depending upon the current load on the processor, these simple profiling routines can quickly point out those parts of an application that are consuming most of the running time and would be good candidates for optimization.

Stephen Uhler is a researcher at Sun Microsystems Laboratories, where he works with John Ousterhout improving Tcl and Tk. Stephen is the author of the MGR window system and of PhoneStation, a Tcl-based personal telephony environment. He may be reached via e-mail at Stephen.Uhler@Eng.Sun.COM.

Listing 1: Sample Run

% source sample.tcl
Starting sample HTML viewer...
% source ~/column/profile/test
% profile_start
  [Run the application for a bit]
% profile_stop

                               name calls   ms   ms/call %
                     HMcurrent_tags  465    1478      3  3.2%
                           HMdo_map   12       3      0  0.0%
                    HMextract_param  186     773      4  1.7%
                        HMgot_image   12    1199     99  2.6%
                       HMlink_setup   25      59      2  0.1%
                          HMmap_esc  467     307      0  0.7%
                       HMparse_html    3    2970    990  6.4%
                           HMrender  453   12544     27 27.0%
                        HMreset_win    3     220     73  0.5%
                         HMset_font  465    6292     13 13.5%
                        HMset_image   12    1401    116  3.0%
                        HMset_state    3       5      1  0.0%
                            HMstack  306     295      0  0.6%
                           HMtag_/a   33     124      3  0.3%
                            HMtag_a   33     -13     -1 -0.0%
...