PHP Performance Profiling
Due to the incredible growth of
PHP in the last couple of
years, it's now being used for tasks ranging from tiny scripts to
large-scale Web applications. Some Web applications contain
hundreds of thousands of lines of PHP code, and the fact that PHP
can scale to these levels is a great testament to its design and
the efficient Zend Engine
that actually manages PHP code execution.Of course, bigger and more complex projects result in more
load on your servers, and when you throw a database into the mix
you have even more potential performance bottlenecks to track. A
typical scenario might be you've added a few new features to a Web
application and now are seeing more server load and memory usage;
thus, pages seem to load slower. What can you do? Maybe you can
afford to throw bigger hardware at the problem, but even if that's
a viable option, you also should find the parts of your code that
are causing the slowdowns and optimize them.A number of factors can affect the performance of a Web
application, including Web server configuration, database
performance, data structure, the application design and the
implementation of the application. I'm going to assume you've
already taken care of the first items, and now you want to iron out
bottlenecks in your application implementation, that is, in the
actual code. But how do you find the bottlenecks in the first
place?The answer is a technique known as performance profiling.
Performance profiling runs your code in a controlled environment
and returns a report listing such statistics as time spent within
each function, how long each database query takes and how much
memory has been used.By doing performance profiling on your code, you quickly can
see where you may be wasting time with slow database queries or
inefficient code. Having this information then allows you to spend
your time tuning PHP and SQL where it needs it most. No more
guessing what's going on internally: performance profiling gives
you hard figures.Profiling ToolsA number of different tools have been developed to help with
PHP performance profiling, including
Benchmark (a
PEAR project), DBG,
Xdebug and
Advanced PHP Debugger
(another PEAR project).If you're really serious about squeezing every last cycle out
of the code, you should investigate all the benchmarking tools you
can find, because they work in different ways and allow you to
extract different kinds of information. For now, however, I'm going
to concentrate on APD, the Advanced PHP Debugger.APD is a debugger written in C by George Schlossnagle and
Daniel Cowgill that loads as an extension to the Zend Engine. It
works by hooking into the Zend internals and intercepting PHP
function calls, allowing it to do things like measure function
execution time, count function calls, perform stack backtraces and
other funky things.Installing APDCurrently, three main ways exist to install APD on a Linux
system: grab the source and compile it yourself, use PEAR or use
the Debian package. The latest source always is available from
the APD Web site.
Building and installing it isn't a hard process, but you need to
make sure the various PHP development resources are installed on
your system. For example, you need the PHP C headers, as well as a
program called phpize that is used to prepare the package as a Zend
extension. If you decide to go that route, make sure you follow the
instructions in the README included with the source.If you use PEAR, included with PHP4.3+, you can install PHP
modules with minimal fuss. Once again, the full instructions are
available on the APD Web site, which is part of the PEAR project.
Assuming PEAR support is included in your version of PHP, getting
it going should be as simple as typing pear install
apd and answering a few questions.Finally, for Debian users I maintain a .deb package of the
latest version of APD. It's much too recent to be in Woody, but at
the time of this writing it's in Sid (current Unstable) and should
enter Sarge (current Testing) soon. You should be able to use
apt-get install php4-apd to have everything done
for you.Whatever installation method you use, you should have the CGI
version of PHP installed, because some of the command-line tools
included with APD are written in PHP and need the parser to run.
Personally, I run my Web servers with the Apache module version of
PHP because it's much faster, but that doesn't matter. Simply
install the CGI version of PHP as well and away you go. It doesn't
need to affect your Apache module installation of PHP; they can
live side by side quite happily.You can use the phpinfo() function to confirm that APD
installed and loaded properly. Create a file in your Web root that
calls phpinfo(), and open it in a browser. The quickest way to do
that probably is to type
echo '<?php phpinfo() ?>' > info.php
in your Web root. When you access the file in your browser
through your Web server (not by directly opening the file), it
lists all the extensions that PHP has loaded. You should see APD
listed somewhere on the page. If that worked fine, you're ready for
the next step.Your First TestAt this point you should have APD installed on your server,
along with all the other usual stuff--Apache, MySQL or Postgres,
your PHP scripts themselves, and whatever else you need to run your
Web application.Now, pick a PHP script in your Web application for a test
run, and open it in an editor. In my example here, I use the file
src/webmail.php from
Squirrelmail, a
popular Web mail system written in PHP. The basic procedure is to
invoke special APD functions inside your PHP script that tell APD
to do its stuff. Right at the top of your PHP script, put in a call
to apd_set_pprof_trace(), like this:
<?php
apd_set_pprof_trace();
...etc
What this does is tell APD to start a trace on execution of
your script at that point and dump it out to disk in a predefined
location, which is set in the php.ini file using the apd.dumpdir
directive. If you used my Debian package, the location should be
set to /var/log/php4-apd/; if you used PEAR or compiled from
source, please check the included documentation.Gathering Some DataNow that your script is set up and ready to profile, load it
in a Web browser. It should run exactly as before, and you
shouldn't see any difference at all from the client side; the page
loads as it always has.What is different this time is a pprof tracefile has been
written out to the dumpdir previously defined. A pprof tracefile is
a text file that contains a machine-parsable summary of how your
PHP was processed, and it is named something like pprof.25802. The
number is the process ID of the Web server process that handled the
request. You can take a quick look through the tracefile, but at
this point it won't mean much to you. For my example using
Squirrelmail, it looks something like this:
#Pprof [APD] v0.9
hz=100
caller=/jade/webserver/site13/web/squirrelmail-1.2.5/src/webmail.php
END_HEADER
! 1 /jade/webserver/site13/web/squirrelmail-1.2.5/src/webmail.php
& 1 main 2
+ 1 1 2
- 2 98816
! 2 /jade/webserver/site13/web/squirrelmail-1.2.5/functions/strings.php
& 3 require_once 2
+ 3 2 16
& 4 php_self 2
+ 4 2 597
@ 1 0 1
- 4 244352
- 3 244368
! 3 /jade/webserver/site13/web/squirrelmail-1.2.5/config/config.php
+ 3 3 17
- 3 291224
... etc
and so on for several more pages. Essentially it's a
step-by-step record of what the Zend engine did as it processed
your script, but the format is not designed for direct human
consumption. Instead, it's an intermediate log that can be
processed to generate nice reports.Interpreting The pprof TracefileAPD comes with a little shell script written in PHP, called
pprofp, that can be run from the command line to parse the pprof
tracefile and give you a human-readable report. To use it, run
pprofp and pass it an option and the path of the tracefile, like
this:
pprofp -u /var/log/php-apd/pprof.25802
This command prints out a function call summary that should
look something like this:
Trace for /jade/webserver/site13/web/squirrelmail-1.2.5/src/webmail.php
Total Elapsed Time = 0.15
Total System Time = 0.02
Total User Time = 0.13
<pre>
Real User System secs/ cumm
%Time (excl/cumm) (excl/cumm) (excl/cumm) Calls call s/call Memory Usage Name
---------------------------------------------------------------------------------------------------------------
30.8 0.05 0.12 0.04 0.11 0.01 0.01 22 0.0018 0.0050 859752 require_once
15.4 0.02 0.02 0.02 0.02 0.00 0.00 8 0.0025 0.0025 403200 register_attachment_common
15.4 0.02 0.02 0.02 0.02 0.00 0.00 23 0.0009 0.0009 298360 define
7.7 0.01 0.01 0.01 0.01 0.00 0.00 1 0.0100 0.0100 145536 php_self
7.7 0.01 0.01 0.01 0.01 0.00 0.00 7 0.0014 0.0014 342368 function_exists
7.7 0.02 0.02 0.01 0.01 0.01 0.01 1 0.0100 0.0100 457224 session_start
7.7 0.01 0.01 0.01 0.01 0.00 0.00 4 0.0025 0.0025 968 cacheprefvalues
7.7 0.01 0.01 0.01 0.01 0.00 0.00 1 0.0100 0.0100 163952 is_array
0.0 0.00 0.01 0.00 0.01 0.00 0.00 4 0.0000 0.0025 64 getpref
0.0 0.00 0.00 0.00 0.00 0.00 0.00 1 0.0000 0.0000 16 is_logged_in
0.0 0.00 0.00 0.00 0.00 0.00 0.00 1 0.0000 0.0000 -248 ereg
0.0 0.00 0.00 0.00 0.00 0.00 0.00 1 0.0000 0.0000 16 set_up_language
0.0 0.00 0.00 0.00 0.00 0.00 0.00 1 0.0000 0.0000 72 ini_get
0.0 0.00 0.00 0.00 0.00 0.00 0.00 1 0.0000 0.0000 232 setlocale
0.0 0.00 0.00 0.00 0.00 0.00 0.00 1 0.0000 0.0000 256 header
0.0 0.00 0.00 0.00 0.00 0.00 0.00 3 0.0000 0.0000 296 putenv
0.0 0.00 0.00 0.00 0.00 0.00 0.00 1 0.0000 0.0000 96 getenv
0.0 0.00 0.00 0.00 0.00 0.00 0.00 1 0.0000 0.0000 56 textdomain
0.0 0.00 0.00 0.00 0.00 0.00 0.00 1 0.0000 0.0000 424 bindtextdomain
0.0 0.00 0.00 0.00 0.00 0.00 0.00 1 0.0000 0.0000 56 do_hook
The report above shows time and memory usage on a
per-function basis, sorted by user-time as directed by the -u
option. The first few columns are execution time in seconds. The
Calls column is a count of the number of times that function was
executed by the script. secs/call is the average execution time of
each call to that function, while cumm s/call is the cumulative
time spent on that function. Then it lists memory usage and finally
the name of the function itself. Notice that function call reports
are truncated to 15 functions by default.If your script has some major performance problems, here is
where they should start to become glaringly obvious. Do you have a
slow function that's called many times? Better take a close look at
it. Are you doing lots of SQL queries? You can see them here right
away.pprofp OptionsThe basic report shown above is useful for getting an
overview of the script's execution, but you also can pass other
options to pprofp that tell it to format the report in different
ways. Simply call it using pprofp <option>
<tracefile>, using one or more of these
options:Sort Options-a: sort by alphabetic names of subroutines.-l: sort by number of calls to subroutines-m: sort by memory used in a function call.-r: sort by real time spent in subroutines.-R: sort by real time spent in subroutines (inclusive of
child calls).-s: sort by system time spent in subroutines.-S: sort by system time spent in subroutines (inclusive of
child calls).-u: sort by user time spent in subroutines.-U: sort by user time spent in subroutines (inclusive of
child calls).-v: sort by average amount of time spent in
subroutines.-z: sort by user+system time spent in subroutines.
(default)Display Options-c: display real time elapsed alongside call tree.-i: suppress reporting for PHP built-in functions-O <cnt>: specify maximum number of subroutines to
display. (default 15)-t: display compressed call tree.-T: display uncompressed call tree.If you have a lot of functions (subroutines) in your script,
it may be helpful to sort by number of calls (pprofp -l
<tracefile>) or by memory used (pprofp -m
<tracefile>) to see quickly where the bottlenecks
are.Function Call TreeAlthough the function call report probably is the most
immediately useful one for finding bottlenecks, one of the funkiest
options in the pprofp script is the ability to output a function
call tree. A function call tree essentially is a step-by-step list
of each executed function, indented to show function
nesting.You can output a function call tree with the -t or -T
options, like this:
pprofp -t /var/log/php4-apd/pprof.15507
which output something like this for my Squirrelmail
example:
main
require_once
php_self
require_once (2x)
session_is_registered
require_once
require_once
require_once (3x)
require_once (2x)
require_once
require_once
require_once
require_once
is_array
use_plugin
file_exists
include_once
function_exists
... etc
As you can see, a require_once was performed and inside that
a php_self was executed. Then another require_once executed
session_is_registered, followed by another require_once and so on.
Basically, the function call tree is like a little window into the
Zend Engine, allowing you to watch the sequence of events that take
place when your Web application is run.If you do much object-oriented development, you may find that
you rapidly lose track of what's actually going on inside some
classes and methods. It's tempting to think of classes as a black
box, because that's how we're taught to use them. But, when
optimizing a complex Web application, you need to know what's
actually going on inside each one, or you may have performance
bottlenecks you do not even notice.APD Function ReferenceAPD provides quite a number of functions that you can use to
help profile and debug your code. Experiment with these to see the
gory internal details of what the Zend Engine is doing with your
script, but note that some of them now are deprecated for
PHP4.3+:
- apd_set_pprof_trace():
the most useful APD function as far as profiling is concerned, this
dumps a tracefile named pprof.<pid> in your apd.dumpdir. The
tracefile is a machine-parsable output file that can be processed
with the pprofp <tracefile> command. - apd_set_session_trace(N):
similar to apd_set_pprof_trace(), it dumps a human-readable session
trace named apd_dump_<pid> in your apd.dumpdir. This is the
old way of doing things, noted here because it still works (for
now). It's been deprecated, so it's better to use a pproftrace
instead. N is an integer that sets the items to be traced; for now,
use a value of 99 to turn on all implemented options. - array
apd_callstack(): returns the current call stack at that
stage of execution as an array. - apd_cluck([string
warning[,string line delimiter]): behaves like Perl's
Carp::cluck module. Throws a warning and a callstack. The default
line delimiter is <BR />\n. This function is deprecated for
users of PHP4.3+; use the internal debug_backtrace() and
debug_print_backtrace()instead. - apd_croak([string error[,
string line delimiter]]): behaves like Perl's
Carp::croak module. Throws an error, a callstack and then exits.
The default line delimiter is <BR />\n'. This function is
deprecated for users of PHP4.3+; use the internal debug_backtrace()
and debug_print_back-trace() instead. - array
apd_dump_regular_resources(): returns all current
regular resources as an array. - array
apd_dump_persistent_resources(): returns all persistent
resources as an array. - override_function(string
func_name, string func_args, string func_code): syntax
is similar to create_function(), overrides built-in functions
(replaces them in the symbol table). - rename_function(string
orig_name, string new_name): renames orig_name to
new_name in the global function_table. Useful for temporarily
overriding built-in functions.
Profiling A Live SiteTo get the best understanding of how your PHP implementation
runs under real conditions, you should do the profiling in an
environment as close as possible to the real thing. Real data and
many real or simulated client connections are helpful. You can
create this environment by making a duplicate of your site and
running HTTP benchmarking software or even using a couple of shell
scripts and Wget to simulate user loads. Alternatively, you can do
the profiling right on your live site.As you've seen, using the profiling features of APD in your
scripts causes the Web server to write out text files for every
script access. That's obviously a very bad thing if you're trying
to profile a live site, because data is dumped out faster than you
can deal with it. Besides, it causes a slowdown on the server if
your site has a lot of visitors--and if it didn't, you wouldn't be
profiling it, would you? A useful little trick to get around the
problem is to activate profiling only for page requests coming from
your specific IP address. Doing so allows you to browse the live
site and produce profile information without any action being taken
on requests by other users.This situation actually is pretty trivial to implement.
Simply wrap the invocation of apd_set_pprof_trace() in a check for
the requesting IP address, as shown below. Then, only your
nominated client addresses cause the script to go into profiling
mode:
<?php
$DEBUGIPS = array('203.222.90.204','192.168.0.23');
if(array_search($_SERVER[REMOTE_IP], $DEBUGIPS)) {
apd_set_pprof_trace();
}
?>
Where to Go from Here?What now? Optimize your code, of course. I'm not going to
tell you how to do that here, because it's way beyond the scope of
this short article. But, at least now you have some insight into
the internal workings of your application and won't be wasting time
optimizing parts of your code that aren't really a problem. And
when you do make changes, you have a benchmark against which you
can test your code to see how the changes affect performance and
efficiency.Jonathan Oxer is the founder
and Technical Director of
Internet Vision
Technologies, an Australian Web application development
agency with clients around the world. He also is a Debian
developer, current maintainer of apt-cacher and was organizer of
the Debian Mini-Conf in Perth in January 2003 in association with
Linux Conf Australia, where he also presented one of the technical
papers. His first book, How to Hire a Web Developer and
Stay Sane, will be published soon. His second book,
The Debian Universe is being written live
on-line, and his third and fourth books (How to Deploy
Web Applications and Stay Sane and Disaster
Proofing for Small Networks) already are
underway.
email: jon@ivt.com.au










This week 5 lucky Members will receive a copy of The Official Ubuntu Server Book by Benjamin Mako Hill and Linux Journal's very own Kyle Rankin. No entry necessary. Check back here early next week to find out who the lucky Online Members are.




Comments
XDebug
Great article! APD is a great way to have a local and accurate profiling of your code. That is very useful in large applications since you can concentrate on a specific subsection of the code for optimization without being distracted by other stuff.
However, many times I find that using XDebug+Valgrind is a very good idea, because it can give you a good picture of how your entire application behaves and what are its most expensive (or critical) modules. Also the XDebug method is also the most unobtrusive I think (you only need very minor changes in your code for this to work).
apd not working from within apache server
HI, I am a newbie in APD, require your help to profile our PHP application. I have installed APD 1.0.1 on a Linux box and I am able to profile sample php scripts by firing them from shell as “php samplescript.php
Re: PHP Performance Profiling
Another nice way to analyze the trace file is by doing
$ pprof2calltree -f [pprof tracefile]
and then open the resulting file with KCachgrind
installation of APD not so easy
Installation of APD on my Debian system doesn't turn out to be so easy.
I just upgraded to Apache 2.2.0 and PHP5. I tried to install with the pear command, which gave me the following:
$ pear install apd
PHP version >= 5.0.0RC3-dev is required
apd: Dependencies failed
I also tried with:
$ apt-get install php5-apd
But php5-apd doesn't exist so I installed php4-apd which doesn't work.
Any idea's?
RE: installation of APD not so easy
I had the same problem installing apd on my gentoo machine.
It won't help you, but after the problems I had, I found out the for Gentoo apd is in portage (pecl-apd)
There might be some dependency problems, but once you fix those, apd works.
APD is a pecl package
I found out that to install apd on PHP 5 you need to use
pecl install apdcommand and notpear install apdapd install errors
has anyone solved this problem yet? tried pecl install apd, but no pecl command available. tried to install pecl, but instructions are too ambiguous -- Does apd REALLY require php v5.0 or is this a bug in the installation software?
After using "pecl install
After using "pecl install apd" and putting apd.so where php.ini wanted it there were a few more php.ini directives to add before the module showed up on my phpinfo page. See:
http://us.php.net/apd
Check the link
I managed to install efter looking trough this page: http://xdebug.org/install.php
Post new comment