At the Forge - Profiling Rails Applications

 in
Wondering if your Rails application is running at peak efficiency? Before optimizing, profile your application to see which parts are slow.

I am writing this article in mid-May 2008, several weeks after Twitter was rumored to be moving to a platform other than Ruby on Rails. Twitter, of course, is an extremely popular service that allows users to write updates and notes about their current status—and allows readers to follow any number of people's Twitter feeds. You can think of Twitter as a combination blogging and RSS platform, populated by people who express themselves with only 140 characters at a time.

Like many other runaway Internet successes, Twitter appears to have become too popular for its own good. This has led to some outages, most notably one at the beginning of 2008 that took more than a day to restore. Thus, it was seen as more than a mere coincidence when Twitter's main architect left the company, and that within a few days, the TechCrunch blog was quoting anonymous officials within Twitter about how the service would be transitioning away from Ruby on Rails.

This was followed by a great deal of discussion over whether Rails is a “scalable” architecture. Scalable used to mean that it was possible to scale up applications using a Web site, almost regardless of how many people are using it. But today, a scalable architecture is one that's lean and mean, handling as many users as possible with as few servers as possible. PHP, Java and .NET are pretty universally considered to be scalable in this sense. Although even the most efficient PHP application can handle only a finite number of simultaneous users, it's undeniable that Ruby is a slower language than PHP, and that the Rails framework adds some more overhead.

Of course, it's one thing to say that Rails doesn't scale as quickly as PHP, and another to say that it doesn't scale at all. And, there are other arguments to be made, including the fact that programmers cost more than servers, and that programmer productivity should be at least as significant a factor as scalability.

That said, it's easy for a Rails application to become slow. So, it is nice to know that a variety of utilities can be used to profile Rails applications—meaning, finding out exactly which portion of the program is taking a long time to execute. This month, we look at some techniques for profiling Rails applications. Although such profiling doesn't make the software run any faster, it can help identify the slowest parts of an application.

Profiling Pages

If you aren't happy with the performance of your Web site—and quite frankly, you always should be concerned about the performance, trying to give it a boost wherever possible—the first question to ask is, “Where are people spending their time?” After all, if there are 100 different pages on your site, it doesn't really matter whether page 35 is really slow if no one ever visits it.

The first tool to examine, the production log analyzer, is designed to look at the Rails production log and produce some basic statistics about it. The production log, as well as the development and test logs, are typically stored in the log directory under the Rails project root. Thus, on a production server, the log is in log/production.log. This logfile is not rotated or modified automatically; there are clearly a number of ways to do that using cron and other UNIX command-line tools.

The thing is, there already is a facility on UNIX (and Linux) systems for handling logfiles, including their periodic rotation and disposal. This facility is known as syslog, which makes it possible to send logging information to a variety of different files based on priorities and source materials. The /var/log directory on my Ubuntu server is full of different logfiles, and nearly all of them were created and written to by syslog.

It turns out that we can use syslog for our Rails production logs. Once we have done that—and yes, we must use syslog for this to work—we then can analyze our production logs, learning exactly how much time people have spent in our Rails application.

To move your Rails production log to syslog, you need to do several things. First, you must install the Ruby gem that provides this behavior:

gem install --remote SyslogLogger

This installs the gem in the appropriate place on your system; on mine, it was put into /usr/lib/ruby/gems/1.8/gem. Next, you need to add the following to one or more of your environment configuration files (either environment.rb or one or more files in environments/*.rb) for your Rails system:

require 'syslog_logger'
RAILS_DEFAULT_LOGGER = SyslogLogger.new

This, of course, loads the syslog_logger gem and sets the default logger to a new instance of SyslogLogger.

Now that you have told Rails to use syslog, you must tell syslog what to do with the files that come from Rails. I opened /etc/syslog.conf and added the following lines to the bottom:

*.info             /var/log/production.log

And yes, the documentation system says that you can use a !rails tag before this line, or one like it, to restrict logging to messages coming from Rails. Unfortunately, this syntax does not appear to be supported by Linux. So, this means production.log will include messages from other programs and facilities, not just Rails. That shouldn't concern us right now, although it might be an issue on a busy machine with many services in active use.

Once you have modified syslog.conf in this way, you can restart syslog.conf. Almost immediately, your production log should be stored to /var/log/production.log. You can check this, of course, with the following:

tail -f /var/log/production.log

Now, this logfile is similar in many ways to the logfile you just eliminated from the log directory in your application root. However, it is formatted in such a way that the production log analyzer will be able to find and perform calculations based on its output. To analyze the logfile, type:

pl_analyze /var/log/production_log

If you prefer to have the results sent to you via e-mail, rather than stored to a disk file, use the -e option:

pl_analyze /var/log/production_log -e reuven@lerner.co.il

This option is particularly useful when you invoke pl_analyze from a cron job, for example.

The output file from pl_analyze is divided into three parts:

  • Time spent in each request.

  • Time spent in the database for each request.

  • Time spent rendering the output from each request.

For each controller action, pl_request lists how many times it was invoked, as well as the average time it took to execute. It also gives the min, max and standard deviation, allowing you to see how much the execution time varied over time.

Thus, the production log analyzer shows which actions take the greatest amount of time overall, which take the greatest amount of time in the database (or to render) and how many times each was invoked.

I have found pl_analyzer to be an indispensable tool when trying to determine whether a site is fast enough and where I should focus my attention to improve its speed.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Another tool to analyze your production logs

Willem van Bergen's picture

You may also want to have a look at request-log-analyzer, a tool I wrote. This command line application parses Rails production log files to produce a performance report. It also works for Merb log files, Rails development log files and can be easily extended to other log file formats. See the project's wiki on Github for more info.

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix