The Gemcutter's Workshop: Canada on Rails

April 19th, 2006 by Pat Eyler

Recapping another busy couple of weeks in Ruby land as well as the first international Rails conference.

The past two weeks have been busy ones for Ruby releases and community activity. I'd like to start with a couple of big release announcements and a mailing list posting and then move on to two big events.

News from the Community

Eric and Ryan have kept up the pace with new releases of ParseTree and ZenTest, along with a teaser about an upcoming addition to ZenTest.

Zed Shaw has been hard at work on Mongrel, punching out a couple of new releases. He's shooting for a 0.4 release quite soon now.

The Rails team also has been busy, whipping out both 1.1.1 and 1.1.2 releases.

James Gray announced that he's hit the bottom of his Ruby Quiz submission stack and asked for new submissions. A number of responses came in, and he's well stocked now for quite a while. The first quiz that appeared after the call for submissions was quite popular. I'm responsible for the next one. Hopefully, it will draw as much attention.

Finally, it's worth noting that the Ruby for Rails book, by David Alan Black, now is available in PDF and should be hitting the bookstores at the beginning of May. This is an excellent book and may claim the top spot in my personal list of the best Ruby books available.

Coverity

Coverity has developed a suite of static code analysis tools for C and C++. They're currently working under a contract with the Department of Homeland Security to analyze the code bases of a number of important open-source tools. Members of the projects Coverity is working with have had good things to say about the process. And many projects are showing substantial improvement.

Ruby is a recent addition to Coverity's list. Although it's nice to see Ruby accorded that kind of respect, the addition is good in two other ways. First, it allows us to compare the Perl, Python and Ruby code bases. This point isn't really important, but it is interesting. Second, it gives the Ruby core team some targets to watch as new releases approach.

Perl and Python have been on the list longer than Ruby has, and both are showing improvement. Their original measurements are shown below:


Lang     LoC       orig defects   defect rate (per 1,000 LoC)
Perl     485,001   89             0.185
Python   273,980   96             0.350

The next table shows the current measurements for Perl and Python, with Ruby's first (and current) measurements added.


Lang     LoC       cur defects    defect rate (per 1,000 LoC)
Perl     485,001   67             0.138
Python   273,980   14             0.051
Ruby     258,908   30             0.116

It's pretty cool to see that the Perl and Python communities have done a good job of correcting the errors that Coverity found in their code bases. It's also interesting to see that Ruby's first measurement, 0.116, is lower than the original defect rate for either Perl or Python. Ruby doesn't look too bad against their current defect rates either: it is ahead of Perl at 0.138, though well behind Python at 0.051. In fact, it compares well with a lot of other projects out there, such as Emacs, 0.133; gcc, 0.253; FreeBSD, 0.396; or Linux 2.6, 0.220.
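The defect rates in these tables appear to be defects per 1,000 lines of code. As a quick sanity check, here is a small sketch that recomputes the current rates from the LoC and defect counts above. The method name defect_rate and the hash layout are my own, chosen only for illustration.

```ruby
# Defect rate expressed as defects per 1,000 lines of code,
# rounded to three places like the tables above.
def defect_rate(defects, loc)
  (defects * 1000.0 / loc).round(3)
end

# [lines of code, current defects], copied from the second table.
current = {
  "Perl"   => [485_001, 67],
  "Python" => [273_980, 14],
  "Ruby"   => [258_908, 30],
}

current.each do |lang, (loc, defects)|
  puts format("%-8s %.3f", lang, defect_rate(defects, loc))
end
```

Running it prints one line per language with the recomputed rate, which makes it easy to rerun as Coverity updates its counts.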

Hopefully, we'll see a decrease in our defect rate over time, like most of the other projects on Coverity's report. To this end, we have a great example to follow--AMANDA. AMANDA started out with a defect rate of approximately 1.0. It currently looks like this:


Project   LoC      cur defects   defect rate (per 1,000 LoC)
AMANDA    88,414   0             0.000

The difference is so great that a company involved in AMANDA development wrote an article about it that said, among other things:

What happened next is truly remarkable. The Amanda development community ... quickly responded to address this situation. Within one week, Amanda developers fixed the entire list of identified bugs. As it currently stands, there are 0 outstanding bugs detected by the Coverity scan.

Canada on Rails

Canada on Rails has been a big event in the Ruby community. Billed as the first international event focused on Rails, Canada on Rails has drawn a lot of attention and a lot of people. I've tried to gather up some of the coverage here.

Some notable non-Ruby names attended Canada on Rails, including Tim Bray, who wrote:

I was far from the only Rails interested-but-inexperienced poseur, there were a lot of people there to find out what it's all about. I talked to a mostly-PHP developer from Calgary and tried to convince her that Rails ought to be able to do most of what she does, only cleaner and better. On the other hand, I spent one session sitting next to a guy who has a Rails shop in New York, and was hip to the very latest YARV gossip. Mostly young, unsurprising; mostly male, sigh.

Ryan Davis kept collective notes using SubEthaEdit. Day 1 notes can be read here. My favorite comment was: "Eclipse: . . . Gateway drug for Java users."

Amy Hoy teased us with an initial post. Hopefully, more is coming soon. Alex Combas also provided excellent coverage on his blog.

Several of the speakers have posts up as well:

  • Robby Russell talked about his new acts_as_legacy project. He also blogged about Day 1 of the conference here and here.

  • Jason Voorhis talked about internationalization and posted his slides in PDF.

  • David Astels blogged about being interviewed and casually mentioned that a DVD of the conference will be available--I wonder if it will be available to non-attendees. He also discussed his talk on Behavior Driven Design.

  • Thomas Fuchs let us know that he was en route to the conference. Hopefully, he'll have a retrospective post up soon.

Optimizing Ruby Code

One of the rules I find myself more and more concerned with following is "Make it right, then make it fast". The more I work with dynamic languages such as Ruby, the easier it becomes to follow this rule and the bigger the payoff becomes for doing so. With that in mind, I'd like to discuss some fundamentals for optimizing Ruby code.

Any time you optimize, you need to follow some simple steps:

  • Get the code working. You don't want to optimize broken code.

  • Profile your code. Know where the bottlenecks are so you can optimize the right parts.

  • Benchmark your code and the alternatives. Don't replace something unless it's worth it.

  • If you need to, go to another language for speed. This is your last resort.

I'm not going to spend any time here talking about the first step. Hopefully, you've already got a handle on it. If not, refer to my last two articles, found here and here. They both talk about Test First programming and related topics.

Moving on to the second step, the Ruby profiler is easy to use, but it runs much more slowly than Ruby itself. To profile a program, simply do:


$ ruby -rprofile yourprog

This command produces a report that looks something like this trimmed version:


  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 15.00     0.15      0.15       45     3.33    65.33  Kernel.require
 14.00     0.29      0.14      532     0.26     0.45  Gem::Specification#copy_
  6.00     0.35      0.06      438     0.14     0.16  Kernel.dup
  6.00     0.41      0.06       74     0.81    15.27  Array#each
  4.00     0.45      0.04      226     0.18     0.27  String#gsub!
  3.00     0.48      0.03       82     0.37     0.37  String#gsub

The meaning of each column is as follows:

  • % time: the percentage of total time spent in this method.

  • cumulative seconds: the total number of running seconds in this and all previous methods.

  • self seconds: the number of seconds spent in this method.

  • calls: the number of times this method was called.

  • self ms/call: the time, in milliseconds, spent in this method alone per call.

  • total ms/call: the time, in milliseconds, spent per call in this method and in the methods it calls.

  • name: the name of the method.
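To see how the columns relate, note that self ms/call is simply self seconds divided by calls, converted to milliseconds. The sketch below recomputes that column for a few rows of the trimmed report above. The helper name self_ms_per_call is hypothetical and is not part of the profiler.

```ruby
# [self seconds, calls] for a few rows of the trimmed profile report.
rows = {
  "Kernel.require" => [0.15, 45],
  "Kernel.dup"     => [0.06, 438],
  "Array#each"     => [0.06, 74],
}

# self ms/call = self seconds * 1000 / calls, rounded as in the report.
def self_ms_per_call(self_seconds, calls)
  (self_seconds * 1000.0 / calls).round(2)
end

rows.each do |name, (secs, calls)|
  puts format("%-16s %5.2f ms/call", name, self_ms_per_call(secs, calls))
end
```

A method with a small self ms/call but a large total ms/call, such as Kernel.require here, is spending most of its time in the methods it calls.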

As you profile code, you will see a lot of methods that you can't do much about, such as Kernel.dup. You'll also see some that are more fruitful for you to pursue.

Benchmarking different options is at the heart of optimizing. Fortunately, it's easy to do and the output is easy to read. Here's a quick example that benchmarks different kinds of iterators and looping in Ruby:


require 'benchmark'

n = 10_000_000

Benchmark.bm(15) do |x|
  x.report("for loop:") { for i in 1..n; a = "1"; end }
  x.report("times:") { n.times do ; a = "1"; end }
  x.report("upto:") { 1.upto(n) do ; a = "1"; end }
end

Running this code generates a report like this one:


                     user     system      total        real
for loop:        3.060000   0.000000   3.060000 (  3.137070)
times:           3.290000   0.000000   3.290000 (  3.308736)
upto:            3.370000   0.000000   3.370000 (  3.372559)

This report shows that if speed matters, you probably want to use a for loop, although it won't make a huge difference: here it finishes about 7% sooner than times and about 9% sooner than upto. Choosing the right algorithm usually is where you get your biggest win, so spend your time on profiling and benchmarking.
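To illustrate the point about algorithms, here is a minimal sketch that benchmarks two ways of answering the same question: testing membership with a linear scan of an Array versus a hashed lookup in a Set. The workload (ids, lookups) and the helper count_hits are made up for this example, so treat the numbers you get as illustrative only.

```ruby
require 'benchmark'
require 'set'

# Hypothetical workload: 2,000 membership checks against 5,000 ids.
ids     = (1..5_000).to_a
id_set  = ids.to_set
lookups = (1..10_000).step(5).to_a   # half of these fall inside 1..5_000

# Count how many lookups are present in the collection.
def count_hits(collection, lookups)
  lookups.count { |v| collection.include?(v) }
end

array_hits = set_hits = nil
array_time = Benchmark.realtime { array_hits = count_hits(ids, lookups) }
set_time   = Benchmark.realtime { set_hits   = count_hits(id_set, lookups) }

puts format("Array#include?: %.4fs (%d hits)", array_time, array_hits)
puts format("Set#include?:   %.4fs (%d hits)", set_time, set_hits)
```

Both versions return the same count, which is the "make it right" half of the rule. The timing difference between them typically dwarfs the few percent separating the loop styles above.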

If you absolutely have to go to another language, Ruby has a very clean interface for writing and using C extensions. Even that probably is more work than you need, though, when you could use RubyInline instead. RubyInline allows you to write C code within your Ruby program. This code is compiled and linked to your program, potentially representing a huge speed increase. Ryan's documentation shows a 4x speedup between:


  def factorial(n)
    f = 1
    n.downto(2) { |x| f *= x }
    f
  end

and


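  require 'inline'

  # inline must be called inside a class or module; MyTest is only an
  # illustrative name.
  class MyTest
    inline do |builder|
      builder.c "
      long factorial_c(int max) {
        int i=max, result=1;
        while (i >= 2) { result *= i--; }
        return result;
      }"
    end
  end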

If you've gotten all you can out of choosing your algorithms well, this might be your last, best hope.

__________________________

--
-pate
http://on-ruby.blogspot.com



Performance Unit Tests

On April 21st, 2006 Sean Carley (not verified) says:

You could make performance into a unit test using bm. Then as you optimize, you could tell when you had done enough. Also, if you added something later that "broke" performance, you would know immediately.


Too subjective and too prone to error.

On April 22nd, 2006 zenspider (not verified) says:

Too subjective and too prone to error.

What I do is I actually run my profile runs against my unit tests. Assuming I have good coverage then the results aren't a total farce and any optimizations I do directly affect my feedback loop.



Good Idea

On April 21st, 2006 pate (not verified) says:

You'd want to keep performance tests separate from your regular unit tests though (or build a tricky way to track the base performance on a given system). There are so many things that can affect the numbers outside the script. Hardware, OS, potentially even the version of Ruby (what if I run it on a YARV enabled build?).

That sounds like a pretty interesting tool to build though.


Optimising helped by, (unit), tests

On April 19th, 2006 paddy3118 (not verified) says:

Good tests developed with the working program pay dividends when optimising as they can quickly show when an optimisation change makes the program fail.
The other four bulleted points were great (and apply to other languages too).
- Pad.


Good Point

On April 19th, 2006 pate (not verified) says:

This is true. Any time you want to make changes to a program without changing its functional behavior, Unit Tests are the way to go.

I've talked about unit testing before (both in this column and elsewhere). I'll be talking about it again in the context of refactoring soon.

-pate


Coverity false positives

On April 19th, 2006 Jack Diederich (not verified) says:

Coverity errs on the side of caution. If it isn't sure that a pointer can't be NULL before use it will mark it as problematic. A developer who _is_ sure it can't be NULL will then mark the report as NOT_A_BUG. False positives are one reason why the perl & python counts dropped dramatically after the initial results. Some real bugs were fixed too, of course.
