The Gemcutter's Workshop: Canada on Rails
April 19th, 2006 by Pat Eyler
The past two weeks have been another busy bi-week in terms of Ruby releases and community activity. I'd like to start out with a couple of big release announcements and a mailing list posting and then move on to two big events.
Eric and Ryan have kept up the pace with new releases of ParseTree and ZenTest, along with a teaser about an upcoming addition to ZenTest.
Zed Shaw has been hard at work on Mongrel, punching out a couple of new releases. He's shooting for a 0.4 release quite soon now.
The Rails team also has been busy, whipping out both 1.1.1 and 1.1.2 releases.
James Gray announced that he's hit the bottom of his Ruby Quiz submission stack and asked for new submissions. A number of responses came in, and he's well stocked now for quite a while. The first quiz that appeared after the call for submissions was quite popular. I'm responsible for the next one. Hopefully, it will draw as much attention.
Finally, it's worth noting that the excellent Ruby for Rails book, by David Alan Black, now is available in PDF and should be hitting the bookstores at the beginning of May. This is an excellent book and may claim the top spot in my personal list of the best Ruby books available.
Coverity has developed a suite of static code analysis tools for C and C++. They're currently working under a contract with the Department of Homeland Security to analyze the code bases of a number of important open-source tools. Members of the projects Coverity is working with have had good things to say about the process. And many projects are showing substantial improvement.
Ruby is a recent addition to Coverity's list. Although it's nice to see Ruby accorded that kind of respect, the addition is good in two other ways. First, it allows us to compare the Perl, Python and Ruby code bases. This point isn't really important, but it is interesting. Second, it gives the Ruby core team some targets to watch as new releases approach.
Perl and Python have been on the list longer than Ruby has, and both are showing improvement. Their original measurements are shown below:
Lang     LoC       orig defects   defect rate
Perl     485,001   89             0.185
Python   273,980   96             0.350
The next table shows the current measurements for Perl and Python, with Ruby's first (and current) measurements added.
Lang     LoC       cur defects   defect rate
Perl     485,001   67            0.138
Python   273,980   14            0.051
Ruby     258,908   30            0.116
It's pretty cool to see that the Perl and Python communities have done a good job of correcting the errors that Coverity found in the code bases. It's also interesting to see that Ruby compares well with the original Perl and Python defect rates. And, Ruby doesn't look too bad against their current defect rates either. In fact, it compares well with a lot of other projects out there, such as emacs, 0.133; gcc, 0.253; FreeBSD, 0.396; or Linux 2.6, 0.220.
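If you want to check the numbers yourself, here is a minimal sketch of the arithmetic. It assumes that "defect rate" means defects per 1,000 lines of code, which the tables don't state outright but which reproduces the current figures. The defect_rate method is my own name for illustration, not anything from Coverity.

```ruby
# Assumption: "defect rate" is defects per 1,000 lines of code.
# Figures are copied from the table of current measurements.
current_scan = [
  ["Perl",   485_001, 67],
  ["Python", 273_980, 14],
  ["Ruby",   258_908, 30]
]

# Hypothetical helper: defects per 1,000 lines of code.
def defect_rate(defects, loc)
  defects * 1000.0 / loc
end

current_scan.each do |lang, loc, defects|
  puts format("%-7s %.3f", lang, defect_rate(defects, loc))
end
```

Run against the current table, this prints 0.138, 0.051 and 0.116, matching the rates listed for Perl, Python and Ruby.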
Hopefully, we'll see a decrease in our defect rate over time, like most of the other projects on Coverity's report. To this end, we have a great example to follow--AMANDA. AMANDA started out with a defect rate of approximately 1.0. It currently looks like this:
Project   LoC      cur defects   defect rate
AMANDA    88,414   0             0.000
The difference is so great that a company involved in AMANDA development wrote an article about it that said, among other things:
What happened next is truly remarkable. The Amanda development community ... quickly responded to address this situation. Within one week, Amanda developers fixed the entire list of identified bugs. As it currently stands, there are 0 outstanding bugs detected by the Coverity scan.
Canada on Rails has been a big event in the Ruby community. Billed as the first international event focused on Rails, Canada on Rails has drawn a lot of attention and a lot of people. I've tried to gather up some of the coverage here.
Some notable non-Ruby names attended Canada on Rails, including Tim Bray, who wrote:
I was far from the only Rails interested-but-inexperienced poseur, there were a lot of people there to find out what it's all about. I talked to a mostly-PHP developer from Calgary and tried to convince her that Rails ought to be able to do most of what she does, only cleaner and better. On the other hand, I spent one session sitting next to a guy who has a Rails shop in New York, and was hip to the very latest YARV gossip. Mostly young, unsurprising; mostly male, sigh.
Ryan Davis kept collective notes using SubEthaEdit. Day 1 notes can be read here. My favorite comment was: "Eclipse: . . . Gateway drug for Java users."
Amy Hoy teased us with an initial post. Hopefully, more is coming soon. Alex Combas also provided excellent coverage on his blog.
Several of the speakers have posts up as well:
Robby Russell talked about his new acts_as_legacy project. He also blogged about Day 1 of the conference here and here.
Jason Voorhis talked about internationalization and posted his slides in PDF.
David Astels blogged about being interviewed and casually mentioned that a DVD of the conference will be available--I wonder if it will be available to non-attendees. He also discussed his talk on Behavior Driven Design.
Thomas Fuchs let us know that he was en route to the conference. Hopefully, he'll have a retrospective post up soon.
One of the rules I find myself being more and more concerned with following is "Make it right, then make it fast". The more I work with dynamic languages such as Ruby, the easier it becomes to follow this rule and the bigger the payoff becomes for doing so. With that in mind, I'd like to discuss some fundamentals for optimizing Ruby code.
Any time you optimize, you need to follow some simple steps:
Get the code working. You don't want to optimize broken code.
Profile your code. Know where the bottlenecks are so you can optimize the right parts.
Benchmark your code and the alternatives. Don't replace something unless it's worth it.
If you need to, go to another language for speed. This is your last resort.
I'm not going to spend any time here talking about the first step. Hopefully, you've already got a handle on it. If not, refer to my last two articles, found here and here. They both talk about Test First programming and related topics.
Moving on to the second step, the Ruby profiler is easy to use, but it runs much more slowly than Ruby itself. To profile a program, simply do:
$ ruby -rprofile yourprog
This command produces a report that looks something like this trimmed version:
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 15.00     0.15      0.15       45     3.33    65.33  Kernel.require
 14.00     0.29      0.14      532     0.26     0.45  Gem::Specification#copy_
  6.00     0.35      0.06      438     0.14     0.16  Kernel.dup
  6.00     0.41      0.06       74     0.81    15.27  Array#each
  4.00     0.45      0.04      226     0.18     0.27  String#gsub!
  3.00     0.48      0.03       82     0.37     0.37  String#gsub
The meaning of each column is as follows:
% time: the percentage of total time spent in this method.
cumulative seconds: the running total of self seconds for this method and all the methods listed above it.
self seconds: the number of seconds spent in this method itself.
calls: the number of times this method was called.
self ms/call: the average time, in milliseconds, spent in this method itself per call.
total ms/call: the average time, in milliseconds, spent per call in this method and in the methods it calls.
name: the name of the method.
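The columns are related to one another, and it can help to see that worked out. Here is a small sketch that recomputes self ms/call from self seconds and calls for a few rows of the trimmed report. The self_ms_per_call method is a name I made up for illustration; it is not part of the profiler.

```ruby
# Rows copied from the trimmed report: [name, self seconds, calls].
rows = [
  ["Kernel.require", 0.15,  45],
  ["Kernel.dup",     0.06, 438],
  ["Array#each",     0.06,  74],
  ["String#gsub",    0.03,  82]
]

# Hypothetical helper: self seconds divided by calls, in milliseconds.
def self_ms_per_call(self_seconds, calls)
  self_seconds * 1000.0 / calls
end

rows.each do |name, self_seconds, calls|
  puts format("%-16s %.2f", name, self_ms_per_call(self_seconds, calls))
end
```

The results (3.33, 0.14, 0.81 and 0.37) agree with the self ms/call column. Notice also that Array#each has a small self ms/call but a much larger total ms/call, which tells you the time is going into the block it calls rather than into each itself.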
As you profile code, you will see a lot of methods that you can't do much about, such as Kernel.dup. You'll also see some that are more fruitful for you to pursue.
Benchmarking different options is at the heart of optimizing. Fortunately, it's easy to do and the output is easy to read. Here's a quick example that benchmarks different kinds of iterators and looping in Ruby:
require 'benchmark'
n = 10_000_000
Benchmark.bm(15) do |x|
  x.report("for loop:") { for i in 1..n; a = "1"; end }
  x.report("times:") { n.times do ; a = "1"; end }
  x.report("upto:") { 1.upto(n) do ; a = "1"; end }
end
Running this code generates a report like this one:
                      user     system      total        real
for loop:         3.060000   0.000000   3.060000 (  3.137070)
times:            3.290000   0.000000   3.290000 (  3.308736)
upto:             3.370000   0.000000   3.370000 (  3.372559)
This report shows that if speed matters, you probably want to use a for loop, although it won't make a huge difference. Choosing the right algorithm for the right method usually is where you get your biggest win, so spend your time on profiling and benchmarking.
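To illustrate the point about algorithms, here is a rough sketch that benchmarks two ways of answering the same question: is a value in a collection? The data is made up for the example, and count_hits is a hypothetical helper of mine. One version scans an Array, the other looks the value up in a Set from the standard library.

```ruby
require 'benchmark'
require 'set'

# Hypothetical helper: how many of the probes appear in the collection?
def count_hits(collection, probes)
  probes.count { |p| collection.include?(p) }
end

# Made-up data: 5,000 integers to search, 1,000 values to look for.
items  = (1..5_000).to_a
as_set = Set.new(items)
probes = (1..1_000).map { |i| i * 7 }

Benchmark.bm(15) do |x|
  # Array#include? walks the array until it finds a match.
  x.report("Array#include?:") { count_hits(items, probes) }
  # Set#include? is a hash lookup.
  x.report("Set#include?:")   { count_hits(as_set, probes) }
end
```

Both versions return the same count, so the only thing that changes is the time. I'd expect the Set version to come out well ahead, and by a much wider margin than any of the loop styles above, but run it on your own machine rather than taking my word for it.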
If you absolutely have to go to another language, Ruby has a very clean interface for writing and using C extensions. But even that is probably too much work when you could use RubyInline instead. RubyInline allows you to write C code within your Ruby program. This code is compiled and linked to your program, potentially representing a huge speed increase. Ryan's documentation shows a 4x speed up between:
def factorial(n)
  f = 1
  n.downto(2) { |x| f *= x }
  f
end
and
# Goes inside a class body, after require 'inline'
inline do |builder|
  builder.c "
    long factorial_c(int max) {
      int i=max, result=1;
      while (i >= 2) { result *= i--; }
      return result;
    }"
end
If you've gotten all you can out of choosing your algorithms well, this might be your last, best hope.
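One thing to check before you swap in a C version is that it still gives the right answers. The Ruby factorial quietly moves to arbitrarily large integers as the result grows, but the C version accumulates into an int. Here is a small sketch that finds how far the C version can go, assuming a 32-bit signed int (typical, but an assumption about your compiler and platform).

```ruby
# The pure-Ruby factorial from the comparison above.
def factorial(n)
  f = 1
  n.downto(2) { |x| f *= x }
  f
end

# Assumption: the C compiler's int is a 32-bit signed integer.
int_max = 2**31 - 1

largest_safe = (1..20).select { |n| factorial(n) <= int_max }.max
puts "largest n whose factorial fits in a 32-bit int: #{largest_safe}"
```

Under that assumption, factorial_c is only trustworthy up to 12, because 13! is already larger than a 32-bit int can hold. This is another reason to keep your tests around while you optimize.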
--
-pate
http://on-ruby.blogspot.com
Performance Unit Tests
On April 21st, 2006 Sean Carley (not verified) says:
You could make performance into a unit test using bm. Then as you optimize, you could tell when you had done enough. Also, if you added something later that "broke" performance, you would know immediately.
Too subjective and too prone to error.
On April 22nd, 2006 zenspider (not verified) says:
Too subjective and too prone to error.
What I do is I actually run my profile runs against my unit tests. Assuming I have good coverage then the results aren't a total farce and any optimizations I do directly affect my feedback loop.
Good Idea
On April 21st, 2006 pate (not verified) says:
You'd want to keep performance tests separate from your regular unit tests though (or build a tricky way to track the base performance on a given system). There are so many things that can affect the numbers outside the script. Hardware, OS, potentially even the version of Ruby (what if I run it on a YARV enabled build?).
That sounds like a pretty interesting tool to build though.
Optimising helped by, (unit), tests
On April 19th, 2006 paddy3118 (not verified) says:
Good tests developed with the working program pay dividends when optimising as they can quickly show when an optimisation change makes the program fail.
The other four bulleted points were great (and apply to other languages too).
- Pad.
Good Point
On April 19th, 2006 pate (not verified) says:
This is true. Any time you want to make changes to a program without changing its functional behavior, Unit Tests are the way to go.
I've talked about unit testing before (both in this column and elsewhere). I'll be talking about it again in the context of refactoring soon.
-pate
Coverity false positives
On April 19th, 2006 Jack Diederich (not verified) says:
Coverity errs on the side of caution. If it isn't sure that a pointer can't be NULL before use it will mark it as problematic. A developer who _is_ sure it can't be NULL will then mark the report as NOT_A_BUG. False positives are one reason why the perl & python counts dropped dramatically after the initial results. Some real bugs were fixed too, of course.