Sidekiq

From my perspective, one of the best parts of being a Web developer is the instant gratification. You write some code, and within minutes, it can be used by people around the world, all accessing your server via a Web browser. The rapidity with which you can go from an idea to development to deployment to actual users benefiting from (and reacting to) your work is, in my experience, highly motivating.

Users also enjoy the speed with which new developments are deployed. In the world of Web applications, users no longer need to consider, download or install the "latest version" of a program; when they load a page into their browser, they automatically get the latest version. Indeed, users have come to expect that new features will be rolled out on a regular basis. A Web application that fails to change and improve over time quickly will lose ground in users' eyes.

Another factor that users consider is the speed with which a Web application responds to their clicks. We are increasingly spoiled by the likes of Amazon and Google, which not only have many thousands of servers at their disposal, but which also tune their applications and servers for the maximum possible response time. We measure the speed of our Web applications in milliseconds, not in seconds, and in just the past few years, we have reached the point when taking even one second to respond to a user is increasingly unacceptable.

The drive to provide ever-greater speed to users has led to a number of techniques that reduce the delays they encounter. One of the simplest is that of a delayed job. Instead of trying to do everything within the span of a single user request, put some of it aside until later.

For example, let's say you are working on a Web application that implements an address book and calendar. If a user asks to see all of his or her appointments for the coming week, you almost certainly could display them right away. But if a user asks to see all appointments during the coming year, it might take some time to retrieve that from the database, format it into HTML and then send it to the user's browser.

One solution is to break the problem into two or more parts. Rather than having the Web application render the entire response together, including the list of appointments during the coming year, you can return an HTML page without any appointment. However, that page can include a snippet of JavaScript that, after the page is loaded, sends a request to the server asking for the list. In this way, you can render the outline of the page, filling it with data as it comes in.

Sometimes, you can't divide jobs in this way. For example, let's say that when you add a new appointment to your calendar, you would like the system to send e-mail to each of the participants, indicating that they should add the meeting to their calendars. Sending e-mail doesn't take a long time, but it does require some effort on the part of the server. If you have to send e-mail to a large number of users, the wait might be intolerably long—or just annoyingly long, depending on your users and the task at hand.

Thus, for several years, developers have taken advantage of various "delayed jobs" mechanisms, making it possible to say, "Yes, I want to execute this functionality, but later on, in a separate thread or process from the handling of an HTTP request." Delaying the job in like this may well mean that it'll take longer for the work to be completed. But, no one will mind if the e-mail takes an additional 30 seconds to be sent. Users certainly will mind, by contrast, if it takes an additional 30 seconds to send an HTTP response to the users' browser. And in the world of the Web, users probably will not complain, but rather move on to another site.

This month, I explore the use of delayed jobs, taking a particular look at Sidekiq, a Ruby gem (and accompanying server) written by Mike Perham that provides this functionality using a different approach from some of its predecessors. If you're like me, you'll find that using background jobs is so natural and easy, it quickly becomes part of your day-to-day toolbox for creating Web applications—whether you're sending many e-mail messages, converting files from one format to another or producing large reports that may take time to process, background jobs are a great addition to your arsenal.

Background Queues

Before looking at Sidekiq in particular, let's consider what is necessary for background jobs to work, at least in an object-oriented language like Ruby. The basic idea is that you create a class with a single method, called "perform", that does what you want to execute. For example, you could do something like this:


class MailSender
    def perform(user)
        send_mail_to_user(user)
    end
end

Assuming that the send_mail_to_user method has been defined in your system, you can send e-mail with something like:


MailSender.new.perform(user)

But here's the thing: you won't ever actually execute that code. Indeed, you won't ever create an instance of MailSender directly. Rather, you'll invoke a class method, something like this:


MailSender.perform_async(user)

Notice the difference. Here, the class method takes the parameter that you eventually want to be passed to the "perform" method. But the "perform_async" class method instead stores the request on a queue. At some point in the future, a separate process (or thread) will review method calls that have been stored in the queue, executing them one by one, separately and without any connection to the HTTP requests.

Now, the first place you might consider queuing method classes that you'll want to execute would be in the database. Most modern Web applications use a database of some sort, and that would be a natural first thought. And indeed, in the Ruby world, there have been such gems as "delayed job" and "background job" that do indeed use the database as a queue.

The big problem with this technique, however, is that the queue doesn't need all of the functionality that a database can provide. You can get away with something smaller and lighter, without all the transactional and data-safety features. A second reason not to use the database is to split the load. If your Web application is working hard, you'll probably want to let the database be owned and used by the Web application, without distracting it with your queue.

So, it has become popular to use non-relational databases, aka NoSQL solutions, as queues for background jobs. One particularly popular choice is Redis, the super-fast, packed-with-functionality NoSQL store that works something like a souped-up memcached. The first job queue to use Redis in the Ruby world was Resque, which continues to be popular and effective.

But as applications have grown in size and scope, so too have the requirements for performance. Resque is certainly good enough for most purposes, but Sidekiq attempts to go one better. It also uses Redis as a back-end store, and it even uses the same storage format as Resque, so that you either can share a Redis instance between Resque and Sidekiq or transition from one to the other easily. But, the big difference is that Sidekiq uses threads, whereas Resque uses processes.

______________________

Reuven M. Lerner, Linux Journal Senior Columnist, a longtime Web developer, consultant and trainer, is completing his PhD in learning sciences at Northwestern University.

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Woah that blog is usually

Anonymous's picture

Woah that blog is usually excellent i really like reading ones articles. Stay in the good paintings! You already know, a lot of individuals are searching around due to this info, you can help these individuals enormously. http://shortnatural-hairstyles.blogspot.ro

Page loading

Taylor's picture

Nowadays, the chance of users leaving a web page are very high if the page is loading slowly. That's why reducing the loading times in a web application is very important. Sidekiq is a must have for dividing jobs due to its high performance.

Good Perspective

Kappy's picture

Ah, there is a great perspective followed by an amazing piece of code.

Thanks for the

Lauri Nevala's picture

Thanks for the introduction!

As a side note. I would rather pass the id of the user to the worker instead of the user object. Here are more details about this issue:
https://github.com/mperham/sidekiq/wiki/Best-Practices#1-make-your-jobs-...

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix