At the Forge - Memcached
One of the watchwords for modern Web developers is scalability. Whether we're following the latest news about Twitter's servers or writing our own applications, developers always are thinking about whether their system will be scalable.
This issue has been particularly prominent during the spring and summer of 2008, as Ruby on Rails (my preferred platform for Web development) has been criticized for its use of RAM and its relatively slow execution speed. The massive server problems that Twitter experienced during the first half of 2008 were widely described as stemming from Twitter's use of Rails (despite denials from Twitter's technical team) and led to speculation that Rails cannot be used for a scalable application. One of the hosts of the weekly RailsEnvy podcast makes a point of sarcastically saying that “Rails doesn't scale” in each episode, because it was said so frequently.
There's no doubt that Rails is more resource-intensive than many other application development frameworks. This is partly due to the need for improvements in the Ruby language itself—improvements that look like they'll be available within the coming year. And, it also is true that the Rails framework uses more CPU and memory than some of its counterparts, such as Django, because of the nature of the features that it offers.
But, there's a difference, I believe, between calling Rails resource-intensive and calling it inherently unscalable. Scalability has more to do with the architecture and design of an application, allowing it to grow naturally from a single box containing both the Web and database servers to a network of servers. A Web application written in C might execute very quickly and, thus, handle a larger load on a single server, but that doesn't mean the application is inherently more scalable. At a certain point, even an efficient C program will reach its capacity, and if it isn't designed with this in mind, the more efficient application will be the less scalable one.
So, I tend to think about scalability as an architectural problem, one that ignores the specific programming language in which an application is implemented, and which is different from the issue of execution speed and efficiency. You can have highly scalable programs written in an inefficient framework, but it does take a bit more discipline and requires that programmers think carefully about the way they are writing the code. Even if you're starting on a single computer, designing the software in a scalable way allows you to distribute the load (and tasks) across a number of specialized servers.
One of the most important issues having to do with scalability actually has little or nothing to do with the Web application framework on which the program is written. Most modern Web applications use a relational database for persistent data storage, which means that the database server can be a bottleneck. Even if the database server isn't pushing its limits, the fact is that it takes time for a relational database to process a query, retrieve one or more appropriate rows and send them back to the querying application.
If your application is highly dynamic, it might use as many as a dozen SQL calls for each page, which will not only stress your database, but also significantly reduce the speed with which you can service each HTTP request. Longer request times mean that your users will be drumming their fingers longer and that your server will need more processes to handle the same number of requests.
One solution is to use multiple database servers. There are solutions for hooking together multiple servers from an open-source database (for example, PostgreSQL or MySQL), not to mention proprietary (and expensive) solutions for commercial databases, such as Oracle and MS-SQL. But, this is a tricky business, and many of the solutions involve what's known as master-slave replication, in which one database server (the master) is used for data modification, and the other (the slave) can be used for reading and retrieving information. This can help, but it isn't always the kind of solution you need.
But, there is another solution—one that is simple to understand and relatively easy to implement: memcached (pronounced “mem-cash-dee”). Memcached is an open-source, distributed storage system that acts as a hash table across a network. You can store virtually anything you like in memcached, as well as retrieve it quickly and easily. There are client libraries for numerous programming languages, so no matter what framework you enjoy using, there probably is a memcached solution for you.
This month, we take a quick look at memcached. When integrated into a Web application, it should help make that application more scalable—meaning it can handle a large number of users, spread across a large number of servers, without forcing you to rewrite large amounts of code. Version 2.1 of Ruby on Rails went so far as to integrate memcached support into the framework, making it even easier to use memcached in your applications.
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
|Non-Linux FOSS: Seashore||May 10, 2013|
- RSS Feeds
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- Dynamic DNS—an Object Lesson in Problem Solving
- New Products
- Validate an E-Mail Address with PHP, the Right Way
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Download the Free Red Hat White Paper "Using an Open Source Framework to Catch the Bad Guy"
- Tech Tip: Really Simple HTTP Server with Python
6 min 37 sec ago
- Keeping track of IP address
1 hour 57 min ago
- Roll your own dynamic dns
7 hours 11 min ago
- Please correct the URL for Salt Stack's web site
10 hours 22 min ago
- Android is Linux -- why no better inter-operation
12 hours 37 min ago
- Connecting Android device to desktop Linux via USB
13 hours 6 min ago
- Find new cell phone and tablet pc
14 hours 4 min ago
15 hours 33 min ago
- Automatically updating Guest Additions
16 hours 41 min ago
- I like your topic on android
17 hours 28 min ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?