Large-Scale Web Site Infrastructure and Drupal
Static variable caching is a quick-and-easy win in PHP. Here is an example of a simple function with a simple query to the database:
function taxonomy_get_term($tid) {
  return db_fetch_object(
    db_query('SELECT * FROM {term_data} WHERE tid = %d', $tid)
  );
}
This function can be given a simple static variable, so that if this function happens to be called more than once on a page load, it can skip over the call to the database and serve the result out of this static cache:
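function taxonomy_get_term($tid) {
  // Persists between calls for the duration of a single page request.
  static $terms = array();
  // Query the database only the first time a given term ID is requested.
  if (!isset($terms[$tid])) {
    $terms[$tid] = db_fetch_object(
      db_query('SELECT * FROM {term_data} WHERE tid = %d', $tid)
    );
  }
  return $terms[$tid];
}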
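To see the effect of the static cache without a Drupal installation, here is a minimal, self-contained sketch. The names demo_query_term(), demo_get_term() and the demo_query_count counter are hypothetical stand-ins invented for this illustration; the "query" is simulated and simply counts how often it runs.

```php
<?php
// Hypothetical stand-in for the database call; it only counts its own runs.
function demo_query_term($tid) {
  $GLOBALS['demo_query_count']++;
  return (object) array('tid' => $tid, 'name' => 'Term ' . $tid);
}

// Same static-variable pattern as taxonomy_get_term() above.
function demo_get_term($tid) {
  static $terms = array();
  if (!isset($terms[$tid])) {
    $terms[$tid] = demo_query_term($tid);
  }
  return $terms[$tid];
}

$GLOBALS['demo_query_count'] = 0;
demo_get_term(5); // cache miss: runs the simulated query
demo_get_term(5); // cache hit: no query
demo_get_term(7); // cache miss: a different term ID
echo $GLOBALS['demo_query_count'], "\n"; // prints 2
```

Three calls result in only two simulated queries, one per distinct term ID. Keep in mind that a static variable lives only for the current request, so this saves repeated work within a page load, not across page loads.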
Drupal is a content management framework that Lullabot uses to build high-performance Web sites on top of this infrastructure. Drupal is built with PHP as its primary programming language and has a large library of freely available, user-contributed modules that extend its functionality. It has been compared to LEGO bricks for this reason. Because the quality of modules varies, it's a good idea to do a full code review of any module selected for inclusion in a platform build. Even if an existing module already does most of what is needed, it should be reviewed to make sure that static variable caching is used, queries are optimized and general coding standards are followed.
When a module is found lacking in any of these areas, or when you find general bugs, contribute patches back through the module's issue queue, which is linked from the same page where you download the module. Performance reviews also are a good idea once a site is built, to ensure that queries are optimized and not run more than once per page load. The Devel module is a great resource for this, as it reports page load times and memory usage and can display every query executed on any given page load.
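As a rough illustration of what "not run more than once per page load" means in practice, the sketch below scans a list of SQL strings logged during one request and reports the statements that ran repeatedly. This is not how the Devel module is implemented; demo_find_repeated_queries() and the sample log are hypothetical and exist only for this example.

```php
<?php
// Hypothetical helper: given the SQL strings logged during one page load,
// return the statements that ran more than once, keyed by SQL with run counts.
function demo_find_repeated_queries(array $query_log) {
  $repeated = array();
  foreach (array_count_values($query_log) as $sql => $count) {
    if ($count > 1) {
      $repeated[$sql] = $count;
    }
  }
  arsort($repeated); // most frequently repeated statement first
  return $repeated;
}

// Made-up sample log for a single page load.
$log = array(
  'SELECT * FROM {term_data} WHERE tid = 5',
  'SELECT * FROM {node} WHERE nid = 1',
  'SELECT * FROM {term_data} WHERE tid = 5',
  'SELECT * FROM {term_data} WHERE tid = 5',
);
print_r(demo_find_repeated_queries($log));
```

Any statement that shows up in such a report is a candidate for the static variable caching described earlier.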
Beyond the regular LAMP configuration optimizations, caching techniques and hardware infrastructure, some general Web development best practices available within Drupal not only can reduce load on various servers, but also make it easy to keep some of your data structures in code, where they can be version-controlled to track changes and to help with deploying those changes. The first, and relatively new, paradigm of “exportables” is twofold: it gives you a way to read a data structure from code instead of the database, and that code can be deployed to different environments and reused.
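The idea can be sketched in a few lines of plain PHP. This is a simplified illustration of the concept, not the API of any real module: demo_default_settings(), demo_load_settings(), the 'front_page_list' structure and the 'storage' labels are all hypothetical, and the database is simulated by an array passed in as an argument.

```php
<?php
// Hypothetical exportable: a data structure defined in code, so it can be
// version-controlled and deployed like any other file.
function demo_default_settings() {
  return array(
    'front_page_list' => array('items_per_page' => 10, 'sort' => 'created'),
  );
}

// Load a named structure. A copy saved in the database (simulated here by
// $db_overrides) takes precedence; otherwise the code default is returned
// and no database row is needed.
function demo_load_settings($name, array $db_overrides = array()) {
  if (isset($db_overrides[$name])) {
    return $db_overrides[$name] + array('storage' => 'overridden');
  }
  $defaults = demo_default_settings();
  if (isset($defaults[$name])) {
    return $defaults[$name] + array('storage' => 'default');
  }
  return NULL;
}

// "Exporting" amounts to producing PHP source that can live in a module file.
echo var_export(demo_default_settings(), TRUE), "\n";
```

Calling demo_load_settings('front_page_list') with no overrides returns the version defined in code, which is what makes the same structure reusable across development, staging and production environments.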
Exportables started with the Views module by Earl (merlinofchaos) Miles, who wanted a way to help debug the problems that his module's users might encounter. So, he created a way for users to export a view they had created into a readable data structure that he then could load on his own machine to debug. This had the welcome side effect of letting users share these “view recipes” with one another, and it also evolved into a method where the structure in code could replace what was read from the database and so improve performance. Exportables then were extracted into a library dubbed Ctools (for Chaos Tools) and used for the Panels module. Other people started catching on and implementing exportables for their modules, and now a whole slew of modules use the Ctools Exportables for this purpose.
This eventually led to a module called Features that provides a UI to choose the various exportable data structures within a Drupal installation and wrap them up into a custom “feature” module, which then can be shared. These features can be simple configuration options or complex features requiring many other contributed modules in order to provide feature-rich enhancements for any Drupal Web site. Not only can it be used to share such features, but it also has become an important part of the deployment process in creating modern Drupal Web sites.
Another tool that has recently matured and become a necessity for any professional Drupalite is Drush. Drush stands for Drupal Shell and is a way to control your Drupal Web sites through the command line. Not only does it provide powerful commands to manipulate your Web site quickly, but other modules can integrate with Drush as well, adding their own commands for working with their particular module. For example, the Features module provides Drush commands that let you quickly list, update and revert any feature modules that are part of a Drupal installation's codebase. The Backup and Migrate module provides integration that lets you create SQL backups of your Web site with a simple command. Some modules even provide commands to work with Drupal and Git! So, not only does Drush allow you to work with your Drupal site quickly, but you also don't have to load a huge page through Apache to do so.
And, of course, no professional Web site would be complete without revision control. Lullabot has used CVS (Concurrent Versions System), SVN (Subversion) and, most recently, made the move to Git. But no matter what you use, it's important to have a backup of your work and versioning for teams working on the same project. The merits of versioning your code are many. Working on a high-performance Web site usually takes many people, so version control becomes a necessity.
Jerad Bitner has been using Drupal since the nightmare upgrades from 4.6 to 4.7 (that's early 2005, if you're asking). He started out as a Technical Illustrator with C/S Group and worked for three years with Photoshop, Illustrator, AutoCad and Macromedia products as well as PHP. When it came time to replicate a platform across the different locations of the company, Jerad found Drupal and hasn't looked back since.
Nate Haug adds a dash of design to Lullabot. He received degrees in both Fine Arts and Computer Science from Truman State University, creating the perfect bridge between the technical and aesthetic. Detail is his obsession, so if you know what you want, Nate will deliver your desire.
Comments
Grammy.com Numbers
Well, since everyone else is throwing their business pitches in here...
The approach described in this article essentially replicates how Lullabot (the authors of this article) scaled grammy.com to 213 million page views within a single day, most of them within a six-hour window during the 52nd awards show. In those same six hours, we registered 50,000 new user accounts. Amazingly, we couldn't even measure the full potential of the setup, because our hosting provider's load-testing cluster couldn't send requests fast enough to bring the site down.
Slides and configuration files of this setup were presented at DrupalCamp Colorado.
or you could simply contact
or you could simply contact an expert drupal support and Maintenance firm like Halosys technologies.
table locks
For this advice to be applicable, the table would need to be undergoing more writes than reads. How many tables are like this? Not many. Watchdog is the only one that I can think of, and if that is seeing that many writes you have bigger problems.
I instead advise changing _all_ tables to InnoDB. This allows you to tune MySQL only for InnoDB, reducing the MyISAM-only buffers to near-zero (the information_schema and mysql databases still use MyISAM, so you can't completely disable it). It also reduces complexity, because you only have to worry about one engine. The only time this does not apply is when the server has limited RAM, as a well-tuned InnoDB server requires more RAM than a well-tuned MyISAM server.
Drupal can scale to millions of page views a day
There are many ways to scale Drupal.
At 2bits.com, we prefer simpler ways without added complexity both at the code level and the infrastructure level.
Here is a presentation on 3.4 million page views a day, 92 million page views a month, one server and Drupal.
Mercury
If you are looking for a high performance Drupal setup then you should also look into project Mercury from http://getpantheon.com/