Large-Scale Web Site Infrastructure and Drupal
Search is resource-intensive. Optimizing search will contribute to overall site performance and is a great process to outsource to another box. Solr can help ease an over-burdened Web server. Solr is a project from the Apache Foundation that takes the power of Lucene, a fantastic indexer and searcher, and exposes it as a Web service. Using HTTP POST and GET requests, you can feed documents to Solr for indexing and issue queries for searching. In Drupal, the Views module serves as a visual query-builder and handles search. With Views 3, in Drupal, you can plug in Solr to handle the search heavy-lifting instead of having Drupal hit MySQL for this, alleviating a load on your database server best left to a document indexer like Lucene.
Apache's MaxClients setting is a limit on the number of simultaneous requests that can be served. If this limit is reached, users have to wait until a child process is freed up until they can connect. If this number is increased too much, however, there is a risk that the Web head will run out of memory. There's a standard formula for figuring out what this setting should be based upon the RAM available to the machine:
formula: RAM/Average Apache Memory Size in Use = # max clients
example: 2GB/20MB = 100 MaxClients
Apache's mod_expires setting controls the HTTP header information for anything served through Apache to your machine. If a resource has been cached on a user's computer, this setting can tell any subsequent request to that resource if it has expired and needs to be downloaded again. It's a good idea to have this turned on for text/HTML header types:
<IfModule mod_expires.c> ExpiresActive On ExpiresDefault A1209600 ExpiresByType text/html A1 </IfModule>
The KeepAlive setting is a way to tell Apache to keep an HTTP connection alive for a period of time so that it can be reused. This has been shown to result in an almost 50% speed increase in latency times for HTML documents with many images. Turn this on and set the KeepAliveTimeout to 2 seconds:
KeepAlive On KeepAliveTimeout 2
MySQL is the most widely used database for Drupal, although Drupal 6 also supports Postgres. Drupal 7 has an object-oriented database abstraction layer that allows drivers to be written for many other database systems. There are some key things to keep in mind within MySQL's configuration that can help optimize your application for performance.
MySQL has a built-in query cache that is turned on by default. Make sure to afford a liberal amount of memory to this cache:
[mysqld] query_cache_size=32M
Once your application is built, it's a good idea to log slow queries for a short amount of time to get a list of queries that are taking a long time and can be examined with an EXPLAIN and then optimized:
log-slow-queries = /var/log/slow_query.log long_query_time = 5 #log-queries-not-using-indexes
MySQL's EXPLAIN command is a great way to find out exactly what a particular query is doing in order to get some clues as to why it may be taking a long time to evaluate and return a result. One of the key things to look at is the number of rows that EXPLAIN tells you it had to search through. This may indicate that one of your tables, bursting at the seams, is a good candidate for a new index.
Taking a look at the following query, we see there are three fields that could have an index placed upon them in order to reduce the number of rows that a query has to search through in order to find the desired result:
...
FROM node node
WHERE node.status = 1
AND node.type IN ('story')
ORDER BY node.created DESC
The status, type and created fields are key to this query's result and can be indexed so that they are seen as a group:
mysql> ALTER TABLE node ADD INDEX (status, type, created);
Table locking can be a performance headache. By default, Drupal's MySQL database tables are all set to MyISAM. Because MyISAM locks the entire table down during a query, high traffic may cause MySQL errors when a certain table is unavailable or locked. If you start seeing these errors, look at which tables are giving the error and evaluate whether they should be set to InnoDB instead. InnoDB does row locking instead of table locking. When evaluating, look to see if the table has any auto_increment fields, and keep in mind that converting this table may cause slow-downs on INSERTs, as InnoDB does a full table lock on INSERTs to avoid key duplication.
Today’s modular x86 servers are compute-centric, designed as a least common denominator to support a wide range of IT workloads. Those generic, virtualized IT workloads have much different resource optimization requirements than hyperscale and cloud applications. They have resulted in a “one size fits all” enterprise IT architecture that is not optimized for a specific set of IT workloads, and especially not emerging hyperscale workloads, such as web applications, big data, and object storage. In this report, you will learn how shifting the focus from traditional compute-centric IT architectures to an innovative disaggregated fabric-based architecture can optimize and scale your data center.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
Free Webinar: Linux Backup and Recovery
Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.
In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
| Non-Linux FOSS: Seashore | May 10, 2013 |
| Trying to Tame the Tablet | May 08, 2013 |
| Dart: a New Web Programming Experience | May 07, 2013 |
- RSS Feeds
- New Products
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Home, My Backup Data Center
- Dart: a New Web Programming Experience
- Developer Poll
- What's the tweeting protocol?
- May 2013 Issue of Linux Journal: Raspberry Pi
- Reply to comment | Linux Journal
3 hours 32 min ago - Reply to comment | Linux Journal
4 hours 19 min ago - Web Hosting IQ
5 hours 53 min ago - Thanks for taking the time to
7 hours 29 min ago - Linux is good
9 hours 27 min ago - Reply to comment | Linux Journal
9 hours 44 min ago - Web Hosting IQ
10 hours 14 min ago - Web Hosting IQ
10 hours 15 min ago - Web Hosting IQ
10 hours 16 min ago - Reply to comment | Linux Journal
13 hours 16 min ago




Comments
Grammy.com Numbers
Well since everyone else is throwing their business pitches in here...
The approach described in this article essentially replicates how Lullabot (the authors of this article) scaled grammy.com to 213 million page views within a single day. Most of those over a 6 hour window during the 52nd awards show. In those same 6 hours, we registered 50,000 new user accounts. Amazingly, we couldn't even measure the full potential of the set up because our hosting providers load-testing cluster couldn't send requests fast enough to bring the site down.
Slides and configuration files of this setup were presented at DrupalCamp Colorado.
or you could simply contact
or you could simply contact an expert drupal support and Maintenance firm like Halosys technologies.
table locks
For this advice to be applicable, the table would need to be undergoing more writes than reads. How many tables are like this? Not many. Watchdog is the only one that I can think of, and if that is seeing that many writes you have bigger problems.
I instead advise changing _all_ tables to InnoDB. This allows you to tune MySQL only for InnoDB, reducing the MyISAM-only buffers to near-zero (the information_schema and mysql databases still use MyISAM, so you can't completely disable it). This also reduces complexity to only be worried about one engine. The only time this does not apply is when the server has limited RAM, as a well-tunned InnoDB server requires more RAM than a well-tuned MyISAM server.
Drupal can scale to millions of page views a day
There are many ways to scale Drupal.
At 2bits.com, we prefer simpler ways without added complexity both at the code level and the infrastructure level.
Here is a presentation on 3.4 million page views a day, 92 million page views a month, one server and Drupal.
Mercury
If you are looking for a high performance Drupal setup then you should also look into project Mercury from http://getpantheon.com/