Achieving Continuous Integration with Drupal

In the early 1990s, my first job out of college was as a software engineer at a startup company. We were building a commercial product using a well-known open-source network security project. In those days, Agile software development practices (not to mention the World Wide Web, or even widespread public awareness of the Internet) still were in the future. My fellow engineers on that project (who had just graduated with me and to this day are the best programmers I know) and I were taught what we now call the Waterfall method. We thought we were invincible.

We had no idea what was coming. After consultation with potential customers, we wrote a Requirements document describing what the product needed to do, a Functional Specification that described how the product would look and behave, a Design document that described the technical architecture and internals of how we would build it, and even a Test Plan that described the automated tests we would build to ensure the product worked. We had a release deadline, declared by management, of "before Christmas". Good thing we were so young! We engaged in our Death March. The local Chinese delivery place got to know us well. I got home around 1am every morning for months. We finally finished and shipped version 1.0 of the product on December 18. It took me a few weeks to remember what normal humans did when they were not at work.

What Did I Learn from This Experience?

What we did wrong: basically, everything about the software engineering methodology we used was completely stupid. We shipped a working product on time, but we started with the benefit of a working open-source project. We made essentially every mistake that Agile development was invented to prevent.

What we did right: we actually implemented our Test Plan. Since the tests were automated, the build process had to be automated. It certainly added a lot of "extra work" to the project, but the payoff was huge. Before we left for the day, we would kick off the build script. When we came in the next morning, if the last line of output said PASSED, we felt confident and ready to ship. We didn't know it at the time, but we were on the path of what eventually would be called Continuous Integration (CI).

Fast-forward 20 years. I'm now at Acquia, which produces commercial products for companies using the open-source project Drupal. Drupal is a LAMP-stack application for building Web sites and services. We realized early on that everyone using Drupal needs to host it somewhere, and that most people building sites with Drupal do not also want to have to become experts in building a reliable, scalable infrastructure for hosting it. More than that, they also want to be able to follow best practices in software development, testing and deployment; they want to use Continuous Integration. However, they often do not have the time, resources or management support to invest in the necessary infrastructure. I've spent the last three years addressing that problem.

What Is Continuous Integration?

Many excellent and persuasive resources on the Web talk about the principles of CI in detail. In this article, I discuss a simplified list of the most meaningful best practices for Drupal Web site development:

  1. Use a source code repository. This is step zero for good software development. Most people are doing this, using Git, SVN or other systems; if you are not, start now.

  2. Make small, frequent changes. All developers should commit their changes frequently. This reduces the inevitable conflicts and lets problems surface sooner. Also, small, frequent changes enable small, frequent releases, making all the rest of the principles more valuable.

  3. Automate testing. Have your repository automatically integrated with a testing environment, so that every commit triggers a test run. This way, you know immediately if something broke.

  4. Test in a clone of the production environment. It does no good to test your software under different conditions from those that it will run in production; doing so is a recipe for taking down your site when you deploy. Never hear someone say "But it worked on my machine!" again.

  5. Make all versions easily accessible. Despite best efforts, production releases still will break, so you need an easy way to re-deploy a prior version. Then, you'll want to compare the working and broken versions to figure out what went wrong. To do this, you'll need a reference copy of past releases.

  6. Have an audit trail (that is, a blame list). This should record not just who made each commit, which source control already tells you, but also who deployed it. That record can supply the rationale for a change as well as point to potential fixes.

  7. Automate site deployment. In order to tolerate small, frequent releases, pushing a release needs to be an automated process so it's very quick and easy. If it's a big chore to push one release, the whole process falls apart.

  8. Measure results and iterate rapidly. Are the changes helping? Is the site faster? Did the usability enhancement yield more sales? If not, you can iterate again.

Achieving Continuous Integration requires some amount of infrastructure, the culture and discipline of the engineering team to use it, and management's understanding and commitment so that it supports the necessary investment. This is an article about technology, not management and culture, so I focus primarily on the infrastructure here.

Building It Yourself

Many shops build their own CI systems, tailored to their own needs. Doing so is perfectly reasonable if you have the time and resources to get there. The biggest danger of doing it yourself, of course, is deciding to and then never getting around to it. You end up doing things the manual, slow and error-prone way "until we have time to fix it", which often turns out to be "never". When you do get started, it probably will end up being a permanent side project, which may tempt you to cut corners that cause problems later, at the worst possible time.

Here are some of the things you should keep in mind.

Use a source code repository. You probably already are (right?). You will need to be familiar with its "post-commit hook" capability so you can script actions that run on each commit. If you are using a hosted repository (such as GitHub), you will have to integrate with its Web-based hooks instead.
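As a rough illustration of scripting against a repository hook, here is a minimal sketch that assumes a self-hosted Git repository. On the server side, Git's closest equivalent to a post-commit hook is post-receive, which is handed one line per updated ref on standard input in the form "old-sha new-sha ref-name". The branch name and the function name below are hypothetical.

```python
import sys

# Hypothetical branch name; use whatever your team treats as its
# main development branch.
WATCHED_REF = "refs/heads/develop"


def pushed_revisions(lines, watched_ref=WATCHED_REF):
    """Return the new commit IDs pushed to watched_ref.

    Each input line has the form "<old-sha> <new-sha> <ref-name>",
    which is what Git writes to a post-receive hook's standard input.
    """
    revisions = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore blank or malformed lines
        old_sha, new_sha, ref_name = parts
        # An all-zero new SHA means the ref was deleted: nothing to deploy.
        if ref_name == watched_ref and set(new_sha) != {"0"}:
            revisions.append(new_sha)
    return revisions


def main(stream=sys.stdin):
    """Entry point for the hook: report each revision that needs action."""
    for sha in pushed_revisions(stream):
        # Replace this print with a call to your own deploy or test script.
        print("would act on", sha)
```

Saved as hooks/post-receive in the server-side repository, made executable and ending with a call to main(), a script like this would run after every push. Keeping the parsing in its own function means it can be tested without a real push.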

Make small, frequent changes. All of your developers will be making frequent commits, resolving conflicts locally as best they can. To keep things moving forward, you need to have a constantly available running copy of everyone's latest code. One way to do this is to deploy the tip of your main development branch automatically to a shared development environment, so everyone always can see it. You can script this yourself using your repo's post-commit hooks. A build automation tool like Jenkins will help, but you still need to write the deployment script yourself.
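The deployment script itself can start out very small. The following is a minimal sketch, assuming a bare Git repository on the development server and a separate directory served by the Web server. The paths, branch name and function names are hypothetical. A real Drupal deployment also would need to apply database updates and clear caches afterward.

```python
import subprocess


def checkout_command(git_dir, work_tree, branch):
    """Build the Git command that forces work_tree to match the tip of branch."""
    return [
        "git",
        f"--git-dir={git_dir}",
        f"--work-tree={work_tree}",
        "checkout",
        "--force",
        branch,
    ]


def deploy(git_dir, work_tree, branch, run=subprocess.run):
    """Check out branch into work_tree; return True if Git reported success.

    The run parameter exists only so the function can be exercised
    without touching a real repository.
    """
    result = run(checkout_command(git_dir, work_tree, branch), check=False)
    return result.returncode == 0
```

A hook or Jenkins job then could call something like deploy("/srv/git/site.git", "/var/www/dev", "develop") and fail the build when it returns False.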

Automate testing. Assuming you write automated tests for your site, you will want to run them every time someone makes what they believe is a release-ready commit. Lots of tools exist for doing this. One popular choice is Jenkins (formerly called Hudson), and it is excellent. It can integrate directly with your code repository and trigger a "job" on every commit, or run a job on a schedule.
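If your repository cannot notify Jenkins directly, one option is to have a commit hook request a build over HTTP. The sketch below only constructs the URL. It assumes the job has Jenkins' "Trigger builds remotely" option enabled with an authentication token, and the server address, job name and token shown are placeholders. Depending on how your Jenkins server is secured, the request also may need user credentials.

```python
from urllib.parse import quote, urlencode


def jenkins_build_url(base_url, job_name, token):
    """Return the URL that asks Jenkins to queue a build of job_name.

    Assumes the job is configured to allow remote triggering with
    the given authentication token.
    """
    job_path = quote(job_name, safe="")
    query = urlencode({"token": token})
    return f"{base_url.rstrip('/')}/job/{job_path}/build?{query}"


# Placeholder values for illustration only.
url = jenkins_build_url("https://ci.example.com/", "drupal site", "s3cret")
print(url)
```

A hook script would then send a request to that URL with whatever HTTP client you prefer.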

The tests themselves are not the whole story though. Because your application is a Drupal site, you need to test it in a Web environment. You'll certainly need a running database server. If you want to test actual page loads like a browser would see, you'll need a running Web server too. You probably want to test your application along with a reasonably current production database; if you don't automate that, one day you'll find yourself testing against year-old data. However, you also probably want to "scrub" your current production database before running tests against it, lest you accidentally spam all your customers from your test servers, or worse. This is all the responsibility of your test harness script, run by Jenkins.
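To make the scrubbing step concrete, here is a minimal sketch that overwrites every user's e-mail address before any test runs. It uses an in-memory SQLite database as a stand-in so the example is self-contained. The users table with uid and mail columns is an assumption modeled loosely on Drupal's schema, and a real site would run the equivalent statement against a copy of its MySQL database. A real scrub also would need to cover any other tables that hold customer contact details.

```python
import sqlite3


def scrub_users(conn):
    """Replace each real user's e-mail with an undeliverable placeholder.

    Rows with uid 0 are skipped on the assumption that uid 0 is the
    anonymous user. Returns the number of rows changed.
    """
    cur = conn.execute(
        "UPDATE users SET mail = 'user' || uid || '@example.invalid' "
        "WHERE uid > 0"
    )
    conn.commit()
    return cur.rowcount


# Stand-in for a freshly imported copy of the production database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT, mail TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (0, "", ""),
        (1, "admin", "admin@realcompany.com"),
        (2, "alice", "alice@customer.org"),
    ],
)

scrubbed = scrub_users(conn)
print(scrubbed, "addresses scrubbed")
```

The .invalid top-level domain is reserved and never resolves, so mail sent by mistake from a test server has nowhere to go.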

If you fool yourself that you can "mock out" these dependencies and have purely standalone unit tests that can run anywhere, reality will mock you back. You will discover that tests are not accurately simulating your live environment, and you will have to roll back a release that "passed all of its tests" but failed in production.

Test in a clone of the production environment. This is where things really get interesting. I've already talked about needing a running Web and database server. If your site uses additional services like memcached, Varnish or Apache Solr, you need to make sure those are in place too. If your production site uses SSL, you either need SSL running in your testing environment, or you need to turn off the checks or redirection that enforce it. Ultimately, it is as much work to maintain your test environment as it is your production environment.
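One inexpensive safeguard is a preflight check that refuses to start a test run unless every service the site depends on is reachable. The following is a minimal sketch. The host addresses and port numbers are placeholders that you should replace with your own environment's settings, and the function names are hypothetical.

```python
import socket

# Placeholder addresses; substitute the hosts and ports your own
# test environment actually uses.
REQUIRED_SERVICES = {
    "mysql": ("127.0.0.1", 3306),
    "memcached": ("127.0.0.1", 11211),
    "varnish": ("127.0.0.1", 6081),
    "solr": ("127.0.0.1", 8983),
}


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing_services(services, probe=is_listening):
    """Return the sorted names of services that did not answer the probe."""
    return sorted(
        name
        for name, (host, port) in services.items()
        if not probe(host, port)
    )
```

A test harness could call missing_services(REQUIRED_SERVICES) first and abort with a clear message if the list is not empty. An open port shows only that something is listening, not that it is configured the way production is, so treat this as a first line of defense.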

______________________

Barry Jaspan is a serial software engineer and entrepreneur who has been creating and selling open-source software products literally since he was 12 years old (many moons ago!).

Comments

The sooner the better. . .

Joseph hamshey's picture

The sooner the better. . .

thanks

Anonymous's picture

I started to write a rebuttal from another hosting perspective and then realized I lost the battle when I saw that you could

Make all versions easily accessible. As you can see in Figure 2, you can always revert back to any specific tagged version or branch in any environment.

and although I would never use the generic word "tags" in that case what a neat GUI for this.
Thanks for this information. I'm on my way to try it out tonight.

Testing tools?

brad.bulger's picture

Nice article! Any pointers - links, suggestions, etc. - about doing proper complete automated testing of Drupal sites? We need to get away from the archaic "click-around-and-try-stuff" method...

What about..

Anonymous's picture

What about Backdrop. Will Acquia hosting support Backdrop?

No control over hosting

JvE's picture

The one problem I have with all this is that in Europe I find that most customers we build sites for want to either do or arrange their own hosting.
Therefore the automated deployment we use internally from dev to test cannot be reused. We pretty much have to maintain manual deployment steps for a plethora of different hosts and hosting environments.

Cloud hooks

jh3's picture

Any idea when the post-code-deploy hook will work on environments that are set to automatically deploy code from a branch when pushed to it?

For example, if I have a branch called "develop" selected in the Dev environment and I push changes to it, the changes are automatically deployed to the Dev servers. This push will not trigger the post-code-deploy hook, though. The hook is only triggered if I switch to a different branch or tag.

"Soon"

A little birdie's picture

"Soon"

Great article, Barry

rszrama's picture

Love the article, Barry. Great information and very readable. I've benefited from Acquia Cloud's infrastructure management many times over. I'm no better at infrastructure than I am at design, and I suck at design. : )
