Building a Multisourced Infrastructure Using OpenVPN
Have you ever needed to expand your colocated servers at more than one provider and allow applications to communicate as if they were on the same LAN, possibly over multiple sets of firewalls and layers of NAT? Or, maybe you've wanted to move from one hosting service to another to take advantage of lower pricing or better uptime but would have preferred to do it gradually instead of in a single swoop (and a weekend-long maintenance window)? Or, maybe you've considered the Amazon EC2 cloud to host part, but not all, of your infrastructure? If your answer to any of these questions is yes, what you want is essentially a multisourced infrastructure.
The Amazon EC2
The Amazon EC2 (Elastic Compute Cloud) is a Web service that allows users to provision new machines in an Amazon-hosted virtualized infrastructure in a matter of minutes, using a publicly available API. Users get full root access and can install almost any OS or application in their Amazon Machine Images. Web service APIs allow users to reboot their instances remotely and scale capacity quickly if necessary, by adding tens or even hundreds of machines. Additionally, there is no up-front hardware setup costs—Amazon charges only for the capacity you actually use; there is no minimum fee. As more applications find their way to Amazon's virtual computing environment, system administrators are looking for ways to provide secure connectivity over the public Internet between new machines in the Amazon EC2 and old machines in their regular data centers. This article describes one such technique—how to build a multisourced infrastructure based on OpenVPN.
Let's take a look at a simple distributed application, which consists of multiple services, a LAMP stack. Traditionally, you would start with Apache and MySQL on a single server. As your site grows, you would provision another server from your provider and add a second Apache instance. Later, you might want to provision yet another machine to be a dedicated database server to improve performance. This is a typical single-sourced infrastructure—all services run within a single physical environment, controlled and supported by a single provider.
In contrast, with a multisourced infrastructure, you no longer are limited to one provider or one data center. You are free to mix and match hosting plans from different providers to suit your business and architecture better, and you can use as many providers as you like. Your applications still can communicate with one another, but instead of having a physical LAN, it's now a virtual LAN that sits on top of public Internet links. You can grow your services horizontally and achieve better geographic redundancy and fault tolerance at the same time, all without significant changes in your application. If it works in a single-sourced physical LAN, it most likely will work in multisourced virtual LAN as well.
Additionally, you can leverage the strengths of a particular provider for just a subset of your services. Going back to the LAMP stack as our example, with Amazon EC2, you can provision many Apache instances in response to the current load quickly; although you might prefer to run MySQL on bare metal elsewhere instead of in an EC2 virtual machine.
Finally, this method allows you to expand your corporate infrastructure outside your current data center or allow outside services to use applications in your corporate data center. Consider a remotely hosted data-crunching cluster that you rent by the hour, which uses your corporate data warehouse system for its input. As you can see, a multisourced infrastructure is more flexible and can accommodate various scenarios and needs.
In this article, I describe a particular implementation of the multisourced infrastructure concept that we at CohesiveFT (www.cohesiveft.com) developed using OpenVPN and that has been running in our production environment since mid-summer 2007. We chose OpenVPN primarily because it uses standard OpenSSL encryption, runs on multiple operating systems and does not require kernel patching or additional modules. The latter benefit is of key importance. Many Virtual Private Server (VPS) hosting solutions currently provide great service with pricing that is often better than other forms of hosting. These providers build guest OS kernels specifically tailored for their environment and method of virtualization. As a result, you probably want to avoid rebuilding the Linux kernel on your VPS as much as possible. Not that it can't be done, but you can save some time and probably get faster technical support if you don't do it.
Among the alternatives to OpenVPN, there is Openswan, a code fork of the original FreeS/WAN Project, but it requires a kernel patch to support NAT traversal, according to its wiki (wiki.openswan.org/index.php/Openswan/Install).
The OpenVPN protocol also is firewall-friendly, as it can pass all traffic over a single UDP tunnel (the default port is 1194). That feature, coupled with SSL encryption, makes this solution very difficult to attack when data packets pass through the public Internet.
OpenVPN turned out to be a great choice and offered us all the functionality we expected, except for one very important feature, fault tolerance. When you use a VPN to provide corporate network access to remote users, the solution is very simple—you deploy several OpenVPN servers and configure each server with its own network segment (for example, server 10.5.0.0 255.255.0.0 and server 10.6.0.0 255.255.0.0). In a typical scenario, the dynamic IP address assigned to a remote user will not matter much, as long as you configure firewalls, applications and services to allow both subnets.
When you build a multisourced infrastructure, however, this is not an acceptable solution, unless you want servers to change their IP addresses from time to time. To satisfy redundancy and fault-tolerance requirements, we needed an active-active pair of OpenVPN servers to share a common address space—all hosts must be able to access each other by static IP addresses at all times, no matter which OpenVPN server provides connectivity at either end of the communication. Then, if we lose one OpenVPN server, the other will provide all connectivity. And, if they are both up, both will be accepting connections from clients to share the load. This feature was not available as a part of the OpenVPN source distribution, so we developed a standalone dynamic routing dæmon to facilitate active-active load balancing. You can find its source code, along with useful links, use-case scenarios and mailing lists, at www.cohesiveft.com/multisourced-infra.