Building a Multisourced Infrastructure Using OpenVPN

How to use OpenVPN to take your hosting to the next level.

Have you ever needed to expand your colocated servers at more than one provider and allow applications to communicate as if they were on the same LAN, possibly over multiple sets of firewalls and layers of NAT? Or, maybe you've wanted to move from one hosting service to another to take advantage of lower pricing or better uptime but would have preferred to do it gradually instead of in a single swoop (and a weekend-long maintenance window)? Or, maybe you've considered the Amazon EC2 cloud to host part, but not all, of your infrastructure? If your answer to any of these questions is yes, what you want is essentially a multisourced infrastructure.

Let's take a look at a simple distributed application, which consists of multiple services, a LAMP stack. Traditionally, you would start with Apache and MySQL on a single server. As your site grows, you would provision another server from your provider and add a second Apache instance. Later, you might want to provision yet another machine to be a dedicated database server to improve performance. This is a typical single-sourced infrastructure—all services run within a single physical environment, controlled and supported by a single provider.

In contrast, with a multisourced infrastructure, you no longer are limited to one provider or one data center. You are free to mix and match hosting plans from different providers to suit your business and architecture better, and you can use as many providers as you like. Your applications still can communicate with one another, but instead of having a physical LAN, it's now a virtual LAN that sits on top of public Internet links. You can grow your services horizontally and achieve better geographic redundancy and fault tolerance at the same time, all without significant changes in your application. If it works in a single-sourced physical LAN, it most likely will work in multisourced virtual LAN as well.

Additionally, you can leverage the strengths of a particular provider for just a subset of your services. Going back to the LAMP stack as our example, with Amazon EC2, you can provision many Apache instances in response to the current load quickly; although you might prefer to run MySQL on bare metal elsewhere instead of in an EC2 virtual machine.

Finally, this method allows you to expand your corporate infrastructure outside your current data center or allow outside services to use applications in your corporate data center. Consider a remotely hosted data-crunching cluster that you rent by the hour, which uses your corporate data warehouse system for its input. As you can see, a multisourced infrastructure is more flexible and can accommodate various scenarios and needs.

Figure 1. Multisourced Infrastructure: OpenVPN Virtual Links

In this article, I describe a particular implementation of the multisourced infrastructure concept that we at CohesiveFT (www.cohesiveft.com) developed using OpenVPN and that has been running in our production environment since mid-summer 2007. We chose OpenVPN primarily because it uses standard OpenSSL encryption, runs on multiple operating systems and does not require kernel patching or additional modules. The latter benefit is of key importance. Many Virtual Private Server (VPS) hosting solutions currently provide great service with pricing that is often better than other forms of hosting. These providers build guest OS kernels specifically tailored for their environment and method of virtualization. As a result, you probably want to avoid rebuilding the Linux kernel on your VPS as much as possible. Not that it can't be done, but you can save some time and probably get faster technical support if you don't do it.

Among the alternatives to OpenVPN, there is Openswan, a code fork of the original FreeS/WAN Project, but it requires a kernel patch to support NAT traversal, according to its wiki (wiki.openswan.org/index.php/Openswan/Install).

The OpenVPN protocol also is firewall-friendly, as it can pass all traffic over a single UDP tunnel (the default port is 1194). That feature, coupled with SSL encryption, makes this solution very difficult to attack when data packets pass through the public Internet.

OpenVPN turned out to be a great choice and offered us all the functionality we expected, except for one very important feature, fault tolerance. When you use a VPN to provide corporate network access to remote users, the solution is very simple—you deploy several OpenVPN servers and configure each server with its own network segment (for example, server 10.5.0.0 255.255.0.0 and server 10.6.0.0 255.255.0.0). In a typical scenario, the dynamic IP address assigned to a remote user will not matter much, as long as you configure firewalls, applications and services to allow both subnets.

When you build a multisourced infrastructure, however, this is not an acceptable solution, unless you want servers to change their IP addresses from time to time. To satisfy redundancy and fault-tolerance requirements, we needed an active-active pair of OpenVPN servers to share a common address space—all hosts must be able to access each other by static IP addresses at all times, no matter which OpenVPN server provides connectivity at either end of the communication. Then, if we lose one OpenVPN server, the other will provide all connectivity. And, if they are both up, both will be accepting connections from clients to share the load. This feature was not available as a part of the OpenVPN source distribution, so we developed a standalone dynamic routing dæmon to facilitate active-active load balancing. You can find its source code, along with useful links, use-case scenarios and mailing lists, at www.cohesiveft.com/multisourced-infra.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Cube-routed on GitHub

Ben Standefer's picture

Dmitry has been very responsive and helpful, and has made cube-routed available for download at http://cohesiveft.com/dnld/cube-routed-0.1.tar.gz

I added it to my GitHub account here: http://github.com/aguynamedben/cube-routed Hope this helps readers of this article!

-Ben

Cube-routed on GitHub

Ben Standefer's picture

Dmitry has been very responsive and helpful, and has made cube-routed available for download at http://cohesiveft.com/dnld/cube-routed-0.1.tar.gz

I added it to my GitHub account here: http://github.com/aguynamedben/cube-routed Hope this helps readers of this article!

-Ben

cube-routed de-open-sourced?

Ben Standefer's picture

I followed along through half this article and it looks like cube-routed has been de-open-sourced by Dmitry/CohesiveFT! I can't find it anywhere, Dmitry's GitHub, cohesiveft.com, or anywhere. Looks like they pulled a really non-FOSS maneuver, please tell me I'm wrong. =( There should at least be a disclaimer at the beginning of the article saying it's impossible to complete.

Ben Standefer

Good Solution on Multisourced Infrastructure

Ahamed Bauani's picture

Hello Dmitriy Samovskiy

Your article is the thing which I was looking for a long time. I have different service on several data center and your article help me to communicate each other in secure way. - Thanks Man!

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState