Building a Multisourced Infrastructure Using OpenVPN
Have you ever needed to expand your colocated servers at more than one provider and allow applications to communicate as if they were on the same LAN, possibly over multiple sets of firewalls and layers of NAT? Or, maybe you've wanted to move from one hosting service to another to take advantage of lower pricing or better uptime but would have preferred to do it gradually instead of in a single swoop (and a weekend-long maintenance window)? Or, maybe you've considered the Amazon EC2 cloud to host part, but not all, of your infrastructure? If your answer to any of these questions is yes, what you want is essentially a multisourced infrastructure.
The Amazon EC2
The Amazon EC2 (Elastic Compute Cloud) is a Web service that allows users to provision new machines in an Amazon-hosted virtualized infrastructure in a matter of minutes, using a publicly available API. Users get full root access and can install almost any OS or application in their Amazon Machine Images. Web service APIs allow users to reboot their instances remotely and scale capacity quickly if necessary, by adding tens or even hundreds of machines. Additionally, there is no up-front hardware setup costs—Amazon charges only for the capacity you actually use; there is no minimum fee. As more applications find their way to Amazon's virtual computing environment, system administrators are looking for ways to provide secure connectivity over the public Internet between new machines in the Amazon EC2 and old machines in their regular data centers. This article describes one such technique—how to build a multisourced infrastructure based on OpenVPN.
Let's take a look at a simple distributed application, which consists of multiple services, a LAMP stack. Traditionally, you would start with Apache and MySQL on a single server. As your site grows, you would provision another server from your provider and add a second Apache instance. Later, you might want to provision yet another machine to be a dedicated database server to improve performance. This is a typical single-sourced infrastructure—all services run within a single physical environment, controlled and supported by a single provider.
In contrast, with a multisourced infrastructure, you no longer are limited to one provider or one data center. You are free to mix and match hosting plans from different providers to suit your business and architecture better, and you can use as many providers as you like. Your applications still can communicate with one another, but instead of having a physical LAN, it's now a virtual LAN that sits on top of public Internet links. You can grow your services horizontally and achieve better geographic redundancy and fault tolerance at the same time, all without significant changes in your application. If it works in a single-sourced physical LAN, it most likely will work in multisourced virtual LAN as well.
Additionally, you can leverage the strengths of a particular provider for just a subset of your services. Going back to the LAMP stack as our example, with Amazon EC2, you can provision many Apache instances in response to the current load quickly; although you might prefer to run MySQL on bare metal elsewhere instead of in an EC2 virtual machine.
Finally, this method allows you to expand your corporate infrastructure outside your current data center or allow outside services to use applications in your corporate data center. Consider a remotely hosted data-crunching cluster that you rent by the hour, which uses your corporate data warehouse system for its input. As you can see, a multisourced infrastructure is more flexible and can accommodate various scenarios and needs.
In this article, I describe a particular implementation of the multisourced infrastructure concept that we at CohesiveFT (www.cohesiveft.com) developed using OpenVPN and that has been running in our production environment since mid-summer 2007. We chose OpenVPN primarily because it uses standard OpenSSL encryption, runs on multiple operating systems and does not require kernel patching or additional modules. The latter benefit is of key importance. Many Virtual Private Server (VPS) hosting solutions currently provide great service with pricing that is often better than other forms of hosting. These providers build guest OS kernels specifically tailored for their environment and method of virtualization. As a result, you probably want to avoid rebuilding the Linux kernel on your VPS as much as possible. Not that it can't be done, but you can save some time and probably get faster technical support if you don't do it.
Among the alternatives to OpenVPN, there is Openswan, a code fork of the original FreeS/WAN Project, but it requires a kernel patch to support NAT traversal, according to its wiki (wiki.openswan.org/index.php/Openswan/Install).
The OpenVPN protocol also is firewall-friendly, as it can pass all traffic over a single UDP tunnel (the default port is 1194). That feature, coupled with SSL encryption, makes this solution very difficult to attack when data packets pass through the public Internet.
OpenVPN turned out to be a great choice and offered us all the functionality we expected, except for one very important feature, fault tolerance. When you use a VPN to provide corporate network access to remote users, the solution is very simple—you deploy several OpenVPN servers and configure each server with its own network segment (for example, server 10.5.0.0 255.255.0.0 and server 10.6.0.0 255.255.0.0). In a typical scenario, the dynamic IP address assigned to a remote user will not matter much, as long as you configure firewalls, applications and services to allow both subnets.
When you build a multisourced infrastructure, however, this is not an acceptable solution, unless you want servers to change their IP addresses from time to time. To satisfy redundancy and fault-tolerance requirements, we needed an active-active pair of OpenVPN servers to share a common address space—all hosts must be able to access each other by static IP addresses at all times, no matter which OpenVPN server provides connectivity at either end of the communication. Then, if we lose one OpenVPN server, the other will provide all connectivity. And, if they are both up, both will be accepting connections from clients to share the load. This feature was not available as a part of the OpenVPN source distribution, so we developed a standalone dynamic routing dæmon to facilitate active-active load balancing. You can find its source code, along with useful links, use-case scenarios and mailing lists, at www.cohesiveft.com/multisourced-infra.
Today’s modular x86 servers are compute-centric, designed as a least common denominator to support a wide range of IT workloads. Those generic, virtualized IT workloads have much different resource optimization requirements than hyperscale and cloud applications. They have resulted in a “one size fits all” enterprise IT architecture that is not optimized for a specific set of IT workloads, and especially not emerging hyperscale workloads, such as web applications, big data, and object storage. In this report, you will learn how shifting the focus from traditional compute-centric IT architectures to an innovative disaggregated fabric-based architecture can optimize and scale your data center.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
| Non-Linux FOSS: Seashore | May 10, 2013 |
| Trying to Tame the Tablet | May 08, 2013 |
| Dart: a New Web Programming Experience | May 07, 2013 |
- RSS Feeds
- New Products
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Home, My Backup Data Center
- Developer Poll
- Dart: a New Web Programming Experience
- Readers' Choice Awards
- What's the tweeting protocol?
- Linux is good
1 hour 6 min ago - Reply to comment | Linux Journal
1 hour 23 min ago - Web Hosting IQ
1 hour 53 min ago - Web Hosting IQ
1 hour 54 min ago - Web Hosting IQ
1 hour 55 min ago - Reply to comment | Linux Journal
4 hours 55 min ago - play with linux? i think you mean work-around linux
13 hours 21 min ago - Where is Epistle?
13 hours 27 min ago - You forgot OwnCloud
13 hours 57 min ago - aplikasi free
17 hours 11 min ago





Comments
Cube-routed on GitHub
Dmitry has been very responsive and helpful, and has made cube-routed available for download at http://cohesiveft.com/dnld/cube-routed-0.1.tar.gz
I added it to my GitHub account here: http://github.com/aguynamedben/cube-routed Hope this helps readers of this article!
-Ben
Cube-routed on GitHub
Dmitry has been very responsive and helpful, and has made cube-routed available for download at http://cohesiveft.com/dnld/cube-routed-0.1.tar.gz
I added it to my GitHub account here: http://github.com/aguynamedben/cube-routed Hope this helps readers of this article!
-Ben
cube-routed de-open-sourced?
I followed along through half this article and it looks like cube-routed has been de-open-sourced by Dmitry/CohesiveFT! I can't find it anywhere, Dmitry's GitHub, cohesiveft.com, or anywhere. Looks like they pulled a really non-FOSS maneuver, please tell me I'm wrong. =( There should at least be a disclaimer at the beginning of the article saying it's impossible to complete.
Ben Standefer
Good Solution on Multisourced Infrastructure
Hello Dmitriy Samovskiy
Your article is the thing which I was looking for a long time. I have different service on several data center and your article help me to communicate each other in secure way. - Thanks Man!