Multitenant Sites

For some time now, there has been tremendous growth in the world of Web applications. It's quite amazing to see what you can do just via a Web browser: not only can you buy just about anything, but a growing number of sites also offer "software as a service", often abbreviated as SaaS. The idea is that in exchange for a monthly fee, you get access to a service. Many thousands of such services exist, covering everything from Git repositories (for example, GitHub and Bitbucket) and e-mail services (for example, AWeber and MailChimp) to invoicing, time-tracking, calendar, e-commerce and e-learning systems. You name it.

As a Web developer, you can create your own SaaS application. That's right: with little more than a Linux box, a database, a programming language and a Web framework, you're positioned to create a new SaaS application. With a good idea, some hard work and good marketing, you'll be on your way to having a successful business.

There are numerous models for how SaaS can work. Sometimes, you have a user name on a system, and you're simply interacting with your view of the world. But sometimes, an SaaS app gives you what appears to be an entirely new domain. So if I get an account on SuperDuperSaas.com, everything I do will be under lerner.SuperDuperSaas.com.

Programs allowing for this are known as "multitenant" applications. It's possible, of course, that each new subdomain involves the rollout of a new virtual machine. But there also are ways that you can make a single computer, with a single instance of the application, provide the same illusion of an infinite number of domains. Moreover, doing so is not nearly as difficult as you might think.

In this article, I look at several techniques that make it possible for you to create and maintain such multitenant applications. These techniques can be used in an SaaS product or any other application in which the software can and should respond differently to a variety of hostnames or domain names.

It's All Thanks to HTTP

HTTP, the Hypertext Transfer Protocol, is so ubiquitous that most people barely give it any thought. Even someone like me, who works nearly every day on Web applications and knows that HTTP exists and what it does, doesn't think about it too much. However, multitenant applications owe their existence to a change made to HTTP in response to the Web's growth in its earliest days.

The first version of HTTP that I encountered, back in 1993, was described as version 0.9. That version was a far simpler protocol than the one we know today: it supported only the GET action, with no request headers and no version number (POST arrived later, with HTTP 1.0). That is, you could connect to an HTTP server on port 80, and say:


GET /

The server would, if all went well, send the contents of its home page (typically formatted with HTML) back to the HTTP client. At that point, the connection would close.

Although HTTP 0.9 worked well for many simple cases, the explosive growth of the Web meant that it wasn't good enough for many complex ones. One particularly common, and particularly painful, case was that of Web hosting companies: because an HTTP 0.9 request carried no hostname, each Web site effectively needed its own IP address. If you set up a Linux-based server with a single IP address but multiple hostnames, it wasn't possible for the HTTP server to distinguish between them.

This changed once requests could carry headers. HTTP 1.0 introduced request headers, clients began sending a "Host" header along with the action and pathname, and HTTP 1.1 later made that header mandatory. Now, a simple request looked like:


GET / HTTP/1.0
Host: lerner.co.il

The first line changed, such that it incorporated the version number of HTTP that was being used. A request line without a version number is treated as HTTP 0.9, so servers can still recognize and answer HTTP 0.9 clients. The second line was defined to be the first of several "request headers", name-value pairs that could be sent from the client to the server.

These request headers have grown in scope through the years, and now include everything from the hostname to cookies to content type to caching information. But for my purposes in this article, the most important part of this request was the "Host" request header. Given that a server now could distinguish between different hosts, even on the same IP address, it was possible to have a single server provide Web hosting capabilities for any number of different domains and hostnames.
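To make the role of the "Host" header concrete, here is a minimal sketch in Python of how a server might pull it out of a raw request. The function name parse_request is my own invention for illustration, and a real HTTP server handles many cases (malformed lines, folded headers, request bodies) that this sketch ignores.

```python
def parse_request(raw):
    """Split a raw HTTP request into (method, path, version, headers).

    Illustrative only: no error handling, no request body.
    """
    # Everything before the first blank line is the request line plus headers.
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")

    parts = lines[0].split()
    method, path = parts[0], parts[1]
    # An HTTP 0.9 request line has no version token at all.
    version = parts[2] if len(parts) > 2 else "HTTP/0.9"

    headers = {}
    for line in lines[1:]:
        if not line:
            continue
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


request = "GET / HTTP/1.0\r\nHost: lerner.co.il\r\n\r\n"
method, path, version, headers = parse_request(request)
print(version, headers.get("host"))
```

An old-style request such as "GET /" yields an empty header dictionary, which is exactly why a server of that era had no way to tell which hostname the client wanted.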

In other words, it was now possible to have the same Web server provide hosting to CompanyA.com and CompanyB.com, without either knowing of or seeing each other. The Web server would know to route requests for CompanyA.com to one directory of programs and HTML files, and CompanyB.com to a second, completely separate directory of programs and HTML files.

This might be obvious to anyone who knows about domains, hostnames and DNS, but from the perspective of the server, it didn't matter if it had to distinguish CompanyA.com from CompanyB.com, or abc.CompanyA.com from def.CompanyA.com. That is, different hostnames within the same domain were treated similarly to different domains. True, DNS and HTTP server configuration files made it easier to send *.CompanyA.com to the same location, but at the end of the day, your HTTP server sees different hostnames and, thus, can react differently.
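The routing decision described above can be sketched in a few lines of Python. This is not how Apache or nginx implements virtual hosts; the table, the directory paths and the function name document_root are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical virtual-host table: hostname pattern -> document root.
# Patterns are checked in order, and the first match wins.
VIRTUAL_HOSTS = [
    ("www.companyb.com", "/var/www/companyb"),
    ("*.companya.com", "/var/www/companya"),
]
DEFAULT_ROOT = "/var/www/default"


def document_root(host_header):
    """Pick a document root based on the value of the Host header."""
    # Drop an optional ":port" suffix and normalize case, because
    # hostnames are case-insensitive. IPv6 literals are not handled.
    hostname = host_header.split(":", 1)[0].strip().lower()
    for pattern, root in VIRTUAL_HOSTS:
        if fnmatchcase(hostname, pattern):
            return root
    return DEFAULT_ROOT


print(document_root("abc.CompanyA.com"))
print(document_root("def.companya.com:8080"))
print(document_root("www.CompanyB.com"))
```

Note that abc.CompanyA.com and def.CompanyA.com land in the same directory only because of the wildcard entry. As far as the lookup is concerned, they are simply two different hostnames.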

"Virtual hosts", as they became known, shared an IP address and a computer, and so from the perspective of a programmer or IT manager, they were all under the same umbrella. From the perspective of the outside world, these were completely different Web sites. Perhaps they shared an IP address, and thus a hosting provider, but that was the only thing they had in common.

Multitenant

Today, it's trivial to service different hostnames under the same HTTP server. As I indicated previously, you simply tell Apache (or nginx, or whatever HTTP server you use) that the two hosts exist in different directories, and that they should be treated differently. With such a configuration in place, there is no connection whatsoever between the different hostnames. This actually makes it easier to move Web sites from one machine to another. You scoop up the virtual host's configuration file and move it to another machine, along with the programs and static assets—that is, HTML files and images.

Indeed, a huge industry of cheap, on-demand Web hosting perhaps has made this the most common way servers are allocated and used. Even my own personal server has five to ten different virtual hosts on it at any given time, between personal projects and demos of client applications.

A multitenant application turns this idea on its head. Rather than using a single server, with a single IP address, to service a large number of different applications, each with its own hostname, you will have many different hostnames serviced by a single application. That is, you'll have both CompanyA.com and CompanyB.com point not only to the same IP address, but also to the same instance of your Web application.

This might sound strange, until you consider that because modern versions of HTTP always pass a "Host" header, and because all of the HTTP request headers are available to a Web application, you can write a single application that will work on multiple hosts. Consider that BigCompany.com has two different divisions, one on each coast, and a separate Web site for each division. The site should be completely identical in both cases, except that the contact phone number and address should reflect the coast that the user has reached.

You can use the "Host" request header in an "if" statement inside the application, and thus display the information that is appropriate. This is a classic example of multitenant sites, although it's certainly not the most complex of them.
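Here is a minimal sketch of that "if" statement, written as a plain Python WSGI application. The hostnames east.bigcompany.com and west.bigcompany.com, as well as the office details, are made up for this example, and a real application would more likely look up tenants in a database than hard-code them.

```python
def application(environ, start_response):
    """One WSGI application that answers differently per hostname."""
    # WSGI exposes the Host request header as HTTP_HOST. Drop any
    # ":port" suffix and normalize case before comparing.
    host = environ.get("HTTP_HOST", "").split(":", 1)[0].lower()

    # Hypothetical hostnames and contact details, for illustration only.
    if host == "east.bigcompany.com":
        status, body = "200 OK", "New York office: +1 212 555 0100"
    elif host == "west.bigcompany.com":
        status, body = "200 OK", "San Francisco office: +1 415 555 0100"
    else:
        status, body = "404 Not Found", "Unknown site"

    payload = body.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

To try it out, you can pass application to wsgiref.simple_server.make_server from the Python standard library and send requests with different "Host" headers. Everything outside the "if" statement is shared, which is the whole point: one code base, one running instance, several sites.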

______________________

Reuven M. Lerner, Linux Journal Senior Columnist, a longtime Web developer, consultant and trainer, is completing his PhD in learning sciences at Northwestern University.