Protect Your Ports with a Reverse Proxy
In a previous article, I discussed Apache Tomcat, which is the ideal way to run Java applications from your server. I explained that you can run those apps from Tomcat's default 8080 port, or you can configure Tomcat to use port 80. But, what if you want to run a traditional Web server and host Java apps on port 80? The answer is to run a reverse proxy.
The only assumption I make here is that you have a Web-based application running on a port other than port 80. This can be a Tomcat app, like I discussed in my last article, or it can be any application that presents its interface over the Web (such as Transmission, Sick Beard and so on). The other scenario I cover here is running a Web app on a second server, possibly even on port 80, that you want to be reachable through your central Web server. (This is particularly useful if you have only one static IP to use for hosting.)
The way reverse proxying works, at least with the Apache Web server, is that every application is configured as a virtual host. Just like you can host multiple Web sites from a single server using virtual hosting, you also can host separate Web apps as virtual hosts from that same server. It's not terribly difficult to configure, but it's very useful in practice.
First things first. On your server, you have the Web server installed (Figure 1). You also have a Web application on port 8080 (Figure 2). Along with the working Apache Web server, you need to make sure virtual hosting (by name) is enabled.
Figure 1. I have Apache installed, and it's hosting a very simple page on port 80.
Figure 2. I have a Web application running on port 8080 on the server located at 192.168.1.11.
Enabling Name-Based Virtual Hosts
Enabling name-based virtual hosting on Apache is extremely common, and it's very simple to do. Depending on what distribution you're using, the "proper" location for enabling name-based virtual hosting may differ. The nice thing about Apache, however, is that generally as long as the directive is specified somewhere in the configurations, Apache will honor it.
My local test server is running Ubuntu. In order to determine where the "proper" place to enable name-based virtual hosting is, I simply went to the /etc/apache2 directory and executed:
grep NameVirtualHost *
That command searches for the NameVirtualHost directive, and it returned this:
root@server:/etc/apache2# grep NameVirtualHost *
ports.conf:NameVirtualHost *:80
ports.conf: # If you add NameVirtualHost *:443 here, you will also have to change
Those results tell me that the NameVirtualHost directive is specified in the /etc/apache2/ports.conf file. (Note that grep will return only the lines that contain the search term, which is why it shows those two out-of-context lines above. The important thing is the filename ports.conf, which is what I was looking for.) Again, with Apache, it generally doesn't matter where you specify directives, but I like to stick with the standards of the particular distribution I'm using, if only for the sake of future administrators.
To enable name-based virtual hosting, you simply uncomment (if it is commented out):
NameVirtualHost *:80
in the file, and save it. If you can't find a file that contains such a directive, just add the line to your apache2.conf or httpd.conf file. (Apache 2.4 and later treat any address and port shared by more than one virtual host as name-based automatically, so the directive is not needed there.) Then you need to specify a VirtualHost directive for the virtual host you want to create. This process is the same whether you're making a traditional virtual host or a reverse proxy virtual host.
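The grep command shown earlier looks only at files in the current directory. If the directive lives in a subfolder, a recursive search can help. Here is a minimal sketch; the function name find_directive is my own invention, and the default path /etc/apache2 is an assumption that fits Ubuntu but may differ on your distribution:

```shell
# find_directive DIRECTIVE [CONFIG_DIR]
# Hypothetical helper: search a configuration tree for a directive.
# CONFIG_DIR defaults to /etc/apache2 (an assumption; adjust as needed).
find_directive() {
    directive="$1"
    conf_dir="${2:-/etc/apache2}"
    # -r descends into subdirectories, -n prints the line number of each match
    grep -rn -- "$directive" "$conf_dir"
}
```

Calling find_directive NameVirtualHost should list every file and line number where the directive appears, commented out or not.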
Creating a Virtual Host
As in the previous section of this article, it's important to note that the Apache configuration file layout will vary with distributions. In Ubuntu, there are two folders: sites-available and sites-enabled. The first has text files with snippets of configuration defining the individual virtual hosts, and the second has symbolic links to the files located in the sites-available folder. This seems complicated to be sure, but it's actually for convenience's sake. You can define as many virtual hosts as you want in the sites-available folder, but until they're symbolically linked into the sites-enabled folder, they're not parsed by Apache.
Let's create a virtual host, but instead of making a traditional virtual host that defines a directory to look for files, let's define reverse proxy rules. Here is the file I created in sites-available (I explain each line next):
root@server:/etc/apache2# cat sites-available/reverseprox
<VirtualHost *:80>
    LoadModule proxy_module modules/mod_proxy.so
    LoadModule proxy_http_module modules/mod_proxy_http.so
    ServerName sab.mydomain.com
    ServerAlias sab
    ProxyRequests Off
    ProxyPass / http://192.168.1.11:8080/
    ProxyPassReverse / http://192.168.1.11:8080/
</VirtualHost>
First off, if it's not clear, the name of the file I created is "reverseprox", and I created it in the /etc/apache2/sites-available folder. If you are using a different distribution, you may not have this sort of folder setup. You actually can add the VirtualHost directives directly to the apache2.conf or httpd.conf file. Ubuntu just uses the folder structure for clarity and convenience.
Here's the line-by-line breakdown:
<VirtualHost *:80>: this opens the stanza, and it means "listen on all IP addresses on port 80 for anyone requesting my server name".
LoadModule proxy_module modules/mod_proxy.so and LoadModule proxy_http_module modules/mod_proxy_http.so: these lines load two separate modules. Note that although the module names look similar, they actually are two modules: mod_proxy and mod_proxy_http. Modules often are loaded globally in another configuration file instead, and that is the safer place for them. The Apache documentation lists LoadModule as valid only in the main server configuration, so if Apache rejects these lines inside the VirtualHost stanza, move them above it or, on Ubuntu, enable the modules with sudo a2enmod proxy proxy_http. (Note: if you get an error about "file not found" during startup, you might need to make a symbolic link to your system's modules folder. On my Ubuntu system, that means sudo ln -s /usr/lib/apache2/modules /etc/apache2/.)
ServerName sab.mydomain.com: this is the domain name the virtual host should listen for. If a request comes into Apache for "sab.mydomain.com", it knows to use this virtual host declaration to respond. Of course, "sab.mydomain.com" is a generic example; you should use your actual domain name.
ServerAlias sab: it's possible to have multiple ServerAlias statements, but in this case, there's only one. I've added "sab" all by itself as an alias for Apache to listen for. It will treat a request for "sab" the same way it treats a request for "sab.mydomain.com". This is simply an alias.
ProxyRequests Off: this is actually the default setting for the ProxyRequests directive. I always add it to my VirtualHost stanza anyway to make sure I'm not inadvertently allowing someone to use my server as an anonymous proxy. ProxyRequests On would allow others with access to your server to use it as a forward proxy, effectively hiding themselves from the Internet and making you responsible for their surfing! Hopefully, it's clear why I specify "Off", even though it's the default setting.
ProxyPass / http://192.168.1.11:8080/: this tells Apache that when someone requests anything under the root-level folder of this virtual host, it should fetch it from the address listed and serve the result. From end users' perspectives, the alternate port, and possibly the alternate server address, will be hidden. They'll see only the URL they entered to get to the virtual host. You can have multiple ProxyPass directives if you want a specific subfolder to be directed elsewhere. Apache is very flexible with what you can specify in a reverse proxy situation.
ProxyPassReverse / http://192.168.1.11:8080/: this rule completes the reverse proxy. When the proxied server (in this case, the server listening on port 8080) sends back a redirect, Apache rewrites the URL in the Location, Content-Location and URI response headers so that it points at the virtual host rather than at the underlying server. Without it, a redirect from the back end would expose the internal address and send users around the proxy. Note that it adjusts only those headers; it does not rewrite links inside the page content.
</VirtualHost>: this closes the stanza, or the section defining the virtual host. In Ubuntu, this is a single file in the sites-available folder. It also could just be something tacked onto the end of the apache2.conf or httpd.conf file in another distribution.
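To illustrate the point about multiple ProxyPass directives, here is a rough sketch of a virtual host that sends one subfolder to a different back end. Everything specific in it is hypothetical: the file name reverseprox-multi, the server name apps.mydomain.com, the /media/ path and the second back end at 192.168.1.12:8081 are placeholders I made up. It also assumes mod_proxy and mod_proxy_http already are loaded globally. The snippet writes the file into the current directory, so you can inspect it before copying it into sites-available:

```shell
# Write a hypothetical virtual host definition to the current directory.
# Names, paths and addresses below are placeholders, not from a real setup.
cat > reverseprox-multi <<'EOF'
<VirtualHost *:80>
    ServerName apps.mydomain.com
    ProxyRequests Off

    # Rules are checked in the order they appear and the first match wins,
    # so the more specific /media/ path is listed before the catch-all /.
    ProxyPass        /media/ http://192.168.1.12:8081/
    ProxyPassReverse /media/ http://192.168.1.12:8081/

    # Everything else goes to the application on port 8080.
    ProxyPass        / http://192.168.1.11:8080/
    ProxyPassReverse / http://192.168.1.11:8080/
</VirtualHost>
EOF
```

If the catch-all rule came first, it would match every request, and the /media/ rule never would be reached.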
Making It All Work
Once you've created the virtual host declaration for the reverse proxy site, you need to reload Apache. Remember, if you're using Ubuntu, you need to create a symbolic link so that Apache reads your configuration from the sites-enabled folder. To do that, go into the sites-enabled folder, and type:
ln -s ../sites-available/reverseprox .
This creates a symbolic link in the sites-enabled folder that points to the reverseprox file you created in the sites-available folder. If you're using another distribution and just tacked that stanza onto the end of the apache2.conf or httpd.conf file, you don't need to make any symbolic links.
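If you enable sites often, the linking step can be wrapped in a small function. This is only a sketch: enable_site is a name I made up, and the default directory /etc/apache2 assumes the Ubuntu layout described above:

```shell
# enable_site SITE [APACHE_DIR]
# Hypothetical helper: link a site from sites-available into sites-enabled.
# APACHE_DIR defaults to /etc/apache2 (an assumption based on Ubuntu).
enable_site() {
    site="$1"
    apache_dir="${2:-/etc/apache2}"
    if [ ! -f "$apache_dir/sites-available/$site" ]; then
        echo "No such site definition: $site" >&2
        return 1
    fi
    # The relative target is resolved from inside sites-enabled,
    # so it points at the sibling sites-available folder.
    ln -s "../sites-available/$site" "$apache_dir/sites-enabled/$site"
}
```

Run as root, enable_site reverseprox should have the same effect as the ln command above. Debian and Ubuntu also ship an a2ensite command for this job, although newer releases expect the file name to end in .conf.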
Next, reload Apache. I actually prefer to restart Apache to make sure it loads up everything correctly, but a reload should do the trick. In Ubuntu, I do this:
sudo service apache2 restart
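A typo in a virtual host file can keep Apache from starting again after a restart, so it can be worth checking the syntax first. The sketch below assumes the apache2ctl and service commands found on Ubuntu; restart_if_valid is a hypothetical name of my own:

```shell
# restart_if_valid
# Hypothetical helper: restart Apache only if the configuration parses cleanly.
# Assumes Ubuntu's apache2ctl and service commands and a working sudo.
restart_if_valid() {
    if sudo apache2ctl configtest; then
        sudo service apache2 restart
    else
        echo "Configuration test failed; Apache was not restarted." >&2
        return 1
    fi
}
```

If configtest reports a problem, the running server is left alone, which gives you a chance to fix the file first.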
And, the reverse proxy should be ready to go. You just need to make sure your DNS points correctly to the server. The quickest way to do that, and make sure stuff is working, is to add a simple line to your workstation's /etc/hosts file. I added this:
192.168.1.11 sab sab.mydomain.com
And, then I saved it. Next, I opened a browser, and surfed to "sab" instead of 192.168.1.11:8080, and Figure 3 shows the results. Success!
Figure 3. Now I can access that Web application without entering any port number at all! Plus, it gets its own domain name!
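You also can test the virtual host without touching /etc/hosts by sending the Host header yourself. Here is a minimal sketch using curl; check_vhost is a hypothetical name, and the host and address in the usage note are the example values from this article:

```shell
# check_vhost VHOST SERVER
# Hypothetical helper: request / from SERVER while asking for VHOST by name,
# then print only the HTTP status code of the response.
check_vhost() {
    vhost="$1"
    server="$2"
    # -s silences progress output, -o /dev/null discards the body,
    # -w prints the status code, -H sets the Host header Apache matches on
    curl -s -o /dev/null -w '%{http_code}\n' -H "Host: $vhost" "http://$server/"
}
```

For example, check_vhost sab.mydomain.com 192.168.1.11 asks the Apache server on port 80 for the proxied site by name. A status code in the 200 or 300 range suggests the request reached the virtual host, although it is still worth loading the page in a browser.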
The great thing about using Apache's reverse proxy technique is that you're not limited to redirecting only to the same server on a different port. You can make a reverse proxy so that google.yourdomain.com returns the actual Google search engine. You'll just create a virtual host for google.yourdomain.com, and set the ProxyPass and ProxyPassReverse directives to point to http://www.google.com/. It's truly simple. In fact, a reverse proxy on your local network might be a way to provide access to an otherwise blocked Web site for your users. What if your Web-filtering policies blocked a particular news site, but your server had access? You could create a reverse proxy on your server that your users could connect to and get to the site without being filtered by your Web filter! (Another word of caution: this is why it's important to set ProxyRequests to Off, so they don't use your reverse proxy to circumvent all Web filtering!)
With reverse proxies, it's possible to make your Web infrastructure much less confusing for your end users. It also allows you to make changes to your underlying Web apps without affecting your users at all. If a service changes IP addresses or ports, you simply can adjust your reverse proxy definitions, and end users never will know the difference. Reverse proxies are easy to configure and simple to maintain. They will help keep your URLs clean and your systems easy to manage!