Protecting Your Site with Access Controls
One of the wonderful things about the Web is that so much information is freely available. For the cost of a telephone call and a monthly bill from your Internet service provider, you can read hundreds of newspapers, get updates on the computer industry and listen to radio stations from your home town.
Even the most open, freely available site usually contains one or more sections that are not meant for public consumption. The reasons for cordoning off sections of the site can vary: Perhaps the webmaster wants a place to put his favorite hacks, a repository for testing new programs or a directory in which staff notices can be placed. If a site wants to charge for content or restrict access to members of an organization, the problem becomes even more obvious.
One popular way to handle these problems is to create a directory with a name that others are unlikely to guess. But this approach, known as “security through obscurity”, works only as long as no one leaks the name of the hidden directory. A far more robust approach is to restrict access based on user name/password combinations.
This month, we will look at ways to restrict access to your server with the Web's standard user name/password authorization scheme. The principles should apply to any web server, but I will be using the freely available Apache web server (available at http://www.apache.org/) in my examples.
Access restrictions are part of HTTP, the protocol used in most web transactions. When your browser requests a document from a server using HTTP, the document is usually returned immediately, preceded by several headers (i.e., name/value pairs) describing its length, the date on which it was last modified and the type of content it contains.
HTTP's designers recognized that webmasters might want to restrict access to one or more directories. Since version 1.0, HTTP has included provisions for restricting access to parts of a web site.
Let's see how this protection works from a computer's view, first by looking at an unprotected site and then by looking at a protected one. Once we understand how access protection works, we can incorporate it into our own work.
Everything starts when a user asks the browser to retrieve a document. Whether the user types the URL into a text field, selects it from a list of bookmarks or clicks on a hyperlink in an existing page of HTML, the effect is the same. The browser takes the URL, dissects it into a protocol, a server and a document, and takes the appropriate action. In the case of a URL such as:
http://www.ssc.com/lj/
the protocol name is http, the server name is www.ssc.com, and the document name (really a directory) is /lj/. Most Web servers are configured such that requesting a directory is the same as requesting the file index.html within that directory, so the above URL is effectively equivalent to this one:
http://www.ssc.com/lj/index.html
We can simulate the browser's actions by dissecting the URL on our own and by requesting the document /lj/ from www.ssc.com using HTTP from the Linux command line. The telnet program is generally used to log into a remote machine, most often to open a shell on that machine. By giving telnet an argument in addition to the machine name, we can specify the port to which we wish to connect. Since web servers sit on port 80 by default, we can connect to the web server on www.ssc.com by typing:
telnet www.ssc.com 80
When we establish a connection to that web server, we can enter an HTTP request. These requests start with a line describing the action we wish to take (known as a “method”), the name of the document we wish to retrieve and the version of HTTP we are using. Beginning with HTTP 1.0, this initial line can be followed by one or more header lines containing information about the user's browser, document types that the browser is willing to accept, HTTP cookies that may have been set in the past and other useful bits of information. For our purposes, it is enough to enter this line:
GET /lj/ HTTP/1.0
and then press Enter twice: once to end the line containing the request, and a second time to indicate that we have finished sending all of the headers and that we will now wait for a response from the server.
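The manual steps above (dissecting the URL, connecting to port 80 and sending a request line followed by an empty line) can also be sketched in a few lines of Python. This is only an illustration using the standard library; the function names dissect, build_request and fetch are my own, and fetch assumes that the server is reachable and still answers plain HTTP on port 80.

```python
import socket
from urllib.parse import urlsplit


def dissect(url):
    """Split a URL into protocol, server, port and document path."""
    parts = urlsplit(url)
    port = parts.port or 80      # web servers sit on port 80 by default
    path = parts.path or "/"     # an empty path means the top-level document
    return parts.scheme, parts.hostname, port, path


def build_request(path):
    """Build a bare HTTP/1.0 GET request with no extra headers."""
    # The first CRLF ends the request line; the second (empty) line
    # tells the server that no more headers will follow.
    return f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")


def fetch(url, timeout=10):
    """Send the request and return the raw response (needs network access)."""
    _scheme, host, port, path = dissect(url)
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:         # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    print(dissect("http://www.ssc.com/lj/"))
    print(build_request("/lj/"))
```

Calling fetch("http://www.ssc.com/lj/") would then play the role of the telnet session, returning the headers and the document as one block of bytes.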
If all goes well, the server will respond by returning a page of HTML. In this particular case, we will receive HTML-formatted text (as we can tell from the text/html Content-Type header at the top of the response) with the latest information about this very magazine. Your browser is responsible for taking the HTML returned by the server and displaying it for you.
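To see how a program might separate the headers from the HTML, here is a minimal sketch in Python. The function name split_response is hypothetical, and the sample response is made up for illustration; a real server will send more headers and a much longer document.

```python
def split_response(raw):
    """Split a raw HTTP response into status line, headers and body."""
    # An empty line (CRLF CRLF) separates the headers from the document.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Each header is a name/value pair separated by a colon.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Made-up response for illustration only.
sample = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 28\r\n"
    b"\r\n"
    b"<html><body>Hi</body></html>"
)

if __name__ == "__main__":
    status, headers, body = split_response(sample)
    print(status)
    print(headers["content-type"])
    print(body.decode("ascii"))
```

Header names are stored in lowercase here because HTTP treats them as case-insensitive, so a lookup such as headers["content-type"] works however the server capitalizes the name.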