Working with LWP
After typing return a second time, you should see the contents of http://www.lerner.co.il/ returned to you. Once the document has been transferred to your terminal, the connection is terminated. If you want to connect to the same server again, you may do so. However, you will have to issue a new connection and a new request.
Just as the client can send request headers before the request itself, the server can send response headers before the response. As in the case with request headers, there must be a blank line between the response headers and the body of the response.
Here are the headers I received after issuing the above GET request:
HTTP/1.1 200 OK Date: Thu, 12 Aug 1999 19:36:44 GMT Server: Apache/1.3.6 (UNIX) PHP/3.0.11 FrontPage/220.127.116.11 Rewrit/1.0a Connection: close Content-Type: text/html
The above lines are typical for a response.
The first line produces general information about the response, including an indication of what is yet to come. First, the server tells us it is capable of handling anything up to HTTP/1.1. If we ever want to send a request using HTTP/1.1, this server will allow it. After the HTTP version number comes a response code. This code can indicate a variety of possibilities, including whether everything went just fine (200), the file has moved permanently (301), the file was not found (404), or there was an error on the server side (501).
The numeric code is typically followed by a text message, which gives an indication of the meaning behind the numbers. Apache and other servers might allow us to customize the page displayed when an error occurs, but that customization does not extend to this error code, which is standard and fixed.
Following the error code comes the date on which the response was generated. This header is useful for proxies and caches, which can then store the date of a document along with its contents. The next time your browser tries to retrieve a file, it will compare the Date: header from the previous response, retrieving the new version only if the server's version is newer.
The server identifies itself in the Server: header. In this particular case, the server tells us not only that it is Apache 1.3.6 running under a form of UNIX (in this case, Linux), but also some modules that have been installed. My web-space provider has chosen to install PHP, FrontPage and Rewrit; as we have seen in previous months, mod_perl is another popular module for server-side programming, and one which advertises itself in this header.
As we have seen, an HTTP connection terminates after the server finishes sending its response. This can be extremely inefficient; consider a page of HTML that contains five IMG tags, indicating where images should be loaded. In order to download this page in its entirety, a web browser has to create six separate HTTP connections—one for the HTML and one for each of the images. To overcome this inefficiency, HTTP/1.1 allows for “persistent connections”, meaning that more than one document can be retrieved in a single HTTP transaction. This is signalled with the Connection header, which indicated it was ready to close the connection after a single transaction in the example above.
The final header in the above output is Content-type, well-known to CGI programmers. This header uses a MIME-style description to tell the browser what kind of content to expect. Should it expect HTML-formatted text (text/html)? Or a JPEG image (image/jpeg)? Or something that cannot be identified, which should be treated as binary data (application/octet-stream)? Without such a header, your browser will not know how to treat the data it receives, which is why servers often produce error messages when Content-type is missing.
HTTP/1.0 supports many methods other than GET, but the main ones are GET, HEAD, and POST. GET, as its name implies, allows us to retrieve the contents of a link. This is the most common method, and is behind most of the simple retrievals your web browser performs. HEAD is the same as GET, but quits after printing the response headers. Sending a request of
HEAD / HTTP/1.0
is a good way to test your web server and see if it is running correctly.
POST not only names a path on the server's computer, but also sends input in name,value pairs. (GET can also submit information in name,value pairs, but it is considered less desirable in most situations.) POST is usually invoked when a user clicks on the “submit” button in an HTML form.
Now that we have an understanding of the basics behind HTTP, let's see how we can handle requests and responses using Perl. Luckily, LWP contains objects for nearly everything we might want to do, with code tested by many people.
If we simply want to retrieve a document using HTTP, we can do so with the LWP::Simple module. Here, for instance, is a simple Perl program that retrieves the root document from my web site:
#!/usr/bin/perl --w use strict; use diagnostics; use LWP::Simple; # Get the contents my $content = get "http://www.lerner.co.il/"; # Print the contents print $content, "\n";
In this particular case, the startup and diagnostics code is longer than the program. Importing LWP::Simple into our program automatically brings the get function with it, which takes a URL, retrieves its contents with GET, and returns the body of the response. In this example, we print that output to the screen.
Once the document's contents are stored in $content, we can treat it as a normal Perl scalar, albeit one containing a fair amount of text. At this point, we could search for interesting text, perform search-and-replace operations on $content, remove any parts we find offensive, or even translate parts into Pig Latin. As an example, the following variation of this simple program turns the contents around, reversing every line so that the final line becomes the first line and vice versa; and every character on every line so that the final character becomes the first and vice versa:
#!/usr/bin/perl -w use strict; use diagnostics; use LWP::Simple; # Get the contents my $content = get "http://www.lerner.co.il/"; # Print the contents print scalar reverse $content, "\n";
Note how we must put reverse in scalar context in order for it to do its job. Since print takes a list of arguments, we force scalar context with the scalar keyword.
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
|Non-Linux FOSS: Seashore||May 10, 2013|
- RSS Feeds
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- Dynamic DNS—an Object Lesson in Problem Solving
- New Products
- Validate an E-Mail Address with PHP, the Right Way
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Download the Free Red Hat White Paper "Using an Open Source Framework to Catch the Bad Guy"
- Tech Tip: Really Simple HTTP Server with Python
- Roll your own dynamic dns
3 hours 40 min ago
- Please correct the URL for Salt Stack's web site
6 hours 52 min ago
- Android is Linux -- why no better inter-operation
9 hours 7 min ago
- Connecting Android device to desktop Linux via USB
9 hours 36 min ago
- Find new cell phone and tablet pc
10 hours 34 min ago
12 hours 3 min ago
- Automatically updating Guest Additions
13 hours 11 min ago
- I like your topic on android
13 hours 58 min ago
- This is the easiest tutorial
20 hours 33 min ago
- Ahh, the Koolaid.
1 day 2 hours ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?