HTTP in 44k with libhttp
Working with web servers is something that most of us will find ourselves doing much more in the future. HTTP is becoming more and more needed, even for non-web embedded devices, since it's often the best way to re-flash a device over the Net. Port 80, the standard HTTP port, is more often than not left open on firewalls. Because of this I've often found it safe to use HTTP to transfer data through firewalls. Port 80 is usually our friend.
But, inside a device, we often find we don't have the resources or the power in some cases to support the use of a large, slow HTTP library. Many of the libraries available have some nice bells and whistles in the way of features and function. The curl library, for instance, has support for secure HTTP transfers, and that definitely is needed in many situations. However, the curl library also has a lot of other features that are not needed by every embedded device.
A library that is a bit more lean, but has fewer features, is the GNOME HTTP library. However, the library seems to make copies of the data for transfer and storage. It requires that you initialize and maintain the request, and the documentation is very sparse. Even with these limitations, the library does work and is fairly easy to use.
I've run into a low-resource situation when re-flashing a device over the Net. The images needed are often 8MB or even 16MB these days, and the available RAM places constraints on the way we download and store that data before writing it to Flash. In one case, I really needed a small footprint, and I ended up writing some socket code to perform such a task. Sadly, the code was contained within a proprietary piece of code I wrote while working for another company.
With this experience in mind, I set out to create a small compact library that everyone could use, and I decided to do it by starting out with a program called httpget I found on the Net some time back. I converted it to a shared library, called libhttp.
By rolling our own HTTP transfer, we can bring down the requirements quite a bit, and this code will be useful to many people who need only the basic features. Let's look at the sizes of these shared libraries to see, in hard numbers, where the real value lies. Note that this is the non-SSL version of the curl library, and that all three libraries were compiled for the Intel x86 architecture. A RISC processor will often produce larger binaries than the x86:
322521 Sep 21 19:03 libcurl.so.2.0.1
110479 Sep 21 19:36 libghttp.so.1.0.0
 45508 Oct 23 01:30 libhttp.so.1.0.0
Size is by far the biggest (or smallest, as it may be) reason to use libhttp. The numbers will vary quite a bit between processors, and also depending on whether one links statically or dynamically. Libhttp is small enough to fit in almost any embedded device that needs basic HTTP transfers.
Libhttp should compile on almost any platform that supports gcc. Any C or C++ application can call it to perform the transfer, and it passes back an allocated pointer to the caller. The burden is on the caller to free the memory that has been allocated. If that memory is not freed, large leaks build up quickly, so let this be the first word of advice I can offer with this library: the caller is required to free memory.
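The "callee allocates, caller frees" contract described above can be sketched as follows. This is only an illustration of the ownership rule: fetch_copy() and use_and_release() are hypothetical names invented for this example and are not part of libhttp.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a library call that returns heap memory.
 * It is NOT a libhttp function; it only mimics the ownership contract. */
char *fetch_copy(const char *src, size_t *out_len)
{
    size_t len = strlen(src);
    char *buf = malloc(len + 1);

    if (buf == NULL)
        return NULL;
    memcpy(buf, src, len + 1);      /* copy including the trailing NUL */
    if (out_len != NULL)
        *out_len = len;
    return buf;                     /* ownership passes to the caller */
}

/* Caller side: use the data, then free it exactly once.
 * Returns the number of bytes handled, or -1 on allocation failure. */
long use_and_release(const char *src)
{
    size_t len = 0;
    char *data = fetch_copy(src, &len);

    if (data == NULL)
        return -1;

    /* ... write the data to flash, parse it, and so on ... */

    free(data);                     /* the caller's responsibility */
    data = NULL;                    /* guard against accidental reuse */
    return (long)len;
}
```

On a device that downloads multi-megabyte images, forgetting the free() even once per transfer is enough to exhaust RAM after a handful of requests.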
For this article I will not cover all of the methods of HTTP, but will focus on the three common methods currently implemented in libhttp. Those methods are GET, POST and HEAD.
The GET method, the most common one, is limited in practice to a request size of about 8k, or 8,192 bytes. HTTP itself sets no fixed limit, but many servers cap the request line at around that size. The POST method is not limited in size for the request but requires that you use the Content-Length header as a part of your request. This header informs the server how large the request body is. In the case of the GET method, a server will typically truncate or reject the request if the size goes over the 8k limit. Most times this will result in an HTTP error.
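To make the difference concrete, here is a minimal sketch of how the two request types look on the wire. The helper names build_get() and build_post() are made up for this example and are not libhttp functions; the sketch assumes HTTP/1.0 and a form-encoded POST body.

```c
#include <stdio.h>
#include <string.h>

/* Formats a GET request into out. Everything the server needs is in the
 * request line and headers; there is no body.
 * Returns the number of bytes written, or -1 if out is too small. */
int build_get(char *out, size_t outlen, const char *host, const char *path)
{
    int n = snprintf(out, outlen,
                     "GET %s HTTP/1.0\r\n"
                     "Host: %s\r\n"
                     "\r\n",
                     path, host);

    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}

/* Formats a POST request into out. The body follows the blank line, and
 * Content-Length tells the server how many body bytes to expect.
 * Returns the number of bytes written, or -1 if out is too small. */
int build_post(char *out, size_t outlen, const char *host,
               const char *path, const char *body)
{
    int n = snprintf(out, outlen,
                     "POST %s HTTP/1.0\r\n"
                     "Host: %s\r\n"
                     "Content-Type: application/x-www-form-urlencoded\r\n"
                     "Content-Length: %lu\r\n"
                     "\r\n"
                     "%s",
                     path, host, (unsigned long)strlen(body), body);

    return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

Both helpers refuse to produce a truncated request, since a partial request is worse than none on a device that cannot easily report errors.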
Another difference between the GET and POST methods is that a GET request should result in the same response from the server, even when called in succession. The POST may or may not produce the same results, and often a web server will take multiple requests and have some type of data handler sort things out for the response. The POST method is often used with forms.
For most requests, the GET method works well. If you do need to pass a large request, you'll be glad that the POST method is available. The HEAD method is like GET except that it gets only the header information, not the content. It can be used for checking the date on which some resource was last modified without actually getting it.
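As an illustration of the HEAD use case, the sketch below pulls one header value, such as Last-Modified, out of a raw block of response headers. header_value() is a hypothetical helper written for this article's example, not a libhttp call, and it assumes the headers are separated by CRLF. It relies on the POSIX strncasecmp() because header names are case-insensitive.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>    /* strncasecmp (POSIX) */

/* Copies the value of header `name` from the raw header block `hdrs`
 * into `out`. Returns 0 on success, -1 if the header is missing or the
 * value does not fit in `out`. */
int header_value(const char *hdrs, const char *name,
                 char *out, size_t outlen)
{
    size_t nlen = strlen(name);
    const char *line = hdrs;

    while (line != NULL && *line != '\0') {
        const char *eol = strstr(line, "\r\n");
        size_t llen = eol ? (size_t)(eol - line) : strlen(line);

        /* A match is "<name>:" at the start of the line, any case. */
        if (llen > nlen && line[nlen] == ':' &&
            strncasecmp(line, name, nlen) == 0) {
            const char *v = line + nlen + 1;
            const char *end = line + llen;
            size_t vlen;

            while (v < end && (*v == ' ' || *v == '\t'))
                v++;                        /* skip leading whitespace */
            vlen = (size_t)(end - v);
            if (vlen >= outlen)
                return -1;
            memcpy(out, v, vlen);
            out[vlen] = '\0';
            return 0;
        }
        line = eol ? eol + 2 : NULL;        /* advance to the next line */
    }
    return -1;
}
```

A device could compare the returned date string with the one stored alongside its current firmware image and skip the download when they match.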
For the purpose of this article I am using these methods as a means to transfer data only. The data can be anything from a binary program, MP3 audio or MPEG video to a PNG image. As I've mentioned previously, Port 80 seems to be our friend in this regard, and few places will block streaming HTTP, even if they do require a proxy.
I found some source code when I was looking at finding a solution that I could use in my embedded device. A program I stumbled across is called httpget, and it was written in 1994 by Sami Tikka. What's interesting is that this source code was written about the same time the browser was being introduced to the masses, not long after Linux was first created.
For the most part, this code should configure, build and install on almost every UNIX/Linux platform. That seems to be the case for all of the systems I have to test it on. I use a Debian woody system for development, currently with a 2.4.9 kernel. I am using gcc 2.95.4 natively, and gcc 2.95.2 to cross compile to a PowerPC 823 chip. I know there is a more recent cross compiler available, but this compiler has worked well and will most likely work until I can get around to upgrading.
To configure libhttp for running on Linux x86 natively, we can run the following commands:
./configure
make
make install
To cross compile for a specific target, pass the target to configure as the host. For a PowerPC, as in my case, one could configure and build with the following:
./configure --host=powerpc-linux
make clean
make
If you are cross compiling, do not install libhttp, as the binaries probably will not run on your development host. Instead, copy the library to your target.
The original httpget, on which libhttp is based, just outputs the data to stdout. I modified it so that it will allocate a chunk of memory and then store the data to that memory. Currently, it allocates memory for the entire size of the data to transfer, dynamically reallocating as it reads the data. This is the simplest type of interface to call since it doesn't require that the caller allocate the memory beforehand.
The following defines set the size of the buffer used to read data from the socket and the length of the transfer buffer into which those reads are stored. I have placed these in header defines to make it easy to change their sizes. Depending on the type of data and the requirements placed on the library, these values may need changing:
#define BUFLEN 8192
#define XFERLEN 65536
Once the initial 64k chunk of memory is allocated, libhttp reads from the socket in 8k chunks and dynamically reallocates additional 64k chunks as needed. This provides eight reads for each additional reallocation of memory. I have done quite a bit of testing with these values, but you can change them to suit yourself. Most web pages will fit in the first 64k chunk allocated.
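The read-and-grow strategy just described can be sketched like this. This is not the code from libhttp itself, only a minimal version of the same idea under the assumption that the data arrives on a plain file descriptor; read_all() is a name chosen for this example.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BUFLEN  8192    /* size of each read from the descriptor */
#define XFERLEN 65536   /* growth step of the transfer buffer */

/* Reads fd until end of file into a heap buffer that grows by XFERLEN
 * bytes at a time. On success returns the buffer, which the caller
 * must free, and stores the byte count in *out_len. Returns NULL on a
 * read or allocation error. */
char *read_all(int fd, size_t *out_len)
{
    char chunk[BUFLEN];
    size_t cap = XFERLEN;
    size_t len = 0;
    char *data = malloc(cap);
    ssize_t n;

    if (data == NULL)
        return NULL;

    while ((n = read(fd, chunk, sizeof chunk)) > 0) {
        /* One read is at most BUFLEN bytes, so a single XFERLEN step
         * is always enough room for it. */
        if (len + (size_t)n > cap) {
            char *tmp = realloc(data, cap + XFERLEN);

            if (tmp == NULL) {
                free(data);
                return NULL;
            }
            data = tmp;
            cap += XFERLEN;
        }
        memcpy(data + len, chunk, (size_t)n);
        len += (size_t)n;
    }
    if (n < 0) {            /* read error */
        free(data);
        return NULL;
    }
    *out_len = len;
    return data;
}
```

Assigning the result of realloc() to a temporary keeps the original pointer valid if the call fails, so the partial buffer can still be freed rather than leaked.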
If you look at the source code for hget.c, you will see that it is very simple to call http_request(). You pass it the HTTP URL, it connects to the server, and the response is returned in HTTP_Response.pData. Along with the URL, you pass any additional entities to be placed in the request header and an enum for the HTTP method type.
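Before any library can connect, the URL has to be broken into a host, a port and a path. The sketch below shows one way that step might look. It is an assumption about the general technique, not a copy of what http_request() does internally, and split_url() is a hypothetical name.

```c
#include <stdlib.h>
#include <string.h>

/* Splits "http://host[:port][/path]" into its parts. The port defaults
 * to 80 and the path to "/". Returns 0 on success, -1 on a malformed
 * URL or a buffer that is too small. */
int split_url(const char *url, char *host, size_t hostlen,
              int *port, char *path, size_t pathlen)
{
    const char *p, *slash, *colon;
    size_t hlen;

    if (strncmp(url, "http://", 7) != 0)
        return -1;
    p = url + 7;

    slash = strchr(p, '/');
    if (slash == NULL)
        slash = p + strlen(p);              /* no path given */

    /* Look for a port separator only inside the host part. */
    colon = memchr(p, ':', (size_t)(slash - p));
    hlen = (size_t)((colon ? colon : slash) - p);
    if (hlen == 0 || hlen >= hostlen)
        return -1;
    memcpy(host, p, hlen);
    host[hlen] = '\0';

    *port = 80;                             /* the standard HTTP port */
    if (colon != NULL) {
        long v = strtol(colon + 1, NULL, 10);

        if (v <= 0 || v > 65535)
            return -1;
        *port = (int)v;
    }

    if (*slash == '\0') {
        if (pathlen < 2)
            return -1;
        strcpy(path, "/");
    } else {
        if (strlen(slash) >= pathlen)
            return -1;
        strcpy(path, slash);
    }
    return 0;
}
```

The host and port would then go to the resolver and connect(), while the path ends up in the request line together with the chosen method.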
Comments
about http library
Salamo Alikom
is there any documentation out there ,more examples .
i was looking for a library to show http headers ,i found one libhttp but it is for c++ programmers .