Web Servers and Dynamic Content

Using legacy languages like C and Fortran can aid computationally complex web applications.
How Your Program Passes Data Back to the Web Server

With the data from the web page obtained, your program can now perform all of its processing and can tell the web server what to reply. That reply may be a simple plain text message, an HTML document (the most common form), an image (typically in GIF or JPEG format) or any other complex data type. These data types are referred to as MIME types, and a standard subset is recognized worldwide on almost all browsers in use today. The mechanism by which your program passes data back to the web server (for transmission to the client browser) is by writing that data out to the thread's standard outstream, the same mechanism that is used to write characters to the screen in your favorite language. The format for this data is simple:

Content-type:[SPC][MIME-type];[CR][CR][Document-Data]

First, your program must declare the MIME type. The most common MIME types are as follows:

  • text/plain—Plain text that is output as block characters with exactly the alignment used when transmitted (no word wrap).

  • text/html—Standard HTML document text.

  • image/gif—Image encoded using one of the Compuserve GIF image specifications (it should be noted that the format uses Lempel-Ziv compression technologies which may not be in the public domain and may require the software producer or software user to license the software from the owner of the patent, which is Unisys).

  • image/jpeg—Image encoded using the JPEG image standard.

Second, your program must send a semicolon (“;”) followed by two carriage returns (“\n”).

Third, your program must prepare the body of the document you wish to transmit. It may be the content of a plain text or HTML document or the binary data that makes up the raw data block of a GIF or JPEG image.

Therefore, getting the web server to send a simple reply can be as easy as:

printf("Content-type: text/html\n\n<HTML><HEAD>
   </HEAD>"
"<BODY><H3>My Quick Test Page</H3></BODY></HTML>\n");

That's it; those are the basics for telling the web browser what to reply to your client's request. There are, of course, some cute things one can add to this basic format that lends a measure of control over how the document is rendered. One example is the addition of the “charset=” qualifier after the MIME type (right before the carriage return), which ensures that the browser will render the HTML document being transmitted using the appropriate character set (examples are “ISO-;9660-;1”, “ISO-;9660-;2”, “KOI-;8”, “WIN-;1225”, etc.). Therefore, the savvy programmer may wish to send out the document like this:

         printf("Content-type: text/html;
     charset=KOI-;8\n\n"
                "<HTML><HEAD></HEAD><BODY><H3>"
                "<BODY><H3>Maya malinkayaproba
      </H3></BODY></HTML>\n");

Pushing Continual Updates to the Browser

Every so often the purpose of a web page is to monitor some long and involved process that typically takes longer than one time-out period to complete or to generate a full update. This is another situation that can be dealt with well in legacy languages like C/C++ and Fortran. The idea is to force the web server to keep the TCP pipe open to the browser and to keep pushing new documents down to the browser at an interval specified by your program.

The formula to accomplish this, given here, is specific to the Apache web server, which as we all know, is the most popular HTTP dæmon used in the Linux world to date. If you are unsure whether this will work with your particular HTTP dæmon, try it and let me know. Here are the steps:

  1. Rename your program's binary to begin with the characters “nph-;”. This means that if the binary of your program is named “update.cgi”, then change its name to “nph-;update.cgi”.

  2. Transmit the HTTP header that the web server normally hands to the web browser (this is done for reasons that will be explained below):

printf("HTTP/1.0 200 Okay\n");
  1. Define the MIME type of the document as “multipart/x-mixed-replace”:

printf("Content-;Type:multipart/x-;mixed-;replace;"
        "boundary=SoMeRaNdOmTeXt\n");
  1. Initiate the first document transmission by passing the token declared in “boundary”:

printf("\n—SoMeRaNdOmTeXt\n");
  1. Send the next document update. This is simply a document that should be displayed until the subsequent transmission goes out along the same open connection at some point in the future. The update is followed by another instance of the token declared in “boundary”:

printf("Content-type: text/html\n\n<HTML><HEAD>
    </HEAD>"
"<BODY><H3>Update #%d</H3></BODY></HTML>\n"
"\n-SoMeRaNdOmTeXt\n", Count++);
  1. Flush the standard out buffer:

fflush(stdout);
  1. Repeat steps five and six until all updates have been transmitted. On the last update, do not transmit the token simply flush standard output and exit. This will leave the last update in the client browser's window after your program exits.

A simple example of a program that uses server-side push to count on your browser's screen from one to ten with a delay of one second between count updates is shown in Listing 1.

Listing 1. Count to Ten

In order to explain how this works, it is necessary to understand a little bit about what the server does in the background. Up until this point, your program's output was verified for validity (i.e., a proper MIME type, proper separators, etc.) and was passed on to the client browser with some additional HTTP headers pre-pended to it. In order to take more control over the web server/client browser interaction, we must ask the web server to stop performing these validity checks and to stop adding its normal headers. This is what the “nph” stands for your program's new filename No Parsed Headers. When the name of your program begins with the letters “nph-”, this means that the web server now assumes that your program is responsible for performing all of the validation checks and header transmissions that would normally be the responsibility of the web server. The web server will simply keep the TCP pipe open to the client browser and grab data as it comes out of your program's standard output stream and pushes it down that TCP pipe to the browser. We are now in a position to understand what is happening in step two; this is a required header that is normally transmitted by the web server and was completely transparent to the program hiding behind the CGI.

Next, we must tell the client browser to expect continual updates, not just one single burst of data...and, therefore, it must not close the TCP pipe once the first document has been transmitted. This is accomplished by specifying the MIME type of the document as being “multipart/x-;mixed-;replace”. In addition, we need to tell the browser how to differentiate between the documents in the stream of multiple documents about to be transmitted. This is accomplished by attaching the qualifier “boundary=SoMeRaNdOmTeXt” to the MIME-type declaration. This tells the web browser that anytime it encounters the sequence of bytes “--SoMeRaNdOmTeXt” in its input stream, it should stop and assume that the following data will describe a new document that will replace the one which currently exists in the document window.

The string that separates the end of one document transmission and the beginning of the next is usually referred to as a boundary token, and this token is normally much more complicated than the one shown in our example here. Normally it is a 50- or 60-byte-long alphanumeric string generated by a randomizer function and will be presented later in this article. The string should be sufficiently long and its contents sufficiently random so as to minimize the chances that it will accidentally occur as part of the body of your document.

Finally, once the document has been pushed to standard out and the boundary token has also been pushed out, it is necessary to flush the output buffer in order to ensure the data gets sent to the client browser. If this is not done, the data will not be sent until the stream's buffer implementation that your operating system uses overflows, and a flush is triggered by the operating system.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Re: Web Servers and Dynamic Content

Anonymous's picture

It is nice to see the CGI in C or C++. I just
noticed that there is a cross platform C++ CGI scripting
in Ch, a C/C++ interpreter.

They have nice APIs in C/C++, which is similar to ASP and
JSP. It is worth checking.

http://www.softintegration.com/products/toolkit/cgi/

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix