At the Forge - Incremental Form Submission

A creative solution for solving Web service performance bottlenecks, or “ix-nay on the ottleneck-bay”.
Improving Performance

Now we come to the hard, or interesting, part of this project. If you can imagine that each Pig Latin translation takes ten seconds to execute, but less than one second to retrieve from the cache, you would want the cache to be used as much as possible. Moreover, given how long each word lookup takes, users will need a great deal of patience to deal with it.

The solution? Use Prototype, a popular JavaScript framework. Its Ajax support can submit the contents of the textarea widget to a URL of your choice (in this case, the same one that is used for POST) in the background, whenever the text area has changed. Each word then is translated while the user is still filling out the form, dramatically reducing the time the user waits for the final translation.

In other words, I'm betting that users will take long enough to enter the entire sentence that I can collect and translate most or all of the words while they're typing. Also, because I know that the Web service is caching results, I can pass the contents of the entire textarea every few seconds, knowing that retrieving items from the cache is extremely rapid.
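To make the bet concrete, here is a minimal sketch in plain JavaScript of how repeated background submissions warm a cache. It is only a stand-in for the real service: slowTranslate, makeCachingTranslator and the toy Pig Latin rule are names and assumptions I made up for illustration, not the code in pl-server.rb.

```javascript
// Hypothetical placeholder for the expensive (ten-second) word lookup.
// The Pig Latin rule here is a toy assumption, used only for illustration.
function slowTranslate(word) {
  var match = /^([^aeiou]*)(.*)$/i.exec(word);
  return match[1] === "" ? word + "way" : match[2] + match[1] + "ay";
}

// Wrap a translate function with a cache and count the slow lookups.
function makeCachingTranslator(translate) {
  var cache = {};
  var misses = 0;
  return {
    translateText: function (text) {
      var words = text.split(/\s+/).filter(function (w) { return w !== ""; });
      return words.map(function (word) {
        if (!Object.prototype.hasOwnProperty.call(cache, word)) {
          misses += 1; // only uncached words pay the slow price
          cache[word] = translate(word);
        }
        return cache[word];
      }).join(" ");
    },
    missCount: function () { return misses; }
  };
}

// Simulate a user typing, with one background submission per pause.
var service = makeCachingTranslator(slowTranslate);
var snapshots = ["hello", "hello big", "hello big world"];
snapshots.forEach(function (text) { service.translateText(text); });

// By the time the user clicks submit, every word is already cached.
var missesBeforeSubmit = service.missCount();
var finalResult = service.translateText("hello big world");
var missesDuringSubmit = service.missCount() - missesBeforeSubmit;
console.log(finalResult, missesDuringSubmit);
```

Each background submission resends the whole text, but only the newly typed word costs a slow lookup, so the final submission needs none.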

The key to this functionality is the use of Prototype's Form.Element.Observer object in JavaScript. This object allows us to monitor any form element over time, invoking a function of our choice whenever the element's value changes. We will use this, along with our knowledge that the Pig Latin server (pl-server.rb) caches words it has already translated, to submit the form every few seconds, even before the user clicks the submit button.

We do this by adding an id attribute, whose value is words, to our textarea, and also by adding the following JavaScript code:

new Form.Element.Observer($("words"), 3, translateFunction);

In other words, we will check the textarea for changes every three seconds. If something has changed, the browser invokes the function translateFunction. This function is defined as follows:

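function translateFunction() {
    // Submit the form's current contents in the background.
    // Prototype sends the request as a POST by default; the response is ignored.
    var myAjax = new Ajax.Request('/pl-words.cgi', {
        parameters: Form.serialize('form')
    });
}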

In other words, translateFunction creates a new Ajax request in the background, submitting the contents of the form to the URL /pl-words.cgi—the same program to which the form will be submitted at the end of the process. But, for our incremental submissions, we care more about the side effects (that is, the cached translations) than the resulting HTML. So, we ignore the output from pl-words.cgi.
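For readers who want to see the moving parts without a browser, the following is a rough sketch in plain JavaScript of a polling observer paired with a fire-and-forget submission. It is not Prototype's implementation; makePollingObserver, sendInBackground and the fake textarea object are hypothetical stand-ins of my own.

```javascript
// Remember the last value seen, and invoke the callback only when
// the value has changed since the previous check.
function makePollingObserver(readValue, onChange) {
  var lastValue = readValue();
  return {
    check: function () {
      var current = readValue();
      if (current !== lastValue) {
        lastValue = current;
        onChange(current);
      }
    }
  };
}

// Stand-ins for the textarea and the background request (both hypothetical).
var textarea = { value: "" };
var submissions = [];
function sendInBackground(text) {
  // A real page would issue the Ajax request here and ignore the response.
  submissions.push(text);
}

var observer = makePollingObserver(
  function () { return textarea.value; },
  sendInBackground
);

// In a browser you would schedule the check, for example:
//   setInterval(observer.check, 3000);
// Here the checks are called by hand to simulate three intervals.
textarea.value = "hello";
observer.check(); // changed: "hello" is submitted
observer.check(); // unchanged: nothing is sent
textarea.value = "hello world";
observer.check(); // changed: "hello world" is submitted
console.log(submissions);
```

Because nothing is sent when the text has not changed, a user who stops typing generates no extra traffic.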

Because of how we built our server-side programs, they don't need to change at all in order for this Ajax-style addition to take effect. All we need to do is modify the HTML file, adding a few lines of JavaScript.

Now, of course, this doesn't change the amount of time it takes to translate each word or even an entire sentence. But, that's not the point. Rather, what we're doing is taking advantage of the fact that many people tend to type slowly and that they'll take their time entering words into a textarea widget.

If users type quickly, or enter a very short sentence, we haven't really lost anything at all. It'll take a long time to translate those people's sentences, and they'll just have to wait it out. If people change their minds a great deal, it's possible we'll end up with all sorts of cached, translated words that are never going to be used again. But, given that the cache is shared across all users, it seems like a relatively small risk to take.

There are some things to consider if you're thinking of going this route, that is, combining an incremental form submission with a cache. First, notice we are iterating over each word in the textarea. This means there's the potential for someone to launch a denial-of-service attack against your server, simply by entering ridiculously long text strings into your textarea widget. One way to prevent this is to limit the number of words you check from any given textarea widget. You also can apply a stricter limit to the incremental submissions than to the complete and final submission, because the incremental ones exist only to warm the cache.
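As a minimal sketch of such a limit, assuming the server can tell incremental submissions from the final one, the word list might be trimmed like this. The names wordsToTranslate and MAX_INCREMENTAL_WORDS, the value 50, and the choice to keep the most recently typed words are all my own assumptions.

```javascript
// Assumed limit, chosen only for illustration.
var MAX_INCREMENTAL_WORDS = 50;

// Return the words that should be translated for this submission.
function wordsToTranslate(text, isIncremental) {
  var words = text.split(/\s+/).filter(function (w) { return w !== ""; });
  if (isIncremental && words.length > MAX_INCREMENTAL_WORDS) {
    // Keep the most recently typed words; earlier ones were probably
    // cached by previous background submissions.
    return words.slice(words.length - MAX_INCREMENTAL_WORDS);
  }
  return words;
}

// Build a 200-word input: "w1 w2 ... w200".
var parts = [];
for (var i = 1; i <= 200; i++) {
  parts.push("w" + i);
}
var longText = parts.join(" ");

var incremental = wordsToTranslate(longText, true);
var complete = wordsToTranslate(longText, false);
console.log(incremental.length, complete.length);
```

In a real deployment you probably would cap the final submission as well, just at a more generous number.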

Another item to remember is that you should not expose your inner APIs. APIs are for external use; the moment people know your internal data structures and methods, they might use them against you. These examples didn't include any cleaning or testing of the data that was passed to the server; in a real-world case, you probably would want to do that before simply passing it along to another program.
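What that cleaning looks like depends on your application. As one hedged illustration, written in JavaScript although the check belongs on the server in whatever language it uses, the sketch below keeps only purely alphabetic words of modest length. The function cleanWords, the 40-character limit and the letters-only policy are assumptions of mine, not part of the original programs.

```javascript
// Assumed policy for illustration: accept only ASCII-alphabetic words
// no longer than MAX_WORD_LENGTH characters.
var MAX_WORD_LENGTH = 40;

function cleanWords(rawText) {
  return String(rawText)
    .split(/\s+/)
    .map(function (word) {
      // Strip leading and trailing characters that are not ASCII letters,
      // such as punctuation attached to a word.
      return word.replace(/^[^A-Za-z]+|[^A-Za-z]+$/g, "").toLowerCase();
    })
    .filter(function (word) {
      // Drop anything that still contains non-letters or is too long.
      return /^[a-z]+$/.test(word) && word.length <= MAX_WORD_LENGTH;
    });
}

var cleaned = cleanWords("Hello, world! <script>alert(1)</script> 12345");
console.log(cleaned);
```

Only the words that survive the filter would be passed along to the translation service.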

Finally, if your site becomes popular, you might need more than one server to handle Web services. That's fine, and it's even a good idea. But, how many servers should you get, and how should they store their data? One possibility, and something that I expect to write about in the coming months, is Amazon's EC2 (Elastic Compute Cloud) technology, which allows you to launch an almost limitless number of Web servers quickly and for a reasonable price. Combining EC2 with this sort of caching Web service might work well, especially if you have a good method for sharing dynamic data among the servers.