Protozilla: Pipes, Protocols and the Web Browser

Protozilla allows client-side CGI to extend Mozilla-based web browsers.

URL Redirection and Helper Applications

Protozilla can also be used to implement new URI schemes using URL redirection. For example, we may define a scheme called moz to access documents in the mozilla.org web site (i.e., the URI moz:docs gets redirected to the URL http://mozilla.org/docs). In effect, we would like the string http://mozilla.org/ to be prefixed to the data portion of any URI using the moz: scheme. This is very simple to implement using Protozilla's URL redirection feature. All you need to do is create a file named moz.url in the protocols directory containing the following line:

http://mozilla.org/
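The effect of such a redirection file can be sketched in a few lines of Python. This is only an illustration of the prefixing rule described above, not Protozilla's actual code; the redirect function and the REDIRECTS table (standing in for the contents of moz.url) are hypothetical names.

```python
def redirect(uri, redirects):
    """Prefix the redirect target to the data portion of the URI."""
    scheme, sep, data = uri.partition(":")
    if not sep:
        raise ValueError("not a URI: %r" % uri)
    if scheme.lower() not in redirects:
        raise ValueError("no redirect defined for scheme %r" % scheme)
    return redirects[scheme.lower()] + data


# Hypothetical stand-in for the single line stored in moz.url.
REDIRECTS = {"moz": "http://mozilla.org/"}

print(redirect("moz:docs", REDIRECTS))  # http://mozilla.org/docs
```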

In the previous examples, we described how Protozilla could be used to implement protocols where the output is displayed in a browser window. There are situations where we would like a URI simply to trigger an action, such as launching a "helper" application window without using the browser interface. A common example is the Telnet protocol, where we would like a URI of the form telnet://hostname to fire up the Telnet client. To implement this using Protozilla, you create a file named telnet.cmd containing the following line:

telnet $URI_HOST $URI_PORT

Protocol handlers with the .cmd extension contain the command to be executed when a URI for the protocol is encountered. The environment variables URI_HOST and URI_PORT contain the hostname and port number extracted by parsing the data portion of the URI.
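To make the parsing step concrete, here is a rough Python sketch of how the host and port might be pulled out of a telnet: URI and dropped into the command line from telnet.cmd. It assumes simple textual substitution into an argument list; Protozilla itself exports real environment variables, and the function names handler_environment and build_command are invented for this example.

```python
import shlex
from string import Template
from urllib.parse import urlsplit


def handler_environment(uri):
    """Extract the values a .cmd handler would see as URI_HOST and URI_PORT."""
    parts = urlsplit(uri)
    return {
        "URI_HOST": parts.hostname or "",
        "URI_PORT": str(parts.port) if parts.port is not None else "",
    }


def build_command(template, uri):
    """Expand the command template and split it into an argument list."""
    expanded = Template(template).substitute(handler_environment(uri))
    return shlex.split(expanded)


print(build_command("telnet $URI_HOST $URI_PORT", "telnet://example.org:2323"))
# ['telnet', 'example.org', '2323']
```

Building an argument list rather than a shell command string is a deliberate choice in this sketch: the host name comes from the URI, so it should never be interpreted by a shell.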

Security Issues

There are some important security issues to consider when implementing new URI schemes using Protozilla. HTML documents downloaded from the Web using the browser could potentially contain hyperlinks using the new URI schemes. Loading of these hyperlinks could be triggered either by the user clicking on them or by JavaScript executed in the web page. This means that the protocol handler should screen the data portion of the URI carefully for malicious input before carrying out the action requested by the URI. These security needs are quite similar to those that arise in the context of CGI programs executed by web servers. As illustrated in the example script whois.pl, scripting languages like Perl already provide mechanisms like taint-checking to address these needs.
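The kind of screening we have in mind can be sketched as a whitelist check. The following Python fragment is not taken from whois.pl; it is a minimal illustration, with a hypothetical screen_hostname function, of accepting only input that looks like a host name and rejecting everything else, much as a Perl script would untaint a value with a regular expression.

```python
import re

# Letters, digits, dots and hyphens only; must start and end with a letter or digit.
_HOSTNAME = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9.-]{0,251}[A-Za-z0-9])?")


def screen_hostname(data):
    """Return data unchanged if it looks like a host name, else raise ValueError."""
    if not _HOSTNAME.fullmatch(data):
        raise ValueError("rejected URI data: %r" % data)
    return data


for candidate in ["mozilla.org", "mozilla.org; rm -rf ~", "-oProxyCommand=x"]:
    try:
        print("accepted:", screen_hostname(candidate))
    except ValueError as err:
        print(err)
```

Rejecting a leading hyphen matters because the value may end up on a command line, where it could otherwise be read as an option.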

Protozilla provides an additional security mechanism that allows the user to define restricted URI schemes. All schemes with names ending with a plus (+) character are automatically considered restricted. An example is the cgi+: scheme, bundled with Protozilla, which allows execution of CGI programs. A restricted URI may only be loaded by privileged scripts residing on the user's local disk or by the user typing the URI into a special user interface. Content downloaded from the Web is not allowed to trigger the loading of a restricted URI.
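The rule for restricted schemes is simple enough to state as code. The sketch below is our own restatement of the policy just described, not Protozilla's implementation; the Origin enumeration and the may_load function are hypothetical.

```python
from enum import Enum


class Origin(Enum):
    WEB_CONTENT = "content downloaded from the Web"
    LOCAL_PRIVILEGED_SCRIPT = "privileged script on the user's local disk"
    USER_TYPED = "URI typed by the user into the special interface"


TRUSTED_ORIGINS = {Origin.LOCAL_PRIVILEGED_SCRIPT, Origin.USER_TYPED}


def is_restricted(uri):
    """A scheme is restricted if its name ends with a plus character."""
    scheme = uri.partition(":")[0]
    return scheme.endswith("+")


def may_load(uri, origin):
    """Allow unrestricted URIs from anywhere; restricted ones only from trusted origins."""
    return (not is_restricted(uri)) or origin in TRUSTED_ORIGINS


print(may_load("cgi+:test.pl", Origin.WEB_CONTENT))  # False
print(may_load("cgi+:test.pl", Origin.USER_TYPED))   # True
```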

Technical Details

Here is a brief technical summary of Protozilla's internals. Mozilla communicates with the outside world using TCP sockets and multithreaded asynchronous I/O. Protozilla executes an external program as a separate process but makes it appear to Mozilla as if it were a TCP socket. Protozilla communicates with the process using anonymous pipes and synchronous I/O but uses asynchronous I/O to communicate with the rest of Mozilla. The core portion of Protozilla, which deals with pipes and inter-process communication, is written in C++. The rest of Protozilla, dealing with protocols, is implemented in JavaScript.
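The general pattern, blocking reads on a pipe in one thread feeding a consumer that never touches the pipe itself, can be illustrated in Python. This is an analogy only: Protozilla's core is C++ inside Mozilla, and the names pump, run_handler and collect are invented for this sketch. The child process here is simply the Python interpreter printing one line.

```python
import queue
import subprocess
import sys
import threading


def pump(pipe, chunks):
    """Worker thread: blocking (synchronous) reads on the anonymous pipe."""
    with pipe:
        for chunk in iter(lambda: pipe.read1(4096), b""):
            chunks.put(chunk)
    chunks.put(None)  # end-of-stream marker


def run_handler(argv):
    """Start argv as a separate process; return it with a queue of output chunks."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    chunks = queue.Queue()
    threading.Thread(target=pump, args=(proc.stdout, chunks), daemon=True).start()
    return proc, chunks


def collect(chunks):
    """Consumer side: sees only queued chunks, never the pipe."""
    return b"".join(iter(chunks.get, None))


proc, chunks = run_handler([sys.executable, "-c", "print('hello from handler')"])
body = collect(chunks)
proc.wait()
print(body.decode().strip())
```

For brevity, collect waits for the whole stream; an event-driven consumer would instead take chunks from the queue as they arrive and hand them on.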

Of course, an out-of-process implementation of a protocol is never going to be as efficient as a multithreaded, in-process implementation within Mozilla. However, there is a trade-off between efficiency and ease of implementation. Protozilla is better suited for experimental protocol implementations than for mature protocols. Out-of-process CGI program execution can be quite slow on a web server serving a large number of clients, but it is not as serious a performance issue on a client that is serving just one user. An out-of-process implementation does have one advantage: if an experimental protocol implementation hangs or crashes, it will not affect the rest of the browser.

Applications of Protozilla

The main goal of the Protozilla project is to make it easy for the browser to interact with existing software, without the need for extensive modifications to conform to any browser-specific interface. The client-side CGI feature of Protozilla essentially allows any command-line program to be invoked within the browser. This feature may also be useful for building "local web applications" that use Mozilla as the user interface, rather than just as an HTTP client. For example, if you have a Perl script that parses your e-mail, you could choose to have it display its output in the browser.
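As a sketch of what such a local program might emit, the Python fragment below formats a list of message subjects as a CGI-style response: a Content-Type header, a blank line, then an HTML body. We assume here the usual CGI output convention; the cgi_response function and the sample subjects are made up for illustration.

```python
import html


def cgi_response(title, lines):
    """Format plain-text lines as a CGI-style HTML response."""
    items = "".join("<li>%s</li>\n" % html.escape(line) for line in lines)
    body = (
        "<html><head><title>%s</title></head>\n"
        "<body><ul>\n%s</ul></body></html>\n" % (html.escape(title), items)
    )
    # Header block, blank line, then the document body.
    return "Content-Type: text/html\n\n" + body


print(cgi_response("Inbox", ["Re: build <broken>", "Lunch?"]), end="")
```

Escaping the subjects before embedding them in HTML keeps text from an e-mail message from being interpreted as markup by the browser.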

Another goal of Protozilla is to allow seamless access to peer-to-peer (P2P) protocols from the browser. Since the browser is by far the most commonly used software for accessing remote content, implementing new URI schemes for P2P protocols should make them much more accessible to the average user. Not only will the user be able to access the protocols by typing a P2P URI in the browser's URL bar, but clickable P2P URIs may also be embedded in e-mail messages and other documents and exchanged just like HTTP or FTP URIs. As an example of a P2P application, the Protozilla distribution comes bundled with a Freenet URI scheme. If you have the proxy server software for the P2P Freenet protocol installed on your computer, you will be able to use URIs of the form freenet:<key> to browse documents on Freenet.
