XMLTP/L, XMLTP Light
XMLTP/L, or XMLTP Light, is a lightweight RPC protocol that uses XML to encode the stream of data. XMLTP/L has been designed to do fast RPC calls over an intranet, within an enterprise. More specifically, the first purpose of XMLTP/L is to forward transactions (RPCs) to a database server. But, it also can be used to do method calls to any server that follows the common RPC technique introduced by XML-RPC and older client/server protocols.
The origins of XMLTP/L come from the inherent limitations associated with proprietary client/server protocols: basically, they are proprietary. XMLTP/L is distributed as source code and is licensed under the GNU Library General Public License (GNU LGPL).
Because of the languages used in the current implementation of XMLTP/L, developers who use C, Python and Java should find this article more interesting. But, we expect ports will be made to other scripting and web application languages, such as PHP and Ruby.
The name XMLTP Light evokes the design goals and purposes of this protocol:
transport TP Light (see Note 1) remote procedure calls (RPC);
use an XML syntax to allow compatibility with various tools, extant or future;
have good performance, with modest requirements (therefore, real practical scalability using simple hardware configurations); and
have a lightweight and robust implementation.
To achieve its goal of being fast and lightweight, XMLTP/L does not try to do everything and acknowledges the following constraints and limitations as non-goals:
it uses a subset of XML (a full-fledged XML parser can parse XMLTP/L, but the XMLTP/L parser cannot parse arbitrary XML);
it is not a universal transport for all datatypes;
it is not the most standardized or the most buzzword-compliant technology to appear now, yesterday or tomorrow; and
it is not as easy to install as XML-RPC (which is less than 20KB of code in the Python 2.x standard installation).
XMLTP/L is much more boring than other XML protocols, such as SOAP or XML-RPC, or any newer web services protocols that will appear real-soon-now. But, if it fills someone's need for a non-proprietary, robust and flexible RPC protocol on their intranet, then that would be great.
XMLTP/L should be appealing to people who use stored procedure calls and want more flexibility (like integrating multiple DBMS or building a custom non-SQL dataserver). It works for people who need some level of XML compliance without suffering response times greater than 300 or 500 ms, as is the norm for all the implementations of RPC-over-XML that I have tried so far. I have not tried all of them, but I have tried enough to convince me to write a custom XML subset parser in Bison and C.
Figure 1 shows a schema of XMLTP/L server interconnection. Only one dataserver is shown, but it is possible to put two or more behind the RRGX, a gateway that does RPC routing (more about the RRGX below). Also in Figure 1, you can see a small box called XML-RPC or SOAP bridge. This piece does not exist yet. But, it would be rather easy to create such a bridge, as the initial specs for XMLTP/L are similar to those for XML-RPC.
The current implementation of XMLTP/L uses plain TCP/IP sockets for transport, and the connections are persistent. In other words, once a socket connection is established, it is used for many RPC calls. This situation is different from XML-RPC, where a new HTTP connection (a new TCP/IP connection) typically is made for each RPC call. This is one of the reasons why XMLTP/L is faster than XML-RPC.
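To make the idea of a persistent connection concrete, here is a minimal Python sketch in which one open socket carries several calls in a row. It is only an illustration: socket.socketpair() stands in for a real TCP connection, the toy server and its `<response/>` reply are placeholders rather than the real XMLTP/L response syntax, and treating `</procCall>` as the end-of-call marker is an assumption about framing.

```python
import socket
import threading

END_CALL = b"</procCall>"   # assumed end-of-call marker
REPLY = b"<response/>"      # placeholder reply, NOT the real response syntax


def toy_server(conn, calls_seen):
    """Answer every call that arrives on one already-open connection."""
    buf = b""
    with conn:
        while True:
            chunk = conn.recv(4096)
            if not chunk:               # the client closed the connection
                return
            buf += chunk
            while END_CALL in buf:      # one reply per complete call
                call, buf = buf.split(END_CALL, 1)
                calls_seen.append(call + END_CALL)
                conn.sendall(REPLY)


def recv_exact(sock, n):
    """Read exactly n bytes, because recv() may return fewer."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("connection closed early")
        data += chunk
    return data


def run_demo(n_calls=3):
    """Send n_calls over ONE connection and collect the replies."""
    client, server = socket.socketpair()
    calls_seen = []
    worker = threading.Thread(target=toy_server, args=(server, calls_seen))
    worker.start()
    replies = []
    with client:
        for i in range(n_calls):
            client.sendall(b"<procCall><proc>RPC_%d</proc></procCall>" % i)
            replies.append(recv_exact(client, len(REPLY)))
    worker.join()
    return calls_seen, replies
```

The connection setup cost is paid once, before the loop, instead of once per call.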
Furthermore, the RRGX gateway and the Java classes (RPC call) use connection pools. This technique is another boost for performance when many threads of an application server make many RPC calls.
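The following Python sketch shows the general connection pool technique, not the code used in the RRGX or in the Java classes. The class name `ConnectionPool` and the `connect` factory argument are hypothetical; any callable that returns an open connection would do.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Keep a fixed set of open connections and lend them to threads."""

    def __init__(self, connect, size):
        # connect is a hypothetical factory that opens one connection.
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free (or raises queue.Empty
        # after `timeout` seconds).
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even after an error.
            self._idle.put(conn)
```

A thread would use it as `with pool.connection() as conn: ...`. A production pool also would have to detect and replace broken connections, which this sketch leaves out.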
The lightness of the implementation is a third factor for good speed. XMLTP/L is a simple XML-based protocol, and it supports only four datatypes in the RPC parameters and in the tabular result set(s), which can be returned by the stored procedures. These datatypes are:
integer: 32 bits, most likely (Python allows for larger *int*, but XML2SYB uses the ANSI C 32-bit *int* on GNU/Linux on x86)
float: double precision, most likely
timestamp: "YYYY-MM-DD HH:MM:SS.mmm"
string: max of 255 characters.
For any of these datatypes, a value can be flagged as *Null*. In this case, the data associated with that value should be sent empty by the server, and it should be ignored by the client.
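As a rough illustration of these four datatypes, a client could check its values before sending them. This is a sketch under assumptions: the function name `check_value` is hypothetical, Python's None stands for *Null*, and the 32-bit integer range is the "most likely" case rather than a guarantee.

```python
import re
from datetime import datetime

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1    # assumes a 32-bit int
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}")


def check_value(kind, value):
    """Return True if value fits the named datatype; None means Null."""
    if value is None:
        return True                         # any datatype can be Null
    if kind == "integer":
        return (isinstance(value, int) and not isinstance(value, bool)
                and INT32_MIN <= value <= INT32_MAX)
    if kind == "float":
        return isinstance(value, float)
    if kind == "timestamp":
        # Exactly "YYYY-MM-DD HH:MM:SS.mmm", and a real calendar date.
        if not isinstance(value, str) or not TIMESTAMP_RE.fullmatch(value):
            return False
        try:
            datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f")
        except ValueError:
            return False
        return True
    if kind == "string":
        return isinstance(value, str) and len(value) <= 255
    raise ValueError("unknown datatype: %r" % (kind,))
```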
In full detail, an XMLTP/L response contains a return status (always), one or more result sets (optional), one or more messages (optional) and one or more output parameters (optional).
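On the client side, such a response could be held in a structure like the one sketched below. The class and field names are hypothetical and do not come from the XMLTP/L modules, and treating the return status as an integer is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class ResultSet:
    """One tabular result set returned by a stored procedure."""
    column_names: List[str]
    rows: List[Tuple[Any, ...]]     # None in a row stands for Null


@dataclass
class Response:
    """The parts of a response; only the return status is mandatory."""
    return_status: int              # assumed to be an integer
    result_sets: List[ResultSet] = field(default_factory=list)
    messages: List[str] = field(default_factory=list)
    output_params: Dict[str, Any] = field(default_factory=dict)
```

A call that returns nothing but a status is then simply `Response(return_status=0)`, with the three optional parts left empty.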
The XML syntax of both the RPC call and the response in XMLTP/L is lighter than the equivalent in XML-RPC. Here is an example of an RPC call in XMLTP/L:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<procCall>
  <proc>RPC_PROC_1</proc>
  <param>
    <name>@out_param</name>
    <int>-9999</int>
    <attr>114</attr>
  </param>
  <param>
    <name>@param_1</name>
    <str>adenosine</str>
    <attr>0012</attr>
  </param>
</procCall>
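A call like the one above can be assembled with Python's standard xml.etree.ElementTree module. This is a sketch, not the API of the xmltp_gx module: the function name `build_proc_call` is hypothetical, the tag names are copied from the example, and the `attr` values are passed through as opaque strings. Whether the real subset parser accepts this exact layout has not been verified.

```python
import xml.etree.ElementTree as ET


def build_proc_call(proc_name, params):
    """Build a procCall document as text.

    params is a list of (name, type_tag, value, attr) tuples, where
    type_tag is the element name seen in the example ("int", "str").
    """
    root = ET.Element("procCall")
    ET.SubElement(root, "proc").text = proc_name
    for name, type_tag, value, attr in params:
        param = ET.SubElement(root, "param")
        ET.SubElement(param, "name").text = name
        ET.SubElement(param, type_tag).text = str(value)
        ET.SubElement(param, "attr").text = attr
    # ElementTree escapes &, < and > in the text for us.
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="ISO-8859-1" ?>' + body


call = build_proc_call("RPC_PROC_1", [
    ("@out_param", "int", -9999, "114"),
    ("@param_1", "str", "adenosine", "0012"),
])
wire_bytes = call.encode("iso-8859-1")   # what would go on the socket
```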
Here's the same RPC call, using the XML-RPC syntax:
<?xml version='1.0'?>
<methodCall>
  <methodName>RPC_PROC_1</methodName>
  <params>
    <param>
      <value><array><data>
        <value><string>@out_param</string></value>
        <value><int>-9999</int></value>
        <value><string>114</string></value>
      </data></array></value>
    </param>
    <param>
      <value><array><data>
        <value><string>@param_1</string></value>
        <value><string>adenosine</string></value>
        <value><string>0012</string></value>
      </data></array></value>
    </param>
  </params>
</methodCall>
The difference in the byte count is even more dramatic when the RPC call has more parameters and also in a response with a large number of rows.
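The size difference can be checked with a few lines of Python. The sketch below encodes the same parameters both ways and compares byte counts. The helper names are hypothetical; the XML-RPC side uses the standard xmlrpc.client.dumps function with the [name, value, attr] array convention from the example, and line breaks are stripped from its output so that only the markup is compared. The hand-built XMLTP/L side does no escaping, so it assumes values without markup characters.

```python
import xmlrpc.client


def xmltp_call(proc, params):
    """Encode params as a compact XMLTP/L call (no escaping done)."""
    parts = ['<?xml version="1.0" encoding="ISO-8859-1" ?>'
             '<procCall><proc>%s</proc>' % proc]
    for name, tag, value, attr in params:
        parts.append("<param><name>%s</name><%s>%s</%s><attr>%s</attr></param>"
                     % (name, tag, value, tag, attr))
    parts.append("</procCall>")
    return "".join(parts).encode("iso-8859-1")


def xmlrpc_call(proc, params):
    """Encode the same params as XML-RPC arrays [name, value, attr]."""
    args = tuple([name, value, attr] for name, _tag, value, attr in params)
    text = xmlrpc.client.dumps(args, methodname=proc)
    # Drop line breaks so that only the markup itself is counted.
    return "".join(line.strip() for line in text.splitlines()).encode("iso-8859-1")


params = [("@out_param", "int", -9999, "114"),
          ("@param_1", "str", "adenosine", "0012")]
light = xmltp_call("RPC_PROC_1", params)
heavy = xmlrpc_call("RPC_PROC_1", params)
print(len(light), "bytes as XMLTP/L,", len(heavy), "bytes as XML-RPC")
```

Calling both helpers with a longer parameter list shows how the gap grows with the number of parameters.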
A main design goal of XMLTP/L was to pump a high volume of transactions in and out of a relational database. We wanted some level of vendor independence, specifically so users would suffer less from the vendors' frequent upgrades to their native client-server APIs. Also, we wanted to be able to mix datasources and perform RPC routing in a way that would be invisible to the web application server or to other XMLTP/L clients connected to the RRGX.
XMLTP/L acts like a universal database RPC protocol. The whole intranet can use XMLTP/L for RPC calls, and vendors' proprietary protocols can be isolated on a server near the database itself. The RRGX (RPC Router Gateway for XMLTP/L) can integrate multiple database servers together as if they were a single server. Be aware that XMLTP/L does not automate synchronisation of transactions on multiple dataservers; each RPC is independent of all other calls.
The RRGX does not use any proprietary client/server APIs. This RRGX allows calling procedures in multiple database servers as it routes the RPCs to the various converter programs, according to the names of the procedures. Those routing rules are defined in the configuration file of the RRGX.
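Routing by procedure name can be pictured as a lookup in an ordered table of rules. The Python sketch below is only an illustration: the real format of the RRGX configuration file is not described here, so the rule list, the wildcard patterns and the destination names are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical routing rules: (procedure name pattern, destination).
# The first matching rule wins.
RULES = [
    ("bio_*", "xml2syb_bio"),
    ("acct_*", "xml2syb_acct"),
]


def route(proc_name, rules, default=None):
    """Return the destination for proc_name, or raise LookupError."""
    for pattern, destination in rules:
        if fnmatchcase(proc_name, pattern):   # case-sensitive wildcards
            return destination
    if default is None:
        raise LookupError("no route for procedure %r" % (proc_name,))
    return default
```

The client never sees this table; it sends every call to the gateway, which picks the converter program.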
The RRGX needs one converter per database protocol. Such converter programs are external programs built from the vendor's API libraries and the XMLTP/L modules. Currently, XML2SYB (XML to Sybase) is already implemented.
It also is possible to code stored procedures in Python in a server similar to the RRGX, which is derived from the generic gxserver.py module. The RRGX also has many small features that make it useful in a real production environment:
log of events and messages (error, warning, trace & debug)
dynamically adjustable trace level
built-in operator commands, such as ps, who, stats and so on
gates that can be closed while maintenance is done on the database(s) behind the gateway: an RPC gate (global), connection pool gates, a connection gate (global) and application-specific gates (one gate for a list of RPC names). When a gate is closed, a user-defined message can be sent back to the clients.
RPC result interceptions (with passive queueing or active re-forwarding)
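The gate feature in the list above can be sketched in a few lines of Python. This is not the RRGX code: the `Gate` class and the `admit` function are hypothetical, and only a global RPC gate and application-specific gates are modeled.

```python
import threading


class Gate:
    """A gate is open, or closed with a message for the clients."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self._message = None            # None means the gate is open

    def close(self, message):
        with self._lock:
            self._message = message

    def open(self):
        with self._lock:
            self._message = None

    def closed_message(self):
        with self._lock:
            return self._message


def admit(proc_name, global_gate, app_gates):
    """Return None if the call may proceed, else the message to send back.

    app_gates is a list of (gate, set_of_rpc_names) pairs.
    """
    message = global_gate.closed_message()
    if message is not None:
        return message
    for gate, rpc_names in app_gates:
        if proc_name in rpc_names:
            message = gate.closed_message()
            if message is not None:
                return message
    return None
```

An operator command would call `close()` with the maintenance message and `open()` when the work is done.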
The choice of only four simple datatypes shows that XMLTP/L does not attempt to solve many of the needs of typical multimedia applications. It probably would be quite easy to support a BLOB (binary large object) datatype, but this is not a priority at this moment.
Another limitation in this first implementation is that the only character set currently supported is ISO-8859-1 (Latin European languages). But, given the fact that this is a simple protocol (and the fact that the source code is available under the GNU LGPL), we expect other developers to adapt it to their own cultural needs.
The XML syntax used in our project is much more compact than the one used by XML-RPC as of August 2001, when we started designing and building XMLTP/L. A typical RPC call stream or result stream is about half the size when it is encoded with XMLTP/L, because the tags in XMLTP/L are smaller and fewer.
We used four languages--GNU Bison, ANSI C, Python and Java--to do the current implementation. At this time, XMLTP/L can be used by programs written in three languages: Python, C and Java.
The main building blocks of the implementation are:
a grammar suitable for GNU Bison (in other words, a Yacc .y source file). This grammar is written in a way that allows a multithreaded parser.
a C/Python module that does parsing in a multithreaded way. That xmltp_gx.so module allows users to write both client and server programs in Python. Furthermore, it currently is used to build a gateway server, gxserver.py, that routes RPC calls according to their names. The gateway gxserver.py (or RRGX) has many other features, such as operator commands and RPC gates, that are not fully described here.
a C program, XML2SYB, that converts back and forth between XMLTP/L RPC calls and native Sybase RPC calls (TDS protocol). XML2SYB acts as an XMLTP/L server.
a Java JDBC driver (client side of XMLTP/L) that is suitable for use with Tomcat web applications or other Java programs.
Essential Client/Server Survival Guide, Robert Orfali, Dan Harkey & Jeri Edwards. New York, John Wiley & Sons, Inc., 1994.
"Scale Up with TP Monitors", BYTE, April 1995, pp. 123-128.
Note 1, *TP Light*: TP Light means doing light OLTP (on-line transaction processing) by using RPCs to call procedures stored in a SQL database. This could be a Sybase, Oracle, IBM DB2 or any other DBMS.
Jean-Francois Touchette has been developing software for 20 years. Since 1985, he has written several gateways and servers with various protocols. He has been using C and UNIX since that time. When Python is suitable, he prefers it.