Beagle SQL, A Client/Server Database for Linux
This is probably the most straightforward piece of the client-server puzzle. The Connection Manager is simply a loop waiting for incoming messages from client processes. First a socket is opened for the “beagled” service (defined in your /etc/services file), so that the Connection Manager can listen for incoming connections. Then an endless loop is entered. Once the Connection Manager receives a signal from a client processes, a call to accept() returns the socket number that the client and server processes will communicate through. At this point, the Connection Manager fork()s the Database Manager, passing the socket number returned by accept(). After the Database Manager is successfully started, the Connection Manager starts listening for the next incoming connection.
The Database Manager does all the work. The basic components of the Database Manager are the expression parser, the query optimizer (currently, no query optimization is done in BSQL), the index manager, the locking manager and the low-level I/O manager. The SELECT statement is the most complex operation performed by the Database Manager. As BSQL supports explicit joins, a single SELECT statement can search through several tables to return the requested information. The expression parser must be intelligent enough to tell which fields in the SELECT list belong to which tables. If you are joining two tables with duplicate field names, the SELECT statement must explicitly state which field belongs to which tables. Wild cards are allowed. When the expression parser sees wild cards in the field list, it inserts the appropriate field names into the list.
There are three examples in the sidebar to give you an idea of how the expression parser does its work. The statement in Example 3 fails because field1 is ambiguous. The expression parser can't tell if it belongs to table A or table B as both have a field called field1.
When joining tables, the SELECT statement's WHERE clause can contain several different parts that need to be treated separately. When joining two tables these parts can include the conditionals for the first table, the conditionals for the second table and the join condition. The Database Manager searches each of the two tables using the appropriate conditionals from the WHERE clause. Next, it joins the two tables into a temporary table using the fields in the SELECT field list as well as the fields used in the join condition. Last, the records in the temporary table are matched with the join condition and the appropriate records are made available for retrieval by the client process.
This operation can get quite complicated and time consuming when dealing with large and multiple tables. This is where the query optimizer comes in. Its purpose is to determine the most efficient order to search and join the tables. BSQL currently doesn't do query optimization and joins the tables from left to right as they appear in the SELECT statement. This leaves it up to the person writing the SQL statement to put some thought into the order the tables appear in the join.
When performing searches, the Database Manager uses a set of low-level I/O routines to retrieve records from the database. Most commercial database vendors use proprietary file systems to house their databases. In the case of BSQL, the Linux file system sufficed. A future enhancement will be a flexible file format that can allow for such things as BLOBs, images, text documents and anything else. (A BLOB is a large binary datatype used to store images, sound bites, programs, etc. in a database table.)
The method used to store these variable-length records will significantly impact the performance of the Database Manager. When a record is written to the database, it is broken down into fixed-size segments. The database administrator can set the size of these segments for each database. If a record containing 850 bytes is written to a table that uses 256 byte segments, it is broken down into four segments that are chained together. If at a later time the record size is changed to 1200 bytes, an additional segment is added to the chain. If the record is reduced to 700 bytes, the unneeded segments are marked for reuse. One drawback here is that over time the database can become fragmented. Routine maintenance using a de-fragmenting utility should be performed on databases that see a lot of UPDATEs and DELETEs. This utility will be provided with the first official release of BSQL v1.0.
Hopefully, I've given you some insight into how client server databases work and the many late nights that go into their development. For more information on Beagle SQL, point your web browser to http://www.beaglesql.org/. Here you can follow its development history from day one to present as well as download the code to try it out for yourself. Also, be sure to look into the other freely available database resources on the Web.
- My Childhood in a Cigar Box
- Tech Tip: Really Simple HTTP Server with Python
- Papa's Got a Brand New NAS
- Applied Expert Systems, Inc.'s CleverView for TCP/IP on Linux
- Rogue Wave Software's TotalView for HPC and CodeDynamics
- Returning Values from Bash Functions
- Panther MPC, Inc.'s Panther Alpha
- Simplenote, Simply Awesome!
- NethServer: Linux without All That Linux Stuff