Beagle SQL, A Client/Server Database for Linux

Mr. Klein introduces us to a database called Beagle SQL that he developed as a learning experience.
Connection Manager

This is probably the most straightforward piece of the client-server puzzle. The Connection Manager is simply a loop waiting for incoming messages from client processes. First a socket is opened for the “beagled” service (defined in your /etc/services file), so that the Connection Manager can listen for incoming connections. Then an endless loop is entered. Once the Connection Manager receives a signal from a client processes, a call to accept() returns the socket number that the client and server processes will communicate through. At this point, the Connection Manager fork()s the Database Manager, passing the socket number returned by accept(). After the Database Manager is successfully started, the Connection Manager starts listening for the next incoming connection.

Database Manager

The Database Manager does all the work. The basic components of the Database Manager are the expression parser, the query optimizer (currently, no query optimization is done in BSQL), the index manager, the locking manager and the low-level I/O manager. The SELECT statement is the most complex operation performed by the Database Manager. As BSQL supports explicit joins, a single SELECT statement can search through several tables to return the requested information. The expression parser must be intelligent enough to tell which fields in the SELECT list belong to which tables. If you are joining two tables with duplicate field names, the SELECT statement must explicitly state which field belongs to which tables. Wild cards are allowed. When the expression parser sees wild cards in the field list, it inserts the appropriate field names into the list.

There are three examples in the sidebar to give you an idea of how the expression parser does its work. The statement in Example 3 fails because field1 is ambiguous. The expression parser can't tell if it belongs to table A or table B as both have a field called field1.

Examples

When joining tables, the SELECT statement's WHERE clause can contain several different parts that need to be treated separately. When joining two tables these parts can include the conditionals for the first table, the conditionals for the second table and the join condition. The Database Manager searches each of the two tables using the appropriate conditionals from the WHERE clause. Next, it joins the two tables into a temporary table using the fields in the SELECT field list as well as the fields used in the join condition. Last, the records in the temporary table are matched with the join condition and the appropriate records are made available for retrieval by the client process.

This operation can get quite complicated and time consuming when dealing with large and multiple tables. This is where the query optimizer comes in. Its purpose is to determine the most efficient order to search and join the tables. BSQL currently doesn't do query optimization and joins the tables from left to right as they appear in the SELECT statement. This leaves it up to the person writing the SQL statement to put some thought into the order the tables appear in the join.

When performing searches, the Database Manager uses a set of low-level I/O routines to retrieve records from the database. Most commercial database vendors use proprietary file systems to house their databases. In the case of BSQL, the Linux file system sufficed. A future enhancement will be a flexible file format that can allow for such things as BLOBs, images, text documents and anything else. (A BLOB is a large binary datatype used to store images, sound bites, programs, etc. in a database table.)

The method used to store these variable-length records will significantly impact the performance of the Database Manager. When a record is written to the database, it is broken down into fixed-size segments. The database administrator can set the size of these segments for each database. If a record containing 850 bytes is written to a table that uses 256 byte segments, it is broken down into four segments that are chained together. If at a later time the record size is changed to 1200 bytes, an additional segment is added to the chain. If the record is reduced to 700 bytes, the unneeded segments are marked for reuse. One drawback here is that over time the database can become fragmented. Routine maintenance using a de-fragmenting utility should be performed on databases that see a lot of UPDATEs and DELETEs. This utility will be provided with the first official release of BSQL v1.0.

Conclusion

Hopefully, I've given you some insight into how client server databases work and the many late nights that go into their development. For more information on Beagle SQL, point your web browser to http://www.beaglesql.org/. Here you can follow its development history from day one to present as well as download the code to try it out for yourself. Also, be sure to look into the other freely available database resources on the Web.

Rob Klein has been a Developer/Administrator for 11 years. Although software development can be fun, his main interests are football, baseball and spending quiet evenings with this wife Cathy (absolutely not in that order). He welcomes your comments sent to rvklein@ober.com.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

SQL SMO (SQL Management

Black Jack Schule's picture

SQL SMO (SQL Management Object) is a collection of assemblies that shipped together with SQL 2005. This collection of assemblies is all that you need to control your SQL Server. Be it SQL 2005, or SQL 2000, you can control/manipulate it programmatically using SQL SMO.

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState