The Lack of a Small Unified Database

Why a single-file SQL standard format is necessary and how SQLite can get us there.

Many desktop users do not have the skills or, frankly, any real need to install and manage a multiuser database server. They still need, however, to use and exchange SQL databases. Unfortunately, many developers fail to see this situation as a problem and dismiss it, at least initially, with any combination of these responses:

  • It now is so easy to install PostgreSQL, MySQL or any other RDBMS server that desktop users should just install one of them.

  • Program X also is available for Windows/Mac/whatever, so just install it.

  • If anything, we only need to improve the interface and/or the documentation to do the above.

Nothing technically is wrong with these arguments, but they simply don't apply to the niche sector of SOHO and corporate users. Nobody in this sector is whining because installing a server requires one or two mouse clicks instead of zero. But there is a huge difference between a perfect, completely documented install wizard for any SQL server and what those users currently are getting.

Everybody says how great file sharing is. Okay, so answer this one: how can a Linux/KOffice geek share his book or recipe database with his aunt, who is running Windows/OO.o? How can an employee send a product database from his Mac/OO.o desktop to a potential corporate customer running Solaris/KOffice? What if the receiver has no root password or permission to install extra software, which is the standard situation in most offices? In general, in the real world, the "just install and configure this" attitude doesn't make sense and doesn't help information flow freely. It can be just as impractical, if not impossible, as having to install a new font or print server simply to open a text file.

Currently, free software users are missing a single-file SQL standard format, which may be a tar or ZIP archive, that contains everything a generic frontend needs to let people work: schemas, data, indexes, form structures and so on. Such databases could be copied immediately, uploaded to a Web server or sent by e-mail, the same as any other file. Users would have the certainty that the receiver immediately could access all the data, queries and forms, even if they might look different. Above all, it would be great if such a file format became an OASIS standard, because that would make it much easier to accept in corporate or government scenarios.
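As a rough illustration of the single-file idea, here is a minimal Python sketch using SQLite, the engine named in this article's subtitle. The file names, the recipes table and the sample rows are invented for the example. It also covers only schema, index and data: SQLite by itself does not store forms or frontend layouts, so a complete format still would need some container around it.

```python
import os
import shutil
import sqlite3
import tempfile


def build_recipe_db(path):
    """Create one database file holding schema, index and data (hypothetical example)."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(
            "CREATE TABLE recipes ("
            "id INTEGER PRIMARY KEY, title TEXT NOT NULL, servings INTEGER)"
        )
        conn.execute("CREATE INDEX idx_recipes_title ON recipes (title)")
        conn.executemany(
            "INSERT INTO recipes (title, servings) VALUES (?, ?)",
            [("Minestrone", 4), ("Tiramisu", 6)],
        )
        conn.commit()
    finally:
        conn.close()


def read_titles(path):
    """Open the file with no server involved and list the recipe titles."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute("SELECT title FROM recipes ORDER BY title")
        return [row[0] for row in rows]
    finally:
        conn.close()


def list_objects(path):
    """Return the (type, name) pairs of the tables and indexes stored in the file."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT type, name FROM sqlite_master "
            "WHERE name NOT LIKE 'sqlite_%' ORDER BY type, name"
        )
        return [tuple(row) for row in rows]
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "recipes.db")
    received = os.path.join(workdir, "recipes_copy.db")

    build_recipe_db(original)
    # "Sending" the database is an ordinary file copy, like any other document.
    shutil.copyfile(original, received)

    titles = read_titles(received)
    objects = list_objects(received)

print(titles)
print(objects)
```

The receiving side needs only a program linked against the SQLite library; there is no daemon to install, no account to create and no root password to ask for.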

In the text/spreadsheet/presentation space, the Right Thing already is happening. The two most popular free office software suites, OO.o and KOffice, are converging on the same default file format, which is an OASIS standard. This means being able to write, read and share such documents between OO.o and KOffice today, and with any other OASIS-compliant application tomorrow--transparently. This level of standardization also gives much more credibility and strength to Free Software.

Wouldn't it be really great, and isn't it time, to do the same thing for simple SQL databases? Without, of course, preventing anybody who wants a full-blown RDBMS daemon from using it? Today, such databases are not covered or influenced by OASIS. I say that they should be. The rest of this article proposes a way to achieve this goal.

Status in OO.o

The first two applications that should converge on this database standard are OpenOffice.org (OO.o) and KOffice. OO.o has data sources, meaning it can connect to external RDBMS servers and can use single-file databases in dBase format, whose features simply are too limited. When I started to investigate this matter, I learned that OO.o developers already have begun working on improving support for a server-less database engine. Standardization, however, simply isn't among their objectives right now. What they currently plan to include in OO.o 2.0 (alpha snapshot available here) is an XML-based database file format that contains everything except the actual data, that is, forms, reports, queries and administrative information.

When I asked, I was told that the most probable file format choice is HSQLDB, mainly because it supports more features than its competitors do. Personally, I am against this choice for four reasons. The first is performance (read more here and here), especially considering that OO.o doesn't need to remain as heavy as it is today. The second reason is that HSQLDB requires Java, and I don't like the idea of depending on third-party elements, because it makes it more likely that these single-file DBs won't work in practice when moved from PC to PC. The third reason is that many other applications and languages in the free software arena have partially converged on something else (more on this in a moment), so I think OO.o should be a good community citizen and follow suit. Hence, my fourth reason: by not proposing a portable standard in OO.o, OO.o users would be in the position of saying to everybody else "yes, we are using this so-called 'free' software, but if you want to share small databases transparently, please force yourself to freely install OO.o, HSQLDB/Java or any combination of the above."

______________________

Articles about Digital Rights and more at http://stop.zona-m.net
CV, talks and bio at http://mfioretti.com

Comments

I think it should be mentione

Antoine

I think it should be mentioned (a long time after the release of this article) that KOffice is now standardising on the same format as OO.o. I presume that this will mean they will naturally move to the same native database format. So the standardisation is happening. Maybe this article helped?
Cheers
Antoine

Re: The Lack of a Small Unified Database

Anonymous

You forgot to mention that there's already a native OO.o 1.1.x
SQLite driver available (alpha-version, only tested on Linux).

More information here.

Re: The Lack of a Small Unified Database

Anonymous

I submitted this article some time ago. As far as I remember (should check
my notes at home to be sure), that page had not been mentioned back then by any
of the developers I spoke with, but I might be wrong.
Thanks for pointing it out; of course it helps to reach the unified database
goal.

Marco F.

unified database

eric drake

In the DOS world I used to use a simple integrated package called Alpha Works. It had spreadsheet, database, word processor, communications and shell access all in one piece of software. The database format was dBase III+. When I moved to Windows, I began using Microsoft Works. The database service (or should I call it the "table" service) provided filtering, sorting, some formulas similar to what was available in the spreadsheet, table reporting and a form view. I have been using it for years and it has become my favorite application. The spreadsheet is more than adequate (most users aren't plotting spacecraft trajectories or performing complex matrix algebra). Excel has more than most users need and so does Word. Oddly enough, Works was free with the computer, but I have yet to find anybody around me who uses it. They all went out and bought Excel and Word, and the fact is they do far less on their systems than me.

I am a sales rep handling multiple companies and multiple customers who buy some lines but not others. The database files aren't large, 2-3000 records. The MSWorks program works totally in memory, so sorting and filters are instantaneous. I can cut and paste records easily into the spreadsheet and into my Windows version of Open Office, which I used to set up the order forms that my companies require (each requires a different order form). I use Open Office for this because the Works spreadsheet doesn't allow the embedding of graphics, and my orders often require graphic representations of what I need. I export the forms to PDF and email them to the various companies.

So what is Linux missing? (I'm writing this on my Linux system now.) Something like Microsoft Works: a fully integrated set of applications that isn't overloaded with features most people never use, allows easy cutting and pasting as well as database field embedding in form letters, has a small footprint, runs in memory and can export and import all the basic file formats (as Microsoft Works can). Then I would ditch Windows for good. Not that everything would be great with Linux. There's still the problem of driver support, especially for wireless USB devices and printers. My laptop's Linksys USB 1.0 802.11b wireless network device is not recognised, and my workhorse cheap Konica Minolta laser printer has no drivers for Linux. I know there is a ton of software available for Linux, but the average user doesn't need a ton. A Swiss army knife with a couple of good blades, a scissors and a screwdriver covers a lot of needs. Keep the corkscrew and the toothpick.
