Quantcast
Username/Email:  Password: 

SQLite 3.0.7

 in
New features include UTF-8 support, manifest typing and improved concurrency.


D. Richard Hipp's SQLite database engine has earned a well-respected
place in the toolbox of many programmers. Its small size
and simple distribution make it a natural choice for standalone and
embedded applications. Wide support by many programming
languages, including PHP, has made SQLite popular for Web applications
that need persistent data storage but don't need the kind of
multi-user scaling capabilities provided by server-based solutions.
Unicode Support
Version 3 of SQL adds support for UTF-8 data. If your content doesn't happen
to be in English, this addition obviously is of great benefit to you.
According to the SQLite author, version 3 also has internal support for UTF-16
data. If you need that feature at the moment, however, you
need to implement it yourself or make arrangements with the author to
have that functionality built in.
Data Typing
One of the great peculiarities of the SQLite database has been
its typeless storage--any data of any size can be stuffed into any
field. That feature hasn't changed in version 3, but a new feature called
manifest typing has been added. With manifest typing, the database engine
attempts to make the data stored in a field be of the same general type
as the declared type of the field. The engine still stores any data in any field,
but if the type of the data matches the declared type of the field,
the database engine then stores the data in a binary format that
takes up less space.

Manifest typing means that a programmer can query the database for the
type of an individual field in a result set with the
sqlite3_column_type() function call. The types that are stored are
integer, float, text, blob and null.

Version 3 also adds support for BLOBS. This means, for example,
that it would be perfectly reasonable to store photos of the desired items
or the family members themselves in my family wish list database.
Application Interface
The most apparent change that most previous SQLite users notice is the prefix
changes. SQLite 3 functions all begin with sqlite3_ instead of
sqlite_, as in older versions.

Old methods of data access still are available, including sqlite3_exec()
and sqlite3_get_table(). For the uninitiated, sqlite3_exec takes a
callback function as one of its parameters and calls this
function for each row in the data set. The sqlite3_get_table function
returns the entire results set in a single-character array.

The preferred way to interact with the database in SQLite 3 is the
prepared statement interface. Advanced users of SQLite 2.8.x might
recognize prepared statement routines as an extension of the
experimental callback free vm interface to SQLite.
Prepared Statements
ODBC programmers have been able to use prepared statements for a long
time. In addition to extending the older virtual machine interface to
SQLite, prepared statements offer a number of convenience features for
programmers. For those who haven't seen a prepared statement, an example
might be:


     SELECT id, name, email FROM user WHERE name = ?

The question mark represents a wildcard, a place where real data is
bound before the query actually runs. The query is prepared
once, and for each execution, actual data is bound to the wildcards.
This eliminates the need for careful, paranoid parsing of user-supplied
data for killer characters and potential exploits.

Prepared queries also offer some savings in terms of speed. Internally,
a database must convert a statement to a set of parameters that define
the actual steps to be carried out in the search. This can represent
some computational expense, especially if a single query is executed
repetitively. Removing the compilation step for all but the first
invocation of the query saves processor time. This saved times adds up in
applications such as data conversions or modifications, where hundreds or
thousands of identical queries may need to be run.

The best way to understand the new API is to see it in use. Here, I show
some examples from an e-commerce wish list application on which I'm
currently working. The particular code shown comes from a command-line
utility for generating shopping lists. The complete code for the
example can be downloaded (see Resources).

The first change to note is the changed connection command.


sqlite3* db;

if (sqlite3_open("wishlist.db", &db))
  error_handler(db);

As with other SQLite functions, sqlite3_open now returns an integer
result code rather than a connection handle. The second parameter
then becomes a pointer to a pointer to an sqlite3 structure. Any
result other than SQLITE_OK (0) causes control to flow to my error
handler. That function shows another of the API changes:


void error_handler(sqlite3* db) {

  fprintf(stderr, "Error: %s\n", sqlite3_errmsg(db));
  exit(EXIT_FAILURE);

}

The error string function now returns a pointer rather than asking
for a pointer to a character pointer as a parameter. Although it's not in
keeping with the rest of the library, it's rather pointless to return
an integer error code from this function, so there was no benefit to
putting the programmer through pointer gymnastics.

In my program, I left the statement processing, binding and execution
in the main function and moved the data gathering out to a second
function.


#define ITEM_SQL "SELECT * FROM item WHERE user_id=?"

struct item;

struct item* item_new();
void item_delete(struct item*);
struct item* item_populate(sqlite3_stmt*, struct item*);

int* get_user_ids();
void print_wishlist(struct item*);

int main() {

  sqlite3_stmt* itemqry;
  struct item* wishlist;
  int* user_id;
  int i;

  user_id = get_user_ids(); /* Magic function */

  /* Prepare the statement for use, many times over */
  if (sqlite3_prepare(db, ITEM_SQL, strlen(ITEM_SQL), &itemqry, NULL))
    error_handler(db);
  
  for(i=0; user_id[i]; i++) {
    /* Data binding */
    if (sqlite3_bind_int(itemqry, 1, user_id[i]))
      error_handler(db);
    
    /* Actual data gathering */
    while((rc = sqlite3_step(itemqry)) == SQLITE_ROW) 
      wishlist = item_populate(itemqry, wishlist);

    /* Use the data we've collected */
    print_wishlist(wishlist);
    
    /* Get ready for the next user_id*/
    sqlite3_reset(itemqry);
  }

  /* Final cleanup */
  sqlite3_finalize(itemqry);
  return EXIT_SUCCESS;
}

The code has been commented lightly to make it a little more clear
what is happening. For sqlite function, we start with the statement
preparation. Notice that it is outside of any loops; once we've done
it, we don't need to do it again. The itemqry pointer is the handle
that we're going to use for all of our other steps.

Inside of the for loop, we need to rebind the single wildcard in the
statement each time through the loop. There's no recompilation, which
saves code and processor time.

In the execution step we're responsible for looping over all of the
data to collect it. The sqlite3_step function handles that for us
and returns SQLITE_ROW whenever a call results in actual data. A
more robust program would look for the other possible values, which
include SQLITE_BUSY and SQLITE_DONE. In practical terms, anything
other than SQLITE_ROW means that our data gathering process is done.

Every time I get a row of data, I'm shipping off my prepared statement
to a data handler function, item_populate();. This function is
responsible for extracting the data from the row, populating an item
structure and appending it to our linked list of items.

Once I have my data, I do whatever processing is necessary, which in
this case is a printing function.

Finally, I call sqlite3_reset() on my prepared statement to free up
any consumed by the query. It's worth noting that sqlite3_reset()
doesn't change or remove the wildcard values from the prepared
statement. If you step through the query again, you get the same
result set.

As a last step, here I show you how data is extracted from a result set in SQLite
3. The new retrieval features are much more sophisticated than the
string parsing necessary with previous versions.


struct item {

  int id;
  char* description;
  char externalkey[41];
  struct item* next;

};

struct item* item_populate(sqlite3_stmt* s, struct item* top) {

  struct item* cur;
  struct item* i;
  int size;
  const char* hold;

  i = item_new();
  i->id = sqlite3_column_int(s, 0);
  hold = sqlite3_column_text(s, 1);
  if (hold)
    strncpy(i->externalkey, hold, 40);
  size = sqlite3_column_bytes(s, 2);
  hold = sqlite3_column_text(s, 2);
  if (hold) {
    i->description = (char*)calloc(1, size + 1);
    strncpy(i->description, hold, size);
  }

  /* Linked list handling code */
  if (top == NULL) return i;
  for(cur = top; cur->next ; cur = cur->next);
  cur->next = i;

  return top;

}

The sqlite3_column family of functions extract the data from an
individual column. Columns can be indexed numerically only, so it's a
good idea to know the order in which your columns will be returning. If
you don't know, you can inquire about the name of a specific column
with sqlite3_column_name().

Some of my columns are allowed to have null values, and I need to take
precautions to address that. That's why, rather than copying the
results of sqlite3_column_text() directly to the externalkey element
of my item structure, I assign it to a temporary pointer. This way, I
can check to make sure it isn't a NULL before copying a fixed length
of the string into my array. If you haven't worked with SQLite before,
it's important to remember that the declared size of a string in the
database has no bearing on the size of the string stored or returned.
The strn series of string functions are vital to dealing with this
data.

Finally, we have the code for handling the last column of our data,
which is a potentially large text object. sqlite3_column_bytes()
tells me how large the object is, which lets me allocate sufficient
space for copying the data.

There's plenty more to the API, but hopefully these few examples have
shown you enough to get you going. Features I didn't address include
the ability to define your own collating sequences, implement a
database permissions system and define your own database functions.
Improved Concurrency
Concurrency is the ability for multiple processes to access the same
data set at the same time. Although disk-based databases in general
aren't the first choice for multiple-user applications, it is entirely
possible to use them with multiple processes accessing the same
database. SQLite uses a coarse-grained locking system, requiring
exclusive access to the database file when it is writing but allowing
multiple processes to read from the database.

If a steady stream of processes is reading data, it was possible
under the old version of SQLite for those processes to block a writing process
from getting exclusive access. Under version 3, a process trying to
write to the database can signal its intent and prevent new processes
from starting their reading until the writing process has had its
turn.

In simpler turns, look at the database as a street intersection. If
there isn't much traffic, everything flows smoothly and rapidly even
with only a two-way stop sign to regulate traffic. Now imagine trying
to use that same two-way stop to regulate traffic flow in downtown
Chicago at 5pm on a Friday. If you've been unfortunate enough to be
in that crowd, you know that a stop sign isn't going to suffice. It
takes a traffic cop to keep traffic moving at all.

What this means for you is if you're using SQLite for a
multithreaded or multi-user application, you should spend less time
waiting for locks.
Final Notes
I already was sold on SQLite for a wide variety of applications. The
new features make the database easier to use and suitable for
a wider array of applications. I expect that we'll be seeing SQLite
turning up in more and more places.
Resources
SQLite Home Page

Working Downloadable Example

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Re: SQLite 3.0.7

Anonymous's picture

One interesting application of SQLite is in the monotone distributed version control system.

Compare with The Lack of a Small Unified Database

Anonymous's picture

I expect that we'll be seeing SQLite turning up in more and more places.
Hopefully, one of them will be as a common format between KOffice and OpenOffice, as explained in The Lack of a Small Unified Database.
Please vote for the corresponding feature requests if you agree:
KOffice
OpenOffice
Thanks for the article,
Marco Fioretti

Post new comment

  • Allowed HTML tags: <a> <em> <strong> <cite> <code> <pre> <ul> <ol> <li> <dl> <dt> <dd> <i> <b>
  • Lines and paragraphs break automatically.
  • Use to create page breaks.

More information about formatting options