SQLAlchemy

Although it sometimes might seem as if relational databases have gone the way of the dinosaur, making way for non-relational (NoSQL) databases, such as MongoDB and Cassandra, a very large number of systems still depend on a relational database. And, although there is no requirement that a relational database use SQL as its query language, it's a rare database product that does not do so.

The good news is that SQL is relatively easy to work with, particularly when the queries are straightforward. It's fast and easy to create tables, insert data into them, update that data and write queries that retrieve some or all parts of the data. SQL also makes it fairly easy to combine ("join") information from multiple tables, letting you normalize the data, while keeping speed and flexibility at a maximum.

SQL might not be difficult to work with on its own, but you rarely work with it in a vacuum. Usually, your SQL statements reside within a program you have written. The SQL is kept as a text string within the application and is then sent, via a network socket, to the server.

There are several problems with this. First, it means you have to mix two different languages within the same program. Inside your Web application, which you've worked hard to write, and which you try to ensure is maintainable, you have code in a totally separate language, inside strings, which you cannot test or maintain directly.

Even if the SQL queries weren't written inside strings, you still would be faced with the fact that the majority of your Web application is written in one language, but your data-manipulation routines are written in another language. A Web application contains, no matter how you slice it, components in HTML, CSS and JavaScript, as well as whatever server-side language you're using. Adding SQL to this can only complicate things further.

Even if SQL and a typical server-side language were on equal footing in terms of the syntax of a Web application, there's a fundamental mismatch between the ways in which they handle data. SQL operates with rows and columns within tables; everything in a relational database has to fit into this table-centric view of the world. By contrast, modern programming languages have a rich variety of data structures and typically are object-oriented to some degree or another.

Libraries that bridge the gap between procedural code and SQL are known as object-relational mappers, or ORMs. ORMs typically represent database records as instances of a particular class. In order to represent 50 records, you would need 50 instances of a class, with the state of each instance reflecting the names, types and values of the columns in that record.

There are two basic paradigms for passing data between the object-oriented data structures and the database, both of which were described by Martin Fowler. In the first paradigm, known as Active Record, each instance is tied directly to a row in the database, and the class itself (as well as each object) is responsible for ensuring that the data is saved to the database. In other words, Active Record requires that you create a single class, and that it handles both sides of the object-relational divide. The Active Record class in Ruby on Rails is (not surprisingly) an implementation of this paradigm and provides a great deal of power and flexibility.

A second paradigm is known as Data Mapper, and it requires the use of three different object classes: a class that represents the data itself at the object level, a class that represents the database table and a "mapper" object that acts as a go-between, ensuring that the object and relational parts of the system are appropriately synchronized.

An excellent and popular example of the Data Mapper paradigm can be found in the SQLAlchemy project. SQLAlchemy has been around for a number of years already, and makes it possible to work with relational databases flexibly from within your Python program, without having to write any SQL.

In this article, I take a look at SQLAlchemy, exploring a number of its options and features, and considering how it can be used in Web and other applications.

Connecting to a Database

Installing SQLAlchemy should be straightforward to anyone who has installed Python packages before. You can get SQLAlchemy from PyPi, the Python Package Index, either by downloading it from http://pypi.python.org or by using the easy_install or pip programs to retrieve and install it. I was able to install it with:


pip install sqlalchemy

You might need to install SQLAlchemy as root. Or, you can install it into a virtual machine, using the popular virtualenv package for Python, which gives you nonroot control and permissions over a Python environment.

You also will need to install a driver for the database you intend to use. My favorite relational database is PostgreSQL, and I use the psycopg Python driver, also available on PyPi and (by extension) via pip.

I should note that although I know SQLAlchemy works with Python 3, much of the work I do nowadays is still in Python 2, mostly because that's what my clients are using. My examples, thus, also will be in Python 2, although I believe they will work in Python 3 with little or no change.

Let's assume you have a database table, People:


CREATE TABLE People (id SERIAL PRIMARY KEY,
                     first_name TEXT,
                     last_name TEXT,
                     email TEXT,
                     birthday DATE);

Let's also add some initial records:


INSERT INTO People (first_name, last_name, email, birthday)
    VALUES ('Reuven', 'Lerner', 'reuven@lerner.co.il', '1970-jul-14'),
           ('Foo', 'Bar', 'foobar@example.com', '1970-jan-1');

In order to access this table using SQLAlchemy's ORM, you first need to create a database session object, which itself must be created using an "engine". Each database driver has its own style of URL. In the case of PostgreSQL accessed via the psycopg2 driver, you would use something like this:


dburl = 'postgresql+psycopg2://reuven:reuven@localhost/atf'

This URL indicates not only the database and driver type, but also my user name and password ("reuven" in both cases), the hostname (localhost) and the name of the database I'll be accessing ("atf"). If the database is not available at the default PostgreSQL port of 5432, you can specify that as well in the URL.

You then tell Python to create a new engine based on this URL:


from sqlalchemy import create_engine
engine = create_engine(dburl)

Now that you have the engine defined, you can create a session based on this engine. Doing so requires two steps: first you create a new, custom Session class for this engine, and then you create an instance of the Session class that you will use to access the database:


from sqlalchemy.orm import sessionmaker # import sessionmaker class
Session = sessionmaker(bind=engine)     # make custom session type
session = Session()                     # make instance of session

You're now connected to the database! But, that's not quite enough. If you want to map your database table to one or more Python objects, you need to define a class. You do this by defining a normal Python class, with a few subtle changes:

  • The class must inherit from Base, a class returned from the declarative_base function provided by SQLAlchemy.

  • The database columns must be defined as class attributes, as instances of the SQLAlchemy-provided Column class.

  • You connect the class with your database table by defining the __tablename__ class-level attribute.

For example, the following Python class provides a mapping to the People database table:


from sqlalchemy import Column, Integer, String, DateTime
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class Person(Base):

    __tablename__ = 'people'
    id = Column(Integer, primary_key=True)
    first_name = Column(String)
    last_name = Column(String)
    email = Column(String)
    birthday = Column(DateTime)

    def __init__(self, firstname, lastname, email, birthday):
        self.first_name = firstname
        self.last_name = lastname
        self.email = email

It might not be obvious at first glance, but this class implements the Data Mapper design pattern. The class attributes that you have defined describe the columns in the database table and can contain a great deal of detail, including indexes, uniqueness requirements and even custom integrity constraints, such as those provided by PostgreSQL. The class itself is a standard Python class.

But behind the scenes, because this class inherits from Base, you get a number of other attributes, including __mapper__, which indicates how your Python class will be mapped to the database table. You can see this in an interactive Python shell by asking to see the printed representation of Person.__mapper__:


Person.__mapper__
<Mapper at 0x10af5ef90; Person>

You now have a session that connects to the database and a table in the database that has been described in Python. You now can execute a query against your table:


for p in session.query(Person):
    print p.first_name

That gives the following:


Reuven
Foo

In other words, session.query is executing a query against the database, without you having to specify the SQL. You also can restrict the records you'll get, by chaining the filter_by method to your query:


for p in session.query(Person).filter_by(id=1):
    print p.first_name

That gives the following:


Reuven

Note that the filter_by method is not acting on the results of session.query. Rather, it is changing the SQL that eventually is sent to the database. You can see this by assigning printing to the query object without executing it or putting it in an iteration context:


print session.query(Person).filter_by(id=1)

SELECT people.id AS people_id,
       people.first_name AS people_first_name,
       people.last_name AS people_last_name,
       people.email AS people_email,
       people.birthday AS people_birthday
FROM people
WHERE people.id = :id_1

You also can see from this query that SQLAlchemy binds parameters to variables inside your query, rather than directly placing your values. Not only does this allow you to re-run queries later with different variable values, but it reduces the possibility that you will suffer from an SQL injection attack, which still is surprisingly common.

You also can order the results:


for person in session.query(Person).order_by('first_name'):
    print person.first_name

Foo
Reuven

And, you also can do all of the basic "CRUD" activities that you would expect. For example, you can create a new instance of your Person class and then save it to the database:


p = Person('newfirst', 'newlast', 'new@example.com', '1-jan-2012');
session.add(p)
session.commit()

Notice how I can handle multiple inserts (or other actions) inside a single transaction by only issuing session.commit() after adding several objects. Similarly, I can update the object and the corresponding row in the database:


p.first_name = '!!!'
session.add(p)
session.commit()

I also can delete the object:


session.delete(p)
session.commit()

______________________

Reuven M. Lerner, Linux Journal Senior Columnist, a longtime Web developer, consultant and trainer, is completing his PhD in learning sciences at Northwestern University.

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

I collected so many

Laila's picture

I collected so many interesting things from your site especially its discussion. From the tons of comments on your posts, I guess I am not the only one having all the enjoyment here. Keep up the good work.
voyance serieuse par mail

Reply to comment | Linux Journal

Chauffage's picture

plafond rénover sable bétoneuse pompe à chaleur air air chauffage chauffage pompe à chaleur air eau pompe à chaleur piscine pac air
air monter parler ils la le camion pompes à chaleur devis
pompe à chaleur pompe à chaleur air air climatisation
il entreprit pompe à chaleur prix pompe à chaleur devis pompe à chaleur climatisation vous

Reply to comment | Linux Journal

climatisation réversible's picture

les travaux nous sommes climatisation portable climatisateur le soleil installateur climatisation
installateur climatisation climatiseur
gainable climatiseur pas cher entretien climatisation climatisation maison
climatiseur monobloc climatiseur gainable pierre travailler bien climatiseur
mobile sans evacuation les travaux il a dit prix clim reversible

Is SQL language something to avoid?

ΝΤΕΝΤΕΚΤΙΒ's picture

I totally agree that many developers find SQL quite hard to learn language. The usage of Views, Stored Procedures and Functions is a way to isolate a Data Model from your code / view model.

The main problem is that the effort needed in this case is higher than just embedding the SQL language inside our code.

On the other hand, most of the developers are completely anaware of these features and many times I have heard the complain of cross platform problems that may caused when using the RDBS spedific features for these powerfull abstration mechanisms: if I write Stored Procedures it will be difficult to change my RDBMS in the future - a case that never happens for many real world systems anyway.

Finally, the Stored Procedures are the most powerfull tool when you need a PRODUCTION LEVEL with HIGH PERFORMANCE system as you can minimize the round trips from Web or Application Server (that executes your business logic in a 3 tier architecture) and have total control over DB access from your application.

What is always the case in programming is that if you want to produce high quality production level (pro) products you need to work harder!

Regards,

ΝΤΕΤΕΚΤΙΒ

I find that I often want to

Anonymous's picture

I find that I often want to keep the articles in Linux Journal for later reference. There should be a link to download them as a pdf file.

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix