At the Forge - Database Modeling with Django
The past few months, this column has been examining Django, a popular open-source Web development framework written in Python. Django sometimes is described as a rival to Ruby on Rails or a Python version of Rails, but just as Python and Ruby are distinct languages, each with its own strengths and weaknesses, Django and Rails are different frameworks, and each has its own set of trade-offs.
If you have been following this series of columns about Django, you already have seen how to download and install the Django software, how to create and configure a site and application, and even how to create views (Python methods that handle the business logic) and templates (HTML files with special rules for interpolating variables and dynamic content). With everything we've looked at so far, you could presumably create an interesting dynamic Web application.
However, most modern Web applications have another component, a relational database, on which they rely for data storage and retrieval. Sure, you could store everything on the filesystem or even in memory, but for most of us, a relational database is the path of least resistance, ensuring the safety of our data while providing a great deal of flexibility in retrieving it.
This month's column, then, looks at the ways in which Django programmers can store and retrieve information in a database. If you have worked with databases only from PHP or CGI programs, you will be surprised and impressed by the degree of automation Django provides. If you have worked with Ruby on Rails, you probably will think the Django programmers are working too hard—to which Django hackers would say that they want to have full control over their application, rather than rely on behind-the-scenes magic.
The term model in the Django world describes a Python object for which there is a persistent state, presumably stored in a relational database. We don't need to use models to integrate a database into Django, but it would be difficult (not to mention unaesthetic) if we were simply to stick SQL queries into our templates. So instead, we use Django's built-in object-relational mapper, working solely with objects from within our views and templates. The mapper's job is to translate our method calls into SQL and then translate the resulting database response into Python objects.
But, before we even can create our model object, we first must have a database table to which the object will connect. Django requires that we define our model using Python code, describing the table's name, fields and even some default values.
If we were interested in keeping the blog application we started last month, we probably could define our table in PostgreSQL as follows:
CREATE TABLE Posting ( id SERIAL NOT NULL, title TEXT NOT NULL, body TEXT NOT NULL, posted_at TIMESTAMP NOT NULL DEFAULT NOW(), PRIMARY KEY(id) );
But in Django, we don't create the above SQL directly. Rather, we use Python to create it for us. For example, we can define the above table in Django as follows:
from django.db import models class Posting(models.Model): title = models.CharField(maxlength=30) body = models.TextField() publication_date = models.DateTimeField()
Our model is a Python class, which inherits from django.db.models.Model. We define each field with a particular type, using objects that we imported from django.db.models. As shown above, some of these data types can be restricted or modified from their defaults by passing parameters. Some are defined specifically because they have built-in limits, such as EmailField, which must be a valid e-mail address. By defining columns with ManyToManyField and ForeignKey objects, it's possible to define a variety of relationships among tables.
The above code should be placed in models.py, a Python file that sits within our application's directory (blog in this case), which itself sits inside our Django site directory (mysite in this case). Thus, the models for my blog application reside in mysite/blog/models.py, whereas the models for a poll application would reside in mysite/poll/models.py.
Notice that we don't have to define a primary key, which traditionally is called id and is a nonrepeating integer. (In PostgreSQL, we set it to have a SERIAL data type, which gives the column a default value taken from a newly created sequence object. In MySQL, you would set the column to AUTO_INCREMENT, which has some of the same capabilities as a sequence.) Django creates the id column for us automatically. Django handles potential namespace conflicts by prefacing the table name with the application name. So, the posting table within the blog application becomes the blog_posting table.
Now, how do we turn our Python code into SQL? First, we have to be sure Django knows which database to use. If you have been following along since my first Django article in the July 2007 issue, you already have added the appropriate lines to settings.py, a site-wide configuration file in which we define the database type, name, user and password. Here are the values that I have installed:
DATABASE_ENGINE = 'postgresql' DATABASE_NAME = 'atf' DATABASE_USER = 'reuven' DATABASE_PASSWORD = '' DATABASE_HOST = '' DATABASE_PORT = '5433'
It's also important to check that the application is defined in INSTALLED_APPS, a tuple of strings. On my system, INSTALLED_APPS looks like this:
INSTALLED_APPS = ( 'django.contrib.auth', 'django.contrib.contenttypes', 'django.contrib.sessions', 'django.contrib.sites', 'django.contrib.admin', 'mysite.blog' )
Notice the clear namespace distinction between my application (mysite.blog) and the applications that are included with Django (django.contrib.*).
Before we turn our Python code into SQL, we first should check to make sure it passes some basic sanity and validation checks. To do that, we go to our site's home directory, and type:
python manage.py validate
If all goes well, Django will report that there aren't any errors. Now that our model has been validated, we can use it to create SQL. The easiest way to do this is with the sqlall command to manage.py:
python manage.py sqlall blog
This produces the SQL output for our database driver (PostgreSQL, in this case). For example, this is the output that I see on my system:
BEGIN; CREATE TABLE "blog_posting" ( "id" serial NOT NULL PRIMARY KEY, "title" varchar(30) NOT NULL, "body" text NOT NULL, "publication_date" timestamp with time zone NOT NULL ); COMMIT;
To their credit, the Django developers wrap the CREATE TABLE statement between BEGIN and COMMIT, ensuring that the table creation will take place in a transaction and will be rolled back if there is a problem. This isn't an issue when creating only one table, but if we have several models, it's always best to leave the database in a consistent state.
One way for us to use the output from sqlall to create tables is to copy it from the terminal window and then paste it, either into a file or into the psql client program. But, Django provides the syncdb utility to do this for us:
python manage.py syncdb
The output from this reassures us that all is well:
Creating table blog_posting Loading 'initial_data' fixtures... No fixtures found.
And, sure enough, now we can see that our table has been added:
atf=# \d blog_posting id | integer | not null default nextval('blog_posting_id_seq'::regclass) title | character varying(30) | not null body | text | not null publication_date | timestamp with time zone | not null Indexes: "blog_posting_pkey" PRIMARY KEY, btree (id)
Voilà! We now have a model that we can access via Python methods, but that exists in our relational database.
|Designing Electronics with Linux||May 22, 2013|
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
- New Products
- Linux Systems Administrator
- Senior Perl Developer
- Technical Support Rep
- UX Designer
- Designing Electronics with Linux
- Dynamic DNS—an Object Lesson in Problem Solving
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- Reply to comment | Linux Journal
3 hours 10 min ago
- Reply to comment | Linux Journal
3 hours 26 min ago
- Favorite (and easily brute-forced) pw's
5 hours 18 min ago
- Have you tried Boxen? It's a
11 hours 10 min ago
- seo services in india
15 hours 41 min ago
- For KDE install kio-mtp
15 hours 42 min ago
- Evernote is much more...
17 hours 42 min ago
- Reply to comment | Linux Journal
1 day 2 hours ago
- Dynamic DNS
1 day 3 hours ago
- Reply to comment | Linux Journal
1 day 4 hours ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?