The Python DB-API
Many people use Python because, like other scripting languages, it is a portable, platform-independent and general-purpose language that can perform the same tasks as the database-centric, proprietary 4GL tools supplied by database vendors. Like 4GL tools, Python lets you write programs that access, display and update the information in the database with minimal effort. Unlike many 4GLs, Python also gives you a variety of other capabilities, such as parsing HTML, making socket connections and encrypting data.
Possible applications for the Python DB-API include:
- Many web sites construct pages on the fly to display information requested by the user, such as selections from the products offered in a catalog or from the documents available in a library. Doing this requires CGI scripts that can extract the desired information from a database and render it as HTML.
- An Intranet application might use the Tkinter module and the DB-API to provide a graphical user interface for browsing through a local database, such as accounts receivable or a customer list.
- Python programs can analyze data by computing its statistical properties.
- Python programs can form a testing framework for programs that modify the database, verifying that a particular integrity constraint is maintained.
There are lots of commercial and freeware databases available, and most of them support the Structured Query Language (SQL) for retrieving and adding information (see Resources). However, while most databases have SQL in common, each one exposes its own client API for submitting SQL statements, and the authors of the early Python database modules each wrapped those APIs in an interface of their own invention. The resulting proliferation of incompatible modules caused problems: no two were exactly alike, so if you decided to switch to a new database product, you had to rewrite all the code that retrieved and inserted data.
To solve the problem, a Special Interest Group (SIG) for databases was formed. People interested in using Python for a given application form a SIG of their own: they meet on an Internet mailing list, where they discuss the topic, exchange ideas, write code (and debug it) and eventually produce a finished product. (Sounds a lot like the development process for the Linux kernel, doesn't it?)
After some discussion, the Database SIG produced a specification for a consistent interface to relational databases—the DB-API. Thanks to this specification, there's only one interface to learn. Porting code to a different database product is much simpler, often requiring the change of only a few lines.
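To see why a port can be so small, consider a short sketch; the module names below are invented for illustration. In a DB-API program, only the import statement and the call that opens the connection are specific to a particular driver; everything after that point uses the common interface.

    # Hypothetical sketch: the module names firstdb and seconddb are
    # made up for illustration.  Switching database products usually
    # means editing only the import and the connection call.
    import firstdb                # after the port: import seconddb
    db = firstdb.connect('...')   # connection arguments are driver-specific
    cursor = db.cursor()          # from here on, everything is pure DB-API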
The database modules written before the Database SIG produced its specification are still around and don't match it; the mSQL module is the most commonly used of these. They will eventually be rewritten to comply with the DB-API; it's just a matter of the maintainers finding the time to do it.
A relational database is made up of one or more tables. Each table is divided into columns and rows. A column contains items of a similar type, such as customer IDs or prices, and a row holds a single record, with one value for each column. A row is also called a tuple, and a table of such tuples is a relation, which is where the term “relational database” originates.
For an example database, we'll use a pair of small tables designed to track the attendees for a series of seminars. (See Listing 1.) The Seminars table lists the seminars being offered; an example row is (1, Python Programming, 200, 15). Each row contains a unique identifying ID number (1, in this case), the seminar's title (Python Programming), its cost ($200), and the number of places still open (15). The Attendees table lists the name of each attendee, the seminar that he or she wishes to attend and whether the fee has been paid. If someone wants to attend more than one seminar, there will be more than one row with that person's name, with each row having a different seminar number and payment status.
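Listing 1 itself isn't reproduced here, but from the description above the schema can be sketched as follows; the column names and types are my own guesses, not the ones from the original listing:

    # A plausible reconstruction of the Listing 1 schema; the column
    # names and types are assumptions based on the description above.
    CREATE_SEMINARS = """CREATE TABLE Seminars (
        SeminarID   INTEGER,      -- unique identifying number
        Title       VARCHAR(80),  -- e.g. 'Python Programming'
        Cost        INTEGER,      -- fee in dollars
        PlacesLeft  INTEGER       -- number of places still open
    )"""
    CREATE_ATTENDEES = """CREATE TABLE Attendees (
        Name     VARCHAR(80),     -- attendee's name
        Seminar  INTEGER,         -- SeminarID of the chosen seminar
        Paid     CHAR(1)          -- 'Y' once the fee has been paid
    )"""

Either statement can be handed to the database through the cursor.execute() call shown in the session below.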
The examples below use the soliddb module, which supports accessing SOLID databases from Python. SOLID is a product from Solidtech that was reviewed by Bradley Willson in LJ, September, 1997. I'm not trying to cover CGI or Tkinter programming, so only the commands to access the database are presented here, in the same manner as if typed directly into the Python interpreter.
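Here is a minimal sketch of what such a session looks like. The arguments to connect() are placeholders, and the constructor's exact name and signature may differ in the real soliddb module (check its documentation); the cursor and connection methods, however, are the ones the DB-API prescribes.

    # A sketch of a DB-API session; the connection arguments below are
    # placeholders, not the real soliddb parameters.
    import soliddb
    db = soliddb.connect('datasource', 'username', 'password')
    cursor = db.cursor()

    # Retrieve every row of the Seminars table.
    cursor.execute('SELECT * FROM Seminars')
    for row in cursor.fetchall():
        print(row)             # e.g. (1, 'Python Programming', 200, 15)

    # Sign someone up for seminar 1, then make the change permanent.
    cursor.execute("INSERT INTO Attendees VALUES ('Jane Doe', 1, 'Y')")
    db.commit()

fetchall() returns the entire result set as a list of tuples; fetchone() would return one row at a time instead, which is kinder to memory when a query matches many rows.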