Open-Source Databases, Part III: Choosing a Database
For years, one of the claims made in the MySQL/PostgreSQL flame war has had to do with speed. MySQL fans often have claimed that their system is faster, particularly for read-only tasks, making it a superior choice for Web sites where most data is read. PostgreSQL advocates, in contrast, claim that their system holds up to big loads much better than MySQL.
I haven't conducted any benchmarks of my own, but my reluctance to do so is an admission that I'm unqualified to create a good benchmark, and not that I believe the two systems are identical or that performance isn't important. Moreover, as I stated previously, I believe that performance is secondary to data integrity. I would much rather have a slow, reliable database than a fast one that occasionally will wreak havoc on my data.
From the benchmarks I've seen, it appears that MySQL is indeed faster than PostgreSQL when working with a small number of clients or with read-only data. However, all of the comparisons I've seen over the last few years indicate that as more clients are added to the system, PostgreSQL handles the load better.
Does this mean that PostgreSQL always will be faster? Of course not. But, it does mean that on particularly popular sites, PostgreSQL may hold up better.
Maybe I'm simply naive, but I decided several years ago that I would largely ignore the performance debate when it came to databases. Both MySQL and PostgreSQL have large followings and have been used on large-scale systems. The data seems to indicate that PostgreSQL has an advantage, but enough people are using MySQL on large Web sites that I have to assume it is working well enough for them.
Finally, no comparison would be complete without mentioning support. We might consider several types of support—from the strength of the Open Source community to the number and quality of companies supporting (and developing) the software to the number of third-party applications that support each database.
It is impossible to ignore the extremely large number of MySQL users in the world. This has led to an outpouring of books, tutorials and mailing lists for MySQL—some (but not all) of which have been sponsored by the MySQL company itself. If the community-based support is not enough, it is possible to buy commercial support for MySQL from a number of companies, including MySQL AB.
PostgreSQL has a smaller community, and a smaller number of books and tutorials available. However, my experience has been that the community is responsive to questions and suggestions, and that the lead developers often are quite willing to answer questions from all levels of users.
Many open-source packages support both MySQL and PostgreSQL. But, it is rare to find a package that supports PostgreSQL exclusively, and it is easy to find packages that support MySQL alone. This has been a source of some frustration for members of the PostgreSQL community; however, there doesn't seem to be much anyone can do about it, short of asking for patches or contributing such patches.
A recent thread on the main PostgreSQL mailing list asked about CRM packages that support the database. Although there were a few, there was definitely some grumbling about the lack of PostgreSQL from other open-source projects. Those projects often are staffed by small groups of volunteers who rarely understand how they can make their SQL more portable and thus easier to use on multiple brands of databases.
The bottom line on support is that although PostgreSQL support is excellent, MySQL support is overwhelming. If there is a winner here, it's MySQL.
So, should you pick MySQL or PostgreSQL for your next database task? All things being equal, I strongly recommend PostgreSQL. Its community might be smaller, and there are fewer resources available in print and on the Web. But, it has more features to ensure data integrity, its features are largely a superset of MySQL, and it always offers transactions and referential integrity, without having to specify a particular type of table.
That said, there are reasons to use MySQL: if you already are using it, if you need commercial or community support, if you need replication, or if you are using software that is incompatible with PostgreSQL, MySQL is a fine choice. Just make sure to use InnoDB tables, so that you can take advantage of what a database always was meant to do—ensure the quality of the data.
A table comparing administration and programming of the two databases is available at linuxboxadmin.com/articles/postgresql-for-mysql-users.php.
A relatively recent comparison of the two databases' performance is at www.mysqlperformanceblog.com/2006/11/30/interesting-mysql-and-postgresql-benchmarks, which points to the following: tweakers.net/reviews/657.
Finally, a comparison between the databases (but perhaps a bit out of date), along with Oracle, was conducted at CERN, the European center for particle physics, and is available at dcdbappl1.cern.ch:8080/dcdb/archive/ttraczyk/db_compare/db_compare.html.
Reuven M. Lerner, a longtime Web/database consultant, is a PhD candidate in Learning Sciences at Northwestern University in Evanston, Illinois. He currently lives with his wife and three children in Skokie, Illinois. You can read his Weblog at altneuland.lerner.co.il.
|Designing Electronics with Linux||May 22, 2013|
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
- Designing Electronics with Linux
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Dynamic DNS—an Object Lesson in Problem Solving
- Using Salt Stack and Vagrant for Drupal Development
- New Products
- Build a Skype Server for Your Home Phone System
- Validate an E-Mail Address with PHP, the Right Way
- Why Python?
- A Topic for Discussion - Open Source Feature-Richness?
- Drupal Is a Framework: Why Everyone Needs to Understand This
3 hours 3 min ago
- Reply to comment | Linux Journal
3 hours 11 min ago
- Understanding the Linux Kernel
5 hours 25 min ago
7 hours 55 min ago
- Kernel Problem
17 hours 58 min ago
- BASH script to log IPs on public web server
22 hours 25 min ago
1 day 2 hours ago
- Reply to comment | Linux Journal
1 day 2 hours ago
- All the articles you talked
1 day 4 hours ago
- All the articles you talked
1 day 5 hours ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?