SQL vs. NoSQL
So what about plain-old SQL RDBMSes? Should they be retired from active service? Are they a relic from an earlier time? Not so fast.
First and foremost, ACID transactions most definitely are required in certain use cases. Databases used by banks and stock markets, for example, always must give correct data. Where money is concerned, guessing is not allowed. It is true that no one really cares if your latest tweet takes a couple minutes to show up in your Twitter feed, but the same cannot be said for a billing system or accounting database.
Another thing in favor of RDBMSes is their use of SQL. It's a common language, and if you need to move from one database to another, you usually can get away with making only minor changes to your application, and it will “just work”. True, it may not be possible in all cases, depending on how you used or abused the SQL queries in your application, but the foundation for moving easily between different SQL databases is there, and the tools and libraries you can use to interact with your data are plentiful and robust. A unified NoSQL standard query language or API will never exist because every NoSQL database is so different.
On the NoSQL side, the only thing in common is that there is nothing in common. Each NoSQL database has its own set of APIs, libraries and preferred languages for interacting with the data they contain. With an RDBMS, it is trivial to get data out in whatever format you need using whatever programming language you like best. Your choice of a NoSQL database might limit you to one or a handful of programming languages and access methods.
Another thing RDBMSes have going for them is the relational model. The R in RDBMS traces its history back to research by E. F. Codd published in the June 1970 issue of Communications of the ACM. Since then, it has been expanded upon, improved and clarified. The relational model for databases is so popular because it is an excellent way to organize information. It maps very well to an enormous variety of real-world data storage needs, and when properly normalized, it is fast and efficient.
In the relational model, data is stored in tables with rows and columns. An address table, for example, might have columns for street name and number, city, postal code, state or province, and country. A name table might have columns for given names, family name, prefixes (Dr, Rev, Ms and so on) and suffixes (Jr, Sr, Esq and so on). Each row in the individual tables would represent an individual address or name.
The relational part (see the What Does Relational Mean in a Relational Database? sidebar) comes into play as you define which addresses relate to which names using a key. A key is a field (the intersection of a row and column) or combination of fields in a single row that is guaranteed to identify uniquely that particular row in the table it is in. For the address table, you might have a column for keys from the name table. You can use this key to look up just those addresses in the address table that “belong” (by virtue of the key) to a certain name in the name table.
My example is pretty simplistic, but when combined with ACID transactions in an RDBMS, you achieve tremendous power, flexibility and reliability. There is a reason that businesses began using them decades ago and why open-source RDBMSes dominate the Web.
And, what about the Web? The primary argument many people use against RDBMSes is that they “don't scale”, which simply isn't true. It is true that some individual RDBMSes do not scale very well or are harder to scale, but that doesn't mean every RDBMS cannot. RDBMSes are in use at every large company. The largest RDBMS installations routinely handle enormous traffic and petabytes of data.
This scaling myth is perpetuated and given credence every time popular Web sites announce that such-and-such RDBMS doesn't meet their needs, and so they are moving to NoSQL database X. The opinion of some in the RDBMS world is that many of these moves are not so much because the database they were using is deficient in some fundamental way, but because it was being used in a way for which it wasn't designed. To make an analogy, it's like people using flat-head screwdrivers to tighten Phillips-head screws, because it worked well enough to get the job done, but now they've discovered it is better to tighten Phillips screws with an actual Phillips screwdriver, and isn't it wonderful, and we should throw away all flat-head screwdrivers, because their time is past, and Phillips is the future.
One recent SQL-to-NoSQL move involved Digg.com moving from MySQL to Cassandra. As part of the move, Digg folks blogged about how they were using MySQL and why it didn't meet their needs. Others were skeptical. Dennis Forbes, in a series of posts on his site (see Resources), questioned whether Digg needed to use a NoSQL solution like Cassandra at all. His claims centered on what he considered very poor database usage on the part of Digg combined with inadequate hardware. In his mind, if Digg had just designed its database properly or switched to using SSDs in its servers, it would have had no problems. His best quote is this: “The way that many are using NoSQL is like discovering the buggy whip at the beginning of the automotive era.” Ouch.
Relational databases sometimes can be tricky to design properly. You have to know and understand your data deeply. But when they are designed properly, the performance can be orders of magnitude better compared to poorly designed databases. You also should not overlook the hardware on which your database runs. Databases love as much memory and processing power as you can throw at them, and the traditional spinning-platter disk drive has long been a limiting factor. Does the high performance of SSDs herald a new age of RDBMS performance? Many experts say yes. SSDs may be a game-changer in the database world.
Relational SQL databases have been around for several decades. They have proven reliability and performance and a feature set that meets the requirements of 99% of the use cases out there. They even make excellent key-value databases, if that's the type of data you have. There are only very few companies that can't make a relational database work for them. You may not like to hear it, but with the law of averages, chances are your company is not one of them.
Today’s modular x86 servers are compute-centric, designed as a least common denominator to support a wide range of IT workloads. Those generic, virtualized IT workloads have much different resource optimization requirements than hyperscale and cloud applications. They have resulted in a “one size fits all” enterprise IT architecture that is not optimized for a specific set of IT workloads, and especially not emerging hyperscale workloads, such as web applications, big data, and object storage. In this report, you will learn how shifting the focus from traditional compute-centric IT architectures to an innovative disaggregated fabric-based architecture can optimize and scale your data center.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
Free Webinar: Linux Backup and Recovery
Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.
In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.
| Using Salt Stack and Vagrant for Drupal Development | May 20, 2013 |
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
| Non-Linux FOSS: Seashore | May 10, 2013 |
| Trying to Tame the Tablet | May 08, 2013 |
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- New Products
- Validate an E-Mail Address with PHP, the Right Way
- Drupal Is a Framework: Why Everyone Needs to Understand This
- A Topic for Discussion - Open Source Feature-Richness?
- Home, My Backup Data Center
- New Products
- RSS Feeds
- Tech Tip: Really Simple HTTP Server with Python
- Epistle
4 min 5 sec ago - Automatically updating Guest Additions
1 hour 12 min ago - I like your topic on android
1 hour 59 min ago - Reply to comment | Linux Journal
2 hours 20 min ago - This is the easiest tutorial
8 hours 34 min ago - Ahh, the Koolaid.
14 hours 13 min ago - git-annex assistant
20 hours 12 min ago - direct cable connection
20 hours 35 min ago - Agreed on AirDroid. With my
20 hours 45 min ago - I just learned this
20 hours 49 min ago




Comments
Yap what more can you say
Yes this article exactly matches the situation today with SQL/NoSQL movements.
Very good analysis.