PostgreSQL, the NoSQL Database

JSON and JSONB

It has long been possible to store JSON inside PostgreSQL. After all, JSON is just a textual representation of JavaScript objects ("JavaScript Object Notation"), which means that they are effectively strings. But of course, when you store data in PostgreSQL, you would like a bit more than that. You want to ensure that stored data is valid, as well as use PostgreSQL's operators to retrieve and work on that data.

PostgreSQL has had a JSON data type for several years. The data type started as a simple textual representation of JSON, which would check for valid contents, but not much more than that. The 9.3 release of PostgreSQL allowed you to use a larger number of operators on your JSON columns, making it possible to retrieve particular parts of the data with relative ease.

However, the storage and retrieval of JSON data was never that efficient, and the JSON-related operators were particularly bad on this front. So yes, you could look for a particular name or value within a JSON column, but it might take a while.

That has changed with 9.4, with the introduction of the JSONB data type, which stores JSON data in binary form, such that it is both more compact and more efficient than the textual form. Moreover, the same GIN and GIST indexes that now are able to work so well with HStore data also are able to work well, and quickly, with JSONB data. So you can search for and retrieve text from JSONB documents as easily (or more) as would have been the case with a document database, such as MongoDB.

I already have started to use JSONB in some of my work. For example, one of the projects I'm working on contacts a remote server via an API. The server returns its response in JSON, containing a large number of name-value pairs, some of them nested. (I should note that using a beta version of PostgreSQL, or any other infrastructural technology, is only a good idea if you first get the client's approval, and explain the risks and benefits.)

Now, I'm a big fan of normalized data. And I'm not a huge fan of storing JSON in the database. But rather than start to guess what data I will and won't need in the future, I decided to store everything in a JSONB column for now. If and when I know precisely what I'll need, I will normalize the data to a greater degree.

Actually, that's not entirely true. I knew from the start that I would need two different values from the response I was receiving. But because I was storing the data in JSONB, I figured it would make sense for me simply to retrieve the data from the JSONB column.

Having stored the data there, I then could retrieve data from the JSON column:


SELECT id, email,
       personal_data->>'surname' AS surname
       personal_data->>'forename' as given_name
  FROM ID_Checks
 WHERE personal_data->>'surname' ilike '%lerner%';

Using the double-arrow operator (->>), I was able to retrieve the value of a JSON object by using its key. Note that if you use a single arrow (->), you'll get an object back, which is quite possibly not what you want. I've found that the text portion is really what interests me most of the time.

Conclusion

People use NoSQL databases for several reasons. One is the impedance mismatch between objects and tables. But two other common reasons are performance and convenience. It turns out that modern versions of PostgreSQL offer excellent performance, thanks to improved data types and indexes. But they also offer a great deal of convenience, letting you set, retrieve and delete JSON and key-value data easily, efficiently and naturally.

I'm not going to dismiss the entire NoSQL movement out of hand. But I will say that the next time you're thinking of using a NoSQL database, consider using one that can already fulfill all of your needs, and which you might well be using already—PostgreSQL.

Resources

Blog postings about improvements to PostgreSQL's GiN and GIST indexes, which affect the JSON and HStore types:

PostgreSQL documentation is at http://postgresql.org/docs, and it includes several sections for each of HStore and JSONB.

______________________

Reuven M. Lerner, Linux Journal Senior Columnist, a longtime Web developer, consultant and trainer, is completing his PhD in learning sciences at Northwestern University.