Some months ago I had a discussion with NoSQL creator, Carlo Strozzi, regarding the databases.
I should admit, I am an SQL fan! It's hot having the same language, no matter which platform or database engine is used. He underlined the fact that most SQL engines lack of flexibility and waste system resources (memory and disk space) because of their multi-platform environment (such as Oracle, DB2, Informix, etc.).
He suggested I have a look at the white paper that inspired him: “The UNIX Shell As a Fourth Generation Language” by Evan Schaffer (firstname.lastname@example.org) and Mike Wolf (email@example.com).
Quoting from the above paper:
... almost all [database systems] are software prisons that you must get into and leave the power of UNIX behind. [...] The resulting database systems are large, complex programs which degrade total system performance, especially when they are run in a multi-user environment. [...] UNIX provides hundreds of programs that can be piped together to easily perform almost any function imaginable. Nothing comes close to providing the functions that come standard with UNIX.
The UNIX file structure is the fastest and most readily-available database engine ever built: directories may be viewed as catalogs and tables as plain ASCII files. Commands are common UNIX utilities, such as grep, sed and awk. Nothing should be reinvented.
NoSQL was born with these ideas in mind: getting the most from the UNIX system, using some commands that glue together various standard tools. Although NoSQL is a good database system, this is not a panacea for all you problems. If you have to deal with a 10 gigabytes table that must be updated each second from various clients, NoSQL doesn't work for you since it lacks of performance on very big tables, and on frequent updates you must be in real time. For this case, I suggest you use a stronger solution based on Oracle, DB2 or such packages on a Linux cluster, AS/400 or mainframes.
However, if you have a web site containing much information and more reading occurs than writing, you will be surprised how fast is it. NoSQL (pronounced noseequel, as the author suggests) derives most of its code from the RDB database developed at RAND Organization, but more commands have been built in order to accomplish more tasks.
The latest NoSQL source code can be found at ftp://ftp.linux.it/pub/database/NoSQL, but RPM and Debian packages are also available. At the time of writing, latest stable version is 2.1.3.
Just unpack the source using the command tar -xzvvf nosql-2.1.3.tar.gz in a convenient directory (such as $HOME/src), and you will get all the code in the nosql-2.1.3 subdirectory. Enter the above subdirectory and do the following:
./configure make make install
The software will put its engine into /usr/local/lib/nosql, its documentation in /usr/local/doc/nosql and a symlink /usr/local/bin/nosql that points to the real executable (/usr/local/lib/nosql/sh/nosql). You can change the directory prefix (e.g., /usr instead of /usr/local) invoking ./configure --prefix=/usr.
You should copy the sample configuration file to the $NSQLIB directory (i.e., /usr/local/lib/nosql). This is not required but it's useful for changing some parameters via configuration file instead of variables. The commands
cp nosql.conf.sample /usr/local/lib/nosql/nosql.conf chmod 0664 /usr/local/lib/nosql/nosql.conf
will copy it with the right permissions. You can optionally have a personal NoSQL configuration file creating an $HOME/.nosql.conf with 0664 permission applied.
Although NoSQL installation is quite simple, I suggest you to read the INSTALL file: the author gives some good tips.
Now that the package has been installed, let's start getting acquainted with NoSQL commands through an example.
We are the usual Acme Tools Inc. which supplies stuffs to the Toonies land. We would like to track our customers, so we should create a first table in which we will list some customers details (such as code, phone, fax, e-mail, etc...). The best way to create tables from scratch is via template files.
A template file contains the column names of the table and associated optional comments separated by tabs and/or spaces. Comments may be also specified with the usual hash sign (#) as the first character of the line. The file below, customer.tpl, is our template for the customer table.
# Acme Tools, Inc. # Customers table ####################### CODE Code number NAME Name/Surname PHONE Phone no EMAIL E-mail
Most of the NoSQL commands read from STDIN and write to STDOUT. The maketable command, which builds tables from templates, is one of these. Issuing the command
nosql maketable < customer.tplwe'll get the table header on STDOUT:
CODE NAME PHONE EMAIL ---- ---- ----- -----Great, but we should keep it in a file. We could simply redirect the command output to a file, e.g.,
nosql maketable < customer.tpl > customer.rdbbut this wouldn't be the right way. The write command may be helpful in this case, as it reads a table from STDIN (a simple header in this case) and writes a file, checking data integrity.
The resulting command would be
nosql maketable < customer.tpl | nosql write -s customer.rdb.
The -s switch in the write operator suppress STDOUT, e.g., nosql write, similar to the tee UNIX utility, writes both to file and STDOUT, unless -s is specified.
Pay attention, because the write command does not do any locking on the output table: nosql lock table and nosql unlock table must be used for this purpose.
|Speed Up Your Web Site with Varnish||Jun 19, 2013|
|Non-Linux FOSS: libnotify, OS X Style||Jun 18, 2013|
|Containers—Not Virtual Machines—Are the Future Cloud||Jun 17, 2013|
|Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer||Jun 12, 2013|
|Weechat, Irssi's Little Brother||Jun 11, 2013|
|One Tail Just Isn't Enough||Jun 07, 2013|
- Speed Up Your Web Site with Varnish
- Containers—Not Virtual Machines—Are the Future Cloud
- Linux Systems Administrator
- Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer
- Non-Linux FOSS: libnotify, OS X Style
- Senior Perl Developer
- Technical Support Rep
- UX Designer
- RSS Feeds
- Reply to comment | Linux Journal
23 min 50 sec ago
- Reply to comment | Linux Journal
4 hours 23 min ago
- Yeah, user namespaces are
5 hours 39 min ago
- Cari Uang
9 hours 11 min ago
- user namespaces
12 hours 4 min ago
12 hours 30 min ago
- One advantage with VMs
14 hours 59 min ago
- about info
15 hours 32 min ago
15 hours 33 min ago
15 hours 34 min ago
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?