PostgreSQL—the Linux under the Databases
One milestone in the spreading of Linux for general use is the availability of office applications. Besides text-processing, spreadsheets and graphic applications, databases are an important part of this type of application. Until now, the big players in the database market like Sybase and Oracle haven't released a Linux version of their products; nevertheless, there are several database packages available for Linux, mostly commercial applications. There is one database which deserves special attention, because it is now developed similarly to Linux. This database is PostgreSQL, formerly known as Postgres95. Last October, version 6.2.1 was published, and this is a good time to take a closer look at this project.
PostgreSQL has a long history beginning in the year 1986 when Michael Stonebraker began its development as a successor to the Ingres database system. The main goal of his project was to show that a relational database system could cope with modern demands of extensibility as well as an object-oriented system. The resulting product was called an object-relational database because it was a relational database system with some object-oriented features such as inheritance and user-defined functions, operators and access methods. In 1994, with the release of version 4.2, the work on Postgres stopped because Stonebraker's project ended. However, some staff members decided to continue with their work and in 1995 released Postgres95 with SQL as the query language rather than PostQuel. Now the development is carried on under the name PostgreSQL by a team of volunteers on the Internet, coordinated by Marc G. Fournier.
The package is not released under the GPL. It has its own license that allows the distribution of modified releases even without source code, as long as the copyright notice remains untouched.
Client/Server Database for a multi-user environment
Networking with TCP/IP
Three Authentication methods: Kerberos, host/user based and username/password authentication
SQL as query language (It is not fully ANSI SQL92 compliant; options not available are nested subqueries, HAVING clause in an ORDER BY statement, OUTER JOIN, PRIMARY and FOREIGN KEY statements during table creation.)
Multiple index types, unique indexes and multi-column indices
User-defined functions (SQL, C), operators and data types
User-defined sequences and trigger functions
Language interfaces for C, C++, Objective-C, Java, Perl, Tcl/Tk and Python
Available third party ODBC driver
Ported for Linux/Intel, Linux/Solaris, Linux/Alpha, AIX, DEC Alpha AXP, FreeBSD, BSDI, DG/UX, HP-UX, Nextstep, Solaris x86, Solaris Sparc, SunOS Sparc, SGI Irix, SCO, Intel SVR4, Ultrix
Client/Server databases are different from the monolithic systems familiar to users of MS-DOS and Windows machines. It is not sufficient to start only one program to work with the database. There are at least three different applications running simultaneously. One is the server, which in PostgreSQL is called the postmaster. It is a daemon which observes a TCP port (usually port 5432) for a connection from the second: a user client. This user client could be psql, a command-line tool for sending SQL queries that comes with PostgreSQL, or specialized applications, like a CGI program that wishes to store data gathered from an HTML form. Once a client connects, the postmaster starts the third application: the back-end server. This is the actual database engine, because only the back-end server has direct access to the stored records of the database. The postmaster connects the client with the back-end server and then waits for other connections. In the meantime, the client sends its queries to the back-end and receives its reply, usually a lot of data. When the client has no other query, the connection drops and the back-end server exits.
The consequence of this procedure is that the client, the program that the user works with, has no direct access to the database. It doesn't know how many records the back end sends in response to a query. It doesn't know if the inserted data were successfully processed, unless the back end signals success. Therefore, fancy table editors such as those in dBase or Access are difficult to implement and don't currently exist.
The installation of PostgreSQL is a little tedious at present, but it becomes easier with every release. Version 6.2.1 comes with configure-support and detailed and comprehensive installation instructions with specific notes for Linux. I will mention some possible pitfalls during the installation process. Make sure you have a recent version of flex installed on your system. In particular v2.5.3 does not work with PostgreSQL whereas v2.5.2 and v2.5.4 work fine. For the client psql, you need the developer libraries from GNU Readline, which are not installed by default on some Linux distributions.
On Linux, PostgreSQL is compiled with a shared library named libpq.so, which is used by all client applications. You need to make sure that the dynamic linker finds this library by adding the directory $POSTGRESDIR/lib to the file ld.so.config. Especially with older versions of ld-linux.so (e.g., v1.8.5), it is necessary to also add this directory to the environment variable LD_LIBRARY_PATH. I do this in my rc.local file, which starts up the postmaster, so it looks like this:
echo "starting PostgreSQL postmaster..." export LD_LIBRARY_PATH=/opt/lib/pgsql/lib su postgres -c "postmaster -D/opt/lib/pgsql/data \ -d 1 &> /opt/lib/pgsql/postmaster.log &"
In the contrib/linux directory of the source distribution, a more sophisticated startup script named postgres.init can be found, that integrates the postmaster startup into the RedHat runlevel system.
Before starting the postmaster, edit the file pg_hba.conf in the data directory. This file holds information, indicating which hosts and users are allowed to connect to the database back end. The minimal file necessary for local operation is:
# TYPE DATABASE IP_ADDRESS MASK USERAUTH MAP host all 127.0.0.1 255.255.255.255 trust # The above allows any user on the local system to # connect to any database under any user name.
As mentioned in the documentation, it is essential not to start the postmaster as root, but rather to start it in a special postgres user ID.
For clients to work, they must know where to find the databases. Therefore, you have to set the environment variables PGDATA and PGLIB for every user. I do it in /etc/profile. Additionally, every user who should have access to PostgreSQL needs to have appropriate access permissions in the system databases. This is done with the createuser program, run by the postgres account. createuser offers you the option of giving the user the right to create databases and/or the right to add other users. It is possible to say no to both questions, thus giving the user access only to the existing databases. If you are running PostgreSQL as a WWW database back end, you must set access permissions for the user ID of the web server.
Practical Task Scheduling Deployment
July 20, 2016 12:00 pm CDT
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality allows UNIX system administrators to seem to always have the right tool for the job.
Cron traditionally has been considered another such a tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to consider upgrading your job scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers.Register Now!
- Murat Yener and Onur Dundar's Expert Android Studio (Wrox)
- SUSE LLC's SUSE Manager
- Tech Tip: Really Simple HTTP Server with Python
- My +1 Sword of Productivity
- Non-Linux FOSS: Caffeine!
- Managing Linux Using Puppet
- Doing for User Space What We Did for Kernel Space
- Parsing an RSS News Feed with a Bash Script
- Google's SwiftShader Released
- SuperTuxKart 0.9.2 Released
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide