At the Forge - Databases and Calendars
Last month, we continued our look at the iCalendar standard, which makes it possible for programs to exchange calendar and appointment information. As we saw, an iCalendar file contains one or more events and tasks. If we make the file available via an HTTP server such as Apache, we can distribute it to anyone with an iCalendar-compatible program, such as Mozilla's Sunbird. We also saw that we can go one step further, generating an iCalendar file dynamically using a CGI program.
Although the programs I presented last month might be useful in a limited context, it should be clear to any Web developer that hard-coding date and time information inside a program is foolish, to say the least.
One of the best ways to keep track of such data is in a relational database such as PostgreSQL. A relational database allows you to ensure that the data you have entered is valid and provides you with fast, flexible access to some or all of the data it contains. Moreover, by storing the calendar information in a database, you can generate multiple versions of the same calendar file from a single source of data.
This month, we look at a simple example of a Web-based program that takes calendar information from a relational database and uses it to generate an iCalendar data file, which then can be imported into iCalendar-compliant programs, such as Mozilla's Sunbird.
If we are going to store our calendar information in a relational database, we need to define at least one table. This is because everything in a relational database—often including configuration and status information—is stored in a two-dimensional table, in which the columns define individual fields, and each row contains one record. For example, here is how we might define a simple table of events in PostgreSQL:
CREATE TABLE Events (
    event_id         SERIAL    NOT NULL,
    event_summary    TEXT      NOT NULL CHECK (event_summary <> ''),
    event_location   TEXT      NOT NULL CHECK (event_location <> ''),
    event_start      TIMESTAMP NOT NULL,
    event_end        TIMESTAMP NOT NULL,
    event_timestamp  TIMESTAMP NOT NULL DEFAULT NOW(),
    PRIMARY KEY(event_id)
);
The above table contains six columns. The first, event_id, is defined to be of type SERIAL. If we don't explicitly provide a value for event_id when adding a row to the table, PostgreSQL assigns a new integer value automatically, up to a maximum of 2^31 - 1. PostgreSQL allows you to set a larger ceiling, to allow the sequence to wrap around to 1, or both; see the documentation for more details.
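For instance, if 2^31 - 1 values might not be enough, the column could be declared as BIGSERIAL rather than SERIAL, and the sequence that PostgreSQL creates behind a SERIAL column can be told to cycle. Both statements below are standard PostgreSQL, although the sequence name is simply the one PostgreSQL generates by default for our Events table:

-- Raise the ceiling to 2^63 - 1 by declaring the column as:
--     event_id BIGSERIAL NOT NULL,

-- Or allow the existing sequence to wrap around to 1 when
-- it reaches its maximum value:
ALTER SEQUENCE events_event_id_seq CYCLE;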
The event_id column uniquely identifies rows in our table, and we tell the database this by marking it as a PRIMARY KEY. This not only tells other database programmers which column will be used for retrieving records, but also ensures that its values are unique and that the column is indexed.
Another automatically populated column is event_timestamp. From the definition, it might appear as though we can (and will) set event_timestamp to an explicit value, with the current time providing a default as necessary. In practice, when I define a column in this way, it means that I never expect to set its value explicitly. Rather, I let PostgreSQL populate the column with the current date and time at the moment the row is inserted.
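To see both automatic columns in action, here is a minimal example; the event data itself is made up, and the sequence name is the one PostgreSQL generates by default:

INSERT INTO Events (event_summary, event_location,
                    event_start, event_end)
    VALUES ('Staff meeting', 'Main office',
            '2004-11-15 10:00', '2004-11-15 11:00');

-- Within the same session, currval() reports the event_id that
-- the SERIAL column just assigned:
SELECT currval('events_event_id_seq');

-- The event_timestamp column was filled in by the NOW() default:
SELECT event_timestamp FROM Events
    WHERE event_id = currval('events_event_id_seq');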
Notice how the event_summary and event_location columns are both defined to be of type TEXT (that is, unlimited-length text fields), while event_start, event_end and event_timestamp are all of type TIMESTAMP, the SQL-standard way of saying date and time.
All of the columns in this table are defined to be NOT NULL, meaning that they may not be assigned SQL's undefined value of NULL. NULL is distinct from true and false, which can make it a bit tricky for newcomers to understand. However, if you think of NULL as representing an unknown or undefined value, it might become clearer. As useful as NULLs can be in distinguishing between false and unknown values, it's usually a good idea to cut down on them as much as possible. Indeed, the advice that I have long heard, and repeated to others, is that you should define columns to be NOT NULL by default, opening them up to NULL values as the situation requires.
Finally, notice how our two text columns (event_summary and event_location) are defined both as NOT NULL and with an integrity check that ensures we enter something other than an empty string. Whether this combination of constraints is appropriate depends on your application; consider whether NULL values or empty strings ever make sense for the summary and location columns.
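To see these constraints at work, here are two INSERT statements that PostgreSQL will reject; the error messages shown in the comments are paraphrased:

-- Fails the CHECK constraint, because the summary is an empty string:
INSERT INTO Events (event_summary, event_location,
                    event_start, event_end)
    VALUES ('', 'Main office',
            '2004-11-16 09:00', '2004-11-16 10:00');
-- ERROR:  new row for relation "events" violates check constraint

-- Fails the NOT NULL constraint, because no location is provided:
INSERT INTO Events (event_summary, event_start, event_end)
    VALUES ('Budget review',
            '2004-11-16 09:00', '2004-11-16 10:00');
-- ERROR:  null value in column "event_location" violates
--         not-null constraint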
Although this simple definition is meant to serve as an example, consider how much better it would be if we were to have a separate Locations table, with a location_id and location_name, and then replace the textual event_location column with a location_id. This would have the advantage of standardizing location names, which would lead to fewer inconsistencies. It also would allow us to search for all of the events taking place in a particular location.
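Here is one sketch of what such a normalized design might look like; the table and column names are only suggestions, not part of the schema we use in this article:

CREATE TABLE Locations (
    location_id    SERIAL NOT NULL,
    location_name  TEXT   NOT NULL CHECK (location_name <> ''),
    PRIMARY KEY(location_id)
);

-- In Events, the textual event_location column would then be
-- replaced with a foreign key:
--     event_location_id  INTEGER NOT NULL REFERENCES Locations,

Finding every event at a given location then becomes a simple join:

SELECT E.event_summary, E.event_start
    FROM Events E, Locations L
    WHERE E.event_location_id = L.location_id
      AND L.location_name = 'Main office';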
Once we are done defining the table, we add some indexes. Each index speeds up retrieval of rows based on the indexed column, at the expense of additional time for each INSERT or UPDATE. Here are the definitions:
CREATE INDEX event_location_idx ON Events(event_location);
CREATE INDEX event_start_idx ON Events(event_start);
CREATE INDEX event_end_idx ON Events(event_end);
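These indexes pay off in the kind of query our calendar-generating program is likely to issue, such as retrieving every event within a given date range. The range below is purely illustrative:

SELECT event_summary, event_location, event_start, event_end
    FROM Events
    WHERE event_start >= '2004-11-01'
      AND event_end < '2004-12-01'
    ORDER BY event_start;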