Information Management for the Desktop

by Tom Poe

It's been tough being a newbie for these past several years. I've managed to stick with it and believe I've set a good example for being a sustainable newbie overall. The nice thing about the open-source world is that there is no stigma attached to being stuck on square one for extended periods of time. Unlike public school, where you are subjected to unbearable ridicule if you fall behind, in the open-source community there is no "behind", only "forward", "onward" and "upward". That's a good thing.

I have a PostgreSQL database on my home computer, and I must say that it's a most impressive application. I managed to combine that with another application, called pgaccess. Between the command-line interface and pgaccess, I enjoy adding information to the little databases I've created. Sometimes, I'm rewarded with having useful information to look at when I search the database with SQL queries. I've wondered from time to time, however, how I might use the database to centralize all the information on my system. For example, I have some useful information floating around in e-mail archives, never to be read again. I suppose I could learn the search/grep commands that would let me revisit old e-mails, but that doesn't seem productive to me. Now, if the old e-mails with useful information could be transferred to my database, I think that would be nifty. The following, then, is a description of what I came up with that seems simple and easy and might even be useful.

Picking the Language

On my system, which runs SuSE 7.1 Pro Edition and the Linux 2.4 kernel, I can do a lot. The terminals let me choose which shell language I want, the applications let me choose which programming language I want, and I can mix and match to my heart's content. For this little project, though, I decided to use Perl. My e-mails are text-based, and there's a site, www.cpan.org, that contains untold numbers of scripts, modules and distributions of applications for untold numbers of tasks. It's overwhelming, to say the least. Since Perl is widely regarded as the "king-of-the-hill" for text-based file manipulation, it was the obvious choice.

The Perl documentation is thorough, readable and, except for that regex thing, within the grasp of a lot of folks. Even better, my system has an application called CPAN. I simply bring up a terminal, type in the command $> perl -MCPAN -e shell (note the lowercase perl, since command names are case-sensitive) and up pops the prompt, waiting for me to type in what I need. When I do that, it goes out and finds the latest version of what I need, then downloads and installs it. Nice.

So using Perl, I set out to write a simple script that prepares an e-mail message to be added to my PostgreSQL database.

The Script

This project, remember, is not a production tool at this stage. It's designed to let a home user read her e-mail messages and, when the mood strikes, copy and paste the message to a text file. With the e-mail message in a text file, the home user can then run the Perl script, like this:

$> perl e-mailparse.pl

The e-mail message has two parts: the header lines (Date, From, To, Cc, Subject) and the message body. On my system, when I cut and paste an e-mail message to a text file, it has five lines for the header, followed by two blank lines and then the message body. The e-mail looks like this:

Date: Fri, 10 May 2002 14:40:04 -0700
From: joe@joeisp.com
To: tompoe@renonevada.net
Cc: sam@samisp.com, julie@julieisp.com
Subject: Re: Blue Coat Linux Fixer
    [ The following text is in the "iso-8859-1" character set. ]
    [ Your display is set for the "US-ASCII" character set.  ]
    [ Some characters may be displayed incorrectly. ]
 
Tom,
 
I am the - - - - rest of message. Signed, Joe

I didn't have any use for keeping the date and time, as that was already set up in my little database application. I didn't need the other lines in the header either, as I had already set up the database columns to reflect who and what and so on. So I needed a simple script to gather the message body and throw away the header information. The script, then, looks like this:

#!/usr/local/bin/perl -w
# Simple program to prepare an e-mail message for entry into a database
# This program simply discards header information and copies the message
# body, starting at line 6 of the input, onto the end of the output
# file.
use strict;
my $file = '../PerlStuff/e-mail.txt';
my $outfile = '../PerlStuff/e-mailtotal.txt';
open (IN, '<', $file) or die "Can't open $file: $!\n";
open (OUT, '>>', $outfile) or die "Can't open $outfile: $!\n";
while (<IN>) {
    if (6 .. eof) { print OUT; }   # true from line 6 through end of file
}
close IN;
close OUT;

The first line is called the "shebang" line. It identifies this file as a Perl script and tells the computer where to find the Perl interpreter. The -w at the end of the first line is important: it stands for "warnings" and alerts the user to likely problems with the script. The next four lines start with a number sign, signalling a comment that is to be ignored. The next line, line 6, is important, as it raises the level of grammatical correctness for the script. We can think about it this way: when you write HTML code for a web page, many browsers are forgiving and try to figure out what you wanted to do if you make a mistake and don't place the code properly. Perl will do the same thing. However, we should be as careful as possible, so use strict; is a way to have Perl refuse sloppy constructs instead of guessing. In particular, it insists that variables be declared, which we have done with my on lines 7-8.

The script then opens the text file where we cut and pasted our e-mail. It opens the file where we want to append the information, in preparation for inserting into our little database application. Finally, it reads the e-mail one line at a time, gets rid of the first five lines, gathers the rest (which is the part we want) and appends it to the e-mailtotal.txt file.
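The test 6 .. eof deserves a closer look. The following is only a sketch of how that range test behaves, not part of the original script. It assumes Perl 5.8 or later so that a string can be read like a file, and the helper name lines_from is made up for illustration. When the left side of .. is a plain number such as 6, Perl quietly compares it with $. (the current line number); with a variable, the comparison has to be spelled out, as it is here.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of $text from line $first to the end,
# using the same range ("flip-flop") test as the script.
sub lines_from {
    my ($text, $first) = @_;
    my @kept;
    # Read the string as if it were a file (needs Perl 5.8 or later).
    open(my $in, '<', \$text) or die "Can't open in-memory file: $!\n";
    while (<$in>) {
        # False until the line number reaches $first, then true on every
        # line up to and including the last one.
        push @kept, $_ if ($. == $first) .. eof($in);
    }
    close $in;
    return @kept;
}

# A cut-down message: five header lines, a blank line, then the body.
my $sample = <<'END';
Date: Fri, 10 May 2002 14:40:04 -0700
From: joe@joeisp.com
To: tompoe@renonevada.net
Cc: sam@samisp.com, julie@julieisp.com
Subject: Re: Blue Coat Linux Fixer

Tom,
END

# Prints everything from line 6 onward, header lines gone.
print lines_from($sample, 6);
```

Changing the 6 in the last line to another number moves the cut-off point, which is all the one-line test in the script is doing.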

To use this script, you need to create a text file to hold the e-mail you cut and paste. When you do that, bring up the script file and replace my path to e-mail.txt with the path to your own file.

Next, you need to create a destination file that will receive the message body. When you do that, bring up the script file again and replace my path for e-mailtotal.txt with the path to your own file.
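If editing the script every time a path changes gets tiresome, one possible alternative is to pass the two paths on the command line. This is only a sketch, not how my script works: the helper name choose_paths is made up, and it falls back to the paths shown earlier when an argument is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: take the input and output paths from the argument
# list, falling back to the article's paths when one is missing.
sub choose_paths {
    my @args = @_;
    my $file    = defined $args[0] ? $args[0] : '../PerlStuff/e-mail.txt';
    my $outfile = defined $args[1] ? $args[1] : '../PerlStuff/e-mailtotal.txt';
    return ($file, $outfile);
}

# @ARGV holds whatever was typed after the script name.
my ($file, $outfile) = choose_paths(@ARGV);
print "Reading from $file, appending to $outfile\n";
```

With something like this in place, the script could be run as $> perl e-mailparse.pl mymail.txt saved.txt, where both file names are just examples.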

Finally, count the number of lines from the top of your e-mail.txt file down to, but not including, the first blank line. That number plus one goes in place of my 6 in the script. You'll find it in line 12 of the script.
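Counting lines by hand works, but Perl can also look for the first blank line itself. The sketch below is one way that might be done. It is not part of my script, the helper name body_after_first_blank is made up, and it assumes Perl 5.8 or later so that a string can be read like a file. Unlike the script, it also drops the blank line itself, and it treats a line holding only spaces as blank.

```perl
use strict;
use warnings;

# Hypothetical helper: return everything that follows the first blank
# (empty or whitespace-only) line of $text.
sub body_after_first_blank {
    my ($text) = @_;
    my @body;
    my $in_body = 0;
    # Read the string as if it were a file (needs Perl 5.8 or later).
    open(my $in, '<', \$text) or die "Can't open in-memory file: $!\n";
    while (my $line = <$in>) {
        if ($in_body) {
            push @body, $line;          # past the separator: keep the line
            next;
        }
        $in_body = 1 if $line =~ /^\s*$/;   # first blank line ends the header
    }
    close $in;
    return join '', @body;
}

my $sample = <<'END';
From: joe@joeisp.com
To: tompoe@renonevada.net
Subject: Re: Blue Coat Linux Fixer

Tom,

I am the - - - - rest of message. Signed, Joe
END

# Prints the message starting at "Tom,".
print body_after_first_blank($sample);
```

The trade-off is that any blank line inside the pasted header area would be taken as the separator, so counting by hand is still the safer choice if your mail reader pastes things differently.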

We haven't talked about installing whatever database you might have on your system, and we haven't talked about installing Perl on your system. If you need help with this, send me an e-mail describing what your system looks like, and we'll try to help you get set up to use this script.

Conclusion

If you have a database on your computer, receive e-mails that have useful information from time to time and want to centralize your desktop information management, this script is a good starting point.

Tom Poe is currently involved with Open Studios, a nonprofit organization dedicated to broadening the base of works in our precious public domain. For further information about Open Studios and how to contact Tom, visit www.studioforrecording.org
