Command-Line Cloud: rss2email

In my last article, I started a series called Command-Line Cloud. The intent of the series is to discuss how to use the cloud services we are faced with these days without resorting to a Web browser. I spend most of my time on the command line, so that's where I'd most like to interface with cloud services. My last article described how to use Google Calendar from the command line, and in this article, I talk about a more general cloud service—RSS feeds. If I had written this column a few months ago, it would have been more focused on replacing Google Reader itself, because that was the primary RSS aggregator I used, but Google preemptively killed off the service and left a lot of users, including myself, scrambling to find a replacement. Although a number of people were able to find some sort of Web-based replacement, I realized the main features I wanted (sorting stories by date and vi key bindings to view the next story) were absent in a lot of the existing Google Reader replacements. What's worse, a lot of people were using this as an opportunity to make a quick buck by selling access to RSS services (and of course, still capturing everyone's valuable Web-viewing habits).

I decided to take a completely different tack and convert my RSS feeds to e-mail in a special mailbox and use an interface I already was used to: e-mail on the command line using mutt. I decided to use the rss2email program, written by the great Aaron Swartz, to manage my feeds. This software pulls down RSS feeds and converts each story into its own e-mail message that it sends to you. This means you can use whatever e-mail program you want to read your feeds, but of course, because we are focusing on the command line here, I am going to talk about only mutt.

Installation and Configuration

The rss2email program already had Debian packages, so on my system, installing it was as easy as typing apt-get install rss2email. If for some reason it isn't packaged for your distribution, follow the steps on http://www.allthingsrss.com/rss2email/getting-started-with-rss2email to download and extract the rss2email tarball. This is Python software, so you will need Python 2.x on the system as well as some sort of local Sendmail-compatible program (Postfix or Exim work as well), or alternatively, you'll need to identify an outbound mail server you can use to send these e-mail messages.

Once rss2email is installed, you interface with it via the r2e command. To set up a new rss2email database containing your feeds, type:


$ r2e new youremail@yourdomain.net

Note that the e-mail address you use here is the address to which rss2email will send its messages. Once the database is set up, it's time to add feeds to it. You can do that with:


$ r2e add http://feed.someurl.com/rss

Or, if you happen to have an OPML file (such as you may have exported from Google Reader when you jumped ship), you can import that:


$ r2e opmlimport file.opml

At any point, you also can export all of your configured feeds from rss2email as an OPML file:


$ r2e opmlexport
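
If opmlexport writes the OPML to standard output, as I am assuming here, it is easy to keep a dated backup of your feed list. The following is only a minimal sketch and not part of rss2email: the function name backup_feeds and the default directory ~/rss-backups are names I made up for illustration.

```shell
#!/bin/sh
# Sketch: save the current feed list as a dated OPML file.
# backup_feeds and the default directory are hypothetical names.
# Assumes "r2e opmlexport" prints the OPML on standard output.
backup_feeds() {
    dir="${1:-$HOME/rss-backups}"
    mkdir -p "$dir" || return 1
    out="$dir/feeds-$(date +%Y-%m-%d).opml"
    # Write to a temporary file first so a failed export
    # never overwrites an earlier good backup.
    if r2e opmlexport > "$out.tmp"; then
        mv "$out.tmp" "$out"
        echo "$out"
    else
        rm -f "$out.tmp"
        return 1
    fi
}
```

Calling backup_feeds with no argument writes to the default directory and prints the path of the new file; pass a directory as the first argument to store the backup somewhere else.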

Once you have added some feeds, you will want to poll them for new stories. The first time you run r2e against these feeds, it pulls in every story currently in each feed, which probably includes some you already have seen. If you want to avoid that, make your first run:


$ r2e run --no-send

Otherwise, run:


$ r2e run
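
The two cases above can be wrapped in one small function, so a fresh database is seeded without mail and every later call sends normally. This is only a sketch under my own assumptions: poll_feeds and the stamp file ~/.rss2email/seeded are hypothetical names, not something rss2email provides.

```shell
#!/bin/sh
# Sketch: seed silently on the first call, send mail afterwards.
# poll_feeds and the stamp-file path are hypothetical.
poll_feeds() {
    stamp="${1:-$HOME/.rss2email/seeded}"
    if [ -e "$stamp" ]; then
        # Feeds were seeded earlier: mail anything new.
        r2e run
    else
        # First poll: record existing stories without mailing them,
        # and create the stamp only if that run succeeded.
        r2e run --no-send && : > "$stamp"
    fi
}
```

The stamp file is created only after a successful --no-send run, so a failed first attempt simply is retried the next time you call the function.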

The first run may take a while, because it reads all of your feeds and generates e-mail. By default, all the stories land in your INBOX. Because I control my own mail server, I created a special e-mail address for rss2email to use and then set up a procmail rule that files all e-mail messages sent to that address in a special rss mailbox.

Of course, rss2email updates your feeds only when you run the command, so you probably will want to run this within cron so it updates automatically. Just run crontab -e as your regular user and add:


* *     * * *    r2e run 2>/dev/null

This will run r2e every minute and send any error output (such as when feeds are temporarily down) to /dev/null, so cron doesn't e-mail you about every failure.

For the most part, rss2email works as is, but in my case, I wanted to change two extra settings. To do this, just open up ~/.rss2email/config.py in a text editor and add the following settings:


HTML_MAIL = 1
DATE_HEADER = 1

The first setting tells rss2email to send the e-mail as HTML mail, and the second dates the message based on the date of the news story, not the date it picked it up. Although you might be surprised that I opt for HTML e-mail in my text-based e-mail client, mutt automatically converts HTML e-mail to text for me. Plus, when I tell mutt to open the e-mail in an external viewer, on pure shell sessions, it means I can view the full article in a text-based Web browser, such as w3m, very easily. And, when I run mutt on a machine with a Web browser, it can open the full article there instead.
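
One caveat about the cron entry earlier in this section: with a one-minute schedule, a slow poll still could be running when the next one starts. Whether that causes trouble for rss2email is something I have not verified, but if you want to rule it out, a small lock wrapper is enough. The sketch below uses a lock directory, because mkdir either creates it or fails in one step; run_locked and the lock path are hypothetical names of my own.

```shell
#!/bin/sh
# Sketch: run a command only if no earlier instance is still running.
# run_locked and the lock path are hypothetical names.
run_locked() {
    lock="$1"
    shift
    # mkdir is atomic: it fails if the lock directory already exists.
    if ! mkdir "$lock" 2>/dev/null; then
        return 0    # a previous run is still active, so skip this one
    fi
    "$@"
    status=$?
    rmdir "$lock"
    return "$status"
}
```

A script called from cron then could use run_locked "$HOME/.rss2email/lock" r2e run instead of calling r2e directly. If the process is killed in the middle of a run, the stale lock directory has to be removed by hand.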

Managing Feeds

Managing feeds in rss2email is relatively straightforward. First type r2e list to see a numbered list of all of your feeds. You will use the number associated with a feed to manage it. For instance, to get rid of a feed numbered 12, you would type:


$ r2e delete 12

You also can pause a feed if you want to ignore it temporarily with r2e pause number and then resume it later with r2e unpause number.
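
If you want to pause or unpause several feeds at once, a short loop saves some typing. The helper below is a sketch only: feeds_each is a hypothetical name, not an rss2email command, and it does nothing more than call r2e pause or r2e unpause once per number.

```shell
#!/bin/sh
# Sketch: apply "r2e pause" or "r2e unpause" to several feed numbers.
# feeds_each is a hypothetical helper, not an rss2email command.
feeds_each() {
    action="$1"
    shift
    case "$action" in
        pause|unpause) ;;
        *) echo "usage: feeds_each pause|unpause NUMBER..." >&2
           return 2 ;;
    esac
    for n in "$@"; do
        case "$n" in
            # Reject empty strings and anything containing a non-digit.
            ''|*[!0-9]*) echo "skipping '$n': not a feed number" >&2 ;;
            *) r2e "$action" "$n" ;;
        esac
    done
}
```

For example, feeds_each pause 3 7 12 pauses those three feeds. I left delete out on purpose: I have not verified whether the remaining feeds are renumbered after a deletion, so it seems safer to check r2e list again between deletes.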

Mutt as the Front End

What makes this setup work so well for me is that I can use my regular mail program, mutt, to view my feeds. This means I quickly can scan my feeds and skip uninteresting or duplicate stories. In my case, I found I needed to tweak how mutt displays the index for this mailbox so I more easily could see who the feed was from. By default, mutt sets the index_format to:


set index_format="%4C %Z %{%b %d} %-15.15L (%4l) %s"

So, I set up a folder-hook so that when I'm in the rss mailbox, I get a slightly tweaked index:


folder-hook rss 'set index_format="%4C %Z %{%b %d} %-20.20f  %s"'

The main changes I made were to remove the column that displayed the size of the message in lines (%4l) and to replace the %-15.15L column with %-20.20f, which displays the complete FROM: line of the e-mail and gives that column a little extra room. The result is a much more readable feed list, as you can see in Figure 1.

Figure 1. RSS Feeds inside Mutt
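
One thing to keep in mind with folder-hook is that mutt does not restore a setting when you leave the mailbox, so the tweaked index would carry over into the next folder you open. A common way around this is a catch-all hook that resets the default, listed before the specific one, because matching hooks run in the order they appear in the configuration. The fragment below is a sketch that reuses the two format strings from this article; adjust the first one if your own default differs.

```
# Default for every mailbox; keep this before more specific hooks.
folder-hook . 'set index_format="%4C %Z %{%b %d} %-15.15L (%4l) %s"'
# Override for the rss mailbox only.
folder-hook rss 'set index_format="%4C %Z %{%b %d} %-20.20f  %s"'
```

Note that the pattern rss is a regular expression, so it matches any mailbox whose path contains "rss".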

Although it's true that I miss out on images within my news feeds, in many cases, that just means I miss out on ads or clipart. When I run mutt from a machine with a Web browser and view the HTML, it opens the e-mail inside the Web browser, and from there, I can view images if I want. For those feeds that post only a summary of the article, I can follow the hyperlinks to the main article and read it in full.

Viewing my news this way may not appear to be as full-featured as using a Web-based client, but it's a lot more flexible. Plus, in my case, I often have hundreds of stories to pore over, so viewing just the text versions of stories helps me focus on what's most important: the data.

______________________

Kyle Rankin is a director of engineering operations in the San Francisco Bay Area, the author of a number of books including DevOps Troubleshooting and The Official Ubuntu Server Book, and is a columnist for Linux Journal.
