Command-Line Cloud: rss2email

In my last article, I started a series called Command-Line Cloud. The intent of the series is to discuss how to use the cloud services we are faced with these days without resorting to a Web browser. I spend most of my time on the command line, so that's where I'd most like to interface with cloud services. That article described how to use Google Calendar from the command line, and in this one, I talk about a more general cloud service: RSS feeds. If I had written this column a few months ago, it would have been more focused on replacing Google Reader itself, because that was the primary RSS aggregator I used, but Google preemptively killed off the service and left a lot of users, including me, scrambling to find a replacement. Although a number of people were able to find some sort of Web-based replacement, I realized the main features I wanted (sorting stories by date and vi key bindings to view the next story) were absent from many of the existing Google Reader replacements. What's worse, a lot of people were using this as an opportunity to make a quick buck by selling access to RSS services (and of course, still capturing everyone's valuable Web-viewing habits).

I decided to take a completely different tack and convert my RSS feeds to e-mail in a special mailbox and use an interface I already was used to: e-mail on the command line using mutt. I decided to use the rss2email program, written by the great Aaron Swartz, to manage my feeds. This software pulls down RSS feeds and converts each story into its own e-mail message that it sends to you. This means you can use whatever e-mail program you want to read your feeds, but of course, because we are focusing on the command line here, I am going to talk about only mutt.

Installation and Configuration

The rss2email program already had Debian packages, so on my system, installing it was as easy as typing apt-get install rss2email. If for some reason it isn't packaged for your distribution, follow the steps on http://www.allthingsrss.com/rss2email/getting-started-with-rss2email to download and extract the rss2email tarball. This is Python software, so you will need Python 2.x on the system as well as some sort of local Sendmail-compatible program (Postfix or Exim work as well), or alternatively, you'll need to identify an outbound mail server you can use to send these e-mail messages.

Once rss2email is installed, you interface with it via the r2e command. To set up a new rss2email database containing your feeds, type:


$ r2e new youremail@yourdomain.net

Note that the address you use here is where rss2email will send its e-mail messages. Once the database is set up, it's time to add feeds to it. You can do that with:


$ r2e add http://feed.someurl.com/rss

Or, if you happen to have an OPML file (such as you may have exported from Google Reader when you jumped ship), you can import that:


$ r2e opmlimport file.opml

At any point, you also can export all of your configured feeds from rss2email as an OPML file:


$ r2e opmlexport
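If you want to keep that export around as a backup, a small shell helper can do it. This is only a sketch: it assumes r2e opmlexport writes the OPML document to standard output, and the function name backup_feeds and the ~/rss-backups directory are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of rss2email): save the current feed list
# to a dated OPML file. Assumes "r2e opmlexport" prints OPML to stdout.
backup_feeds() {
    # Destination directory; ~/rss-backups is an arbitrary default.
    dest_dir=${1:-"$HOME/rss-backups"}
    mkdir -p "$dest_dir" || return 1
    out="$dest_dir/feeds-$(date +%Y-%m-%d).opml"
    # Export to a temporary file first so a failed export
    # does not leave a truncated backup behind.
    if r2e opmlexport > "$out.tmp"; then
        mv "$out.tmp" "$out"
        printf '%s\n' "$out"   # report where the backup was written
    else
        rm -f "$out.tmp"
        return 1
    fi
}
```

After sourcing the file, running backup_feeds with no arguments writes to the default directory, and backup_feeds /some/dir writes there instead.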

Once you have added some feeds, you will want to poll them for new stories. The first time you run r2e against these feeds, it will pull in every story currently in each feed, which probably includes some you already have seen. To avoid that flood of e-mail, make your first run with the --no-send option:


$ r2e run --no-send

Otherwise, run:


$ r2e run

The first run may take a while, because it reads all of your feeds and generates an e-mail message for each story. By default, those messages land in your INBOX along with everything else. Because I control my own mail server, I created a special e-mail address for rss2email to use and then set up a procmail rule that files all messages sent to that address in a special rss mailbox.
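A procmail rule of the kind described above might look like the following sketch. It is not my exact rule: the address rss@yourdomain.net is a hypothetical placeholder for whatever address you gave to r2e new, and the recipe assumes MAILDIR already is set earlier in your ~/.procmailrc and that rss is an mbox file.

```
# ~/.procmailrc (sketch)
# File anything addressed to the rss2email address into the rss mailbox.
# The trailing colon on ":0:" asks procmail to use a lockfile,
# which mbox delivery needs.
:0:
* ^TO_rss@yourdomain\.net
rss
```

If your mailboxes are in Maildir format instead, the usual variation is to drop the lockfile colon and end the mailbox name with a slash (rss/).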

Of course, rss2email updates your feeds only when you run the command, so you probably will want to run this within cron so it updates automatically. Just run crontab -e as your regular user and add:


* *     * * *    r2e run 2>/dev/null

This runs r2e every minute and sends any error output (such as when feeds are temporarily down) to /dev/null, so cron doesn't e-mail you each time a run complains. For the most part, rss2email works as is, but in my case, I wanted to change two extra settings. To do this, just open up ~/.rss2email/config.py in a text editor and add the following settings:


HTML_MAIL = 1
DATE_HEADER = 1

The first setting tells rss2email to send the e-mail as HTML mail, and the second dates the message based on the date of the news story, not the date it picked it up. Although you might be surprised that I opt for HTML e-mail in my text-based e-mail client, mutt automatically converts HTML e-mail to text for me. Plus, when I tell mutt to open the e-mail in an external viewer, on pure shell sessions, it means I can view the full article in a text-based Web browser, such as w3m, very easily. And, when I run mutt on a machine with a Web browser, it can open the full article there instead.
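For mutt to convert HTML to text and hand it to w3m as described above, it needs a mailcap entry and an auto_view line. The following is a minimal sketch of one common arrangement, not necessarily my exact configuration; it assumes w3m is installed and that mutt reads ~/.mailcap.

```
# ~/.mailcap (sketch)
# Interactive viewing: open the HTML part in w3m in the terminal.
text/html; w3m -I %{charset} -T text/html %s; needsterminal
# Inline display: dump the HTML as plain text (used by auto_view).
text/html; w3m -I %{charset} -T text/html -dump %s; copiousoutput

# ~/.muttrc (sketch)
# Show text/html parts inline using the copiousoutput entry above.
auto_view text/html
```

Mutt picks the copiousoutput entry for inline display and the first matching entry for interactive viewing, so on a machine with a graphical browser you could put an entry for that browser first.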

Managing Feeds

Managing feeds in rss2email is relatively straightforward. First type r2e list to see a numbered list of all of your feeds. You will use the number associated with a feed to manage it. For instance, to get rid of a feed numbered 12, you would type:


$ r2e delete 12

You also can pause a feed you want to ignore temporarily with r2e pause number and then resume it later with r2e unpause number.

Mutt as the Front End

What makes this setup work so well for me is that I can use my regular mail program, mutt, to view my feeds. This means I quickly can scan my feeds and skip uninteresting or duplicate stories. In my case, I found I needed to tweak how mutt displays the index for this mailbox so I more easily could see which feed each story came from. By default, mutt sets the index_format to:


set index_format="%4C %Z %{%b %d} %-15.15L (%4l) %s"

So, I set up a folder-hook so that when I'm in the rss mailbox, I get a slightly tweaked index:


folder-hook rss 'set index_format="%4C %Z %{%b %d} %-20.20f  %s"'

The main changes I made were to remove the column that displayed the message's line count (%4l) and to replace the sender column with the complete From: line of the e-mail, giving myself a little extra room in that column (%-20.20f). The result is a much more readable feed list, as you can see in Figure 1.

Figure 1. RSS Feeds inside Mutt
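One thing to keep in mind with the folder-hook above is that mutt does not undo a hook's settings when you leave the mailbox, so the tweaked format would follow you into other folders. A common way to handle that, sketched here as an assumption about how you might lay out your own ~/.muttrc, is to add a catch-all hook that restores the default before the rss-specific one:

```
# ~/.muttrc (sketch)
# Catch-all hook: restore the default index format in every folder.
folder-hook . 'set index_format="%4C %Z %{%b %d} %-15.15L (%4l) %s"'
# Listed afterward, so it overrides the catch-all in the rss mailbox.
# %4C = message number, %Z = status flags, %{%b %d} = date,
# %-20.20f = From: line padded or cut to 20 characters, %s = subject.
folder-hook rss 'set index_format="%4C %Z %{%b %d} %-20.20f  %s"'
```

Mutt runs every matching hook in the order it appears in the file, and the pattern rss is a regular expression, so it applies to any folder whose name contains rss.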

Although it's true that I miss out on images within my news feeds, in many cases, that just means I miss out on ads or clipart. When I run mutt from a machine with a Web browser and view the HTML, it opens the e-mail inside the Web browser, and from there, I can view images if I want. For those feeds that post only a summary of the article, I can follow the hyperlinks to the main article and read it in full.

Viewing my news this way may not appear to be as full-featured as using a Web-based client, but it's a lot more flexible. Plus, in my case, I often have hundreds of stories to pore through, so viewing just the text versions of stories helps me focus on what's most important: the data.

______________________

Kyle Rankin is a director of engineering operations in the San Francisco Bay Area, the author of a number of books including DevOps Troubleshooting and The Official Ubuntu Server Book, and is a columnist for Linux Journal.
