Internet Radio to Podcast with Shell Tools

Combine several stand-alone programs with some shell “glue” and record your favorite Internet radio show while you sleep.

or even:


/usr/local/bin/mplayer -dumpaudio \
  -dumpfile "wolf.`/bin/date +%F`.mp3" \

But because I wanted the date for the filename, the tag and the RSS feed, I stored it in the $DATE shell variable. That makes it much easier to change the script around too. I now have several scripts that capture streams, and the only things that have to change are the variable assignments at the top.

Back-ticks are one of the shell's tools that allow us to merge simple commands into powerful assemblies. You can play with this more by using the echo command. Try, for example:

echo "wolf.`date +%F`.mp3"

to see what the filename would be in that last call to MPlayer.

We use the +%F formatting option to date, because the default date string is full of spaces. Also, my USA locale's date string has / characters in it—not the best thing to try to put inside a filename. Furthermore, the yyyy-mm-dd format means the files sort nicely by date when you list the directory. The RSS feed wants its date in RFC 822 format, so we wind up calling /bin/date three times in all.
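
As a minimal sketch of that idea, the date strings can be captured once at the top of a script and reused. The variable names DATE, XDATE and FILE are illustrative here, and the -R option is assumed to be available, as it is in GNU date:

```sh
#!/bin/sh
# Capture each date format once, at the top of the script.
DATE=$(date +%F)   # yyyy-mm-dd: no spaces or slashes, and sorts by date
XDATE=$(date -R)   # RFC 822 style date for the RSS feed (GNU date option)
FILE="wolf.$DATE.mp3"

echo "$FILE"
echo "$XDATE"
```

The $( ) form does the same job as back-ticks and is easier to nest. The first line printed is the filename MPlayer would write to, and the second is the string the feed would carry.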

Notice also that I'm giving the exact path to some of the executable commands. I do this so that when the script runs as a timed task, it won't have my personal shell's path settings. If you're unsure where a file lives, find it with which:

[phil@asylumhouse]$ which date
/bin/date

You're safe to leave off /bin and /usr/bin, but any other path should be specified explicitly, as should paths to any executable that exists as different versions in multiple locations.
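
One way to catch a wrong path before a timed run fails silently is to test each hard-coded path at the start of the script. This is only a sketch: check_tools is a hypothetical helper, not part of the article's script, and the mplayer path in the comment is an assumption about where your copy lives:

```sh
#!/bin/sh
# Return non-zero if any of the given absolute paths is not an executable file.
check_tools() {
  status=0
  for tool in "$@"; do
    if [ ! -x "$tool" ]; then
      echo "missing or not executable: $tool" >&2
      status=1
    fi
  done
  return $status
}

# Only /bin/sh is checked here so the sketch runs anywhere. A capture script
# would list its own tools, for example /usr/local/bin/mplayer (path assumed).
if check_tools /bin/sh; then
  echo "all tools found"
fi
```

Because the function reports every missing path rather than stopping at the first, a single failed run tells you everything that needs fixing.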

The call to id3v2 tags the file as track 1 of 1, with proper author, album, title and year entries. The predefined genre number of 255 means Other. The --TCON entry fills in Radio in place of one of the predefined genres on any software that understands version 2 MP3 tags.
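
A tagging call along those lines might look like the following sketch. The artist value is a placeholder, the tag_file wrapper is hypothetical, and the function only prints the command when id3v2 or the MP3 file is not present, so it can be tried safely:

```sh
#!/bin/sh
ARTIST="Example Host"        # placeholder: substitute the show's real host
ALBUM="Hour of the Wolf"
DATE=$(date +%F)
YEAR=$(date +%Y)
TITLE="$ALBUM $DATE"

tag_file() {
  file=$1
  run=                       # empty: execute the command for real
  # Fall back to a dry run (print the command) if id3v2 or the file is missing.
  if ! command -v id3v2 >/dev/null 2>&1 || [ ! -f "$file" ]; then
    run=echo
  fi
  # Track 1 of 1, genre 255 (Other), plus a free-text TCON frame reading Radio.
  $run id3v2 -a "$ARTIST" -A "$ALBUM" -t "$TITLE" -y "$YEAR" \
      -T 1/1 -g 255 --TCON "Radio" "$file"
}

tag_file "wolf.$DATE.mp3"
```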

Lastly, the one-line Perl script at the end is a compressed version of this:


#!/usr/bin/perl

use XML::RSS; use XML::Simple;

$in=XMLin('/home/phil/wolfrss.xml');
$out=$in; # copy the parsed RSS file's tree
bless $out, 'XML::RSS'; # make the copy an XML::RSS

# blessing doesn't copy the items.  Drat!
$item = $in->{channel}{item};
if ((ref $item) ne 'ARRAY') { # only one item in feed
  $out->add_item(%$item);
} else { # a list of items - foreach the list
  foreach $item (@{$item}) {
    $out->add_item(%$item);
  }
}

# Encoding doesn't transfer either.
$out->{encoding}='UTF-8';

# Date the file so client software knows it changed
chomp($date = `date -R`); # strip the trailing newline
$out->channel( lastBuildDate=>$date,
    pubDate=>$date);

# Add our newest captured file
$file = "/home/phil/wolfcaught.mp3";
$out->add_item( title => "Hour of the Wolf",
    link => $out->{'channel'}{'link'},
    pubDate => $date,
    enclosure => { url=>"file://$file",
      length => (stat($file))[7],
      type => 'audio/mpeg'
    },
    mode => 'insert');

# Don't have more than 15 items in the podcast
while (@{$out->{'items'}} > 15) {
	pop(@{$out->{'items'}});
}

# Write out the finished file
$out->save('/home/phil/wolfrss.xml');

Here I use XML::Simple to read and parse the existing RSS file and XML::RSS to add our new item and write the modified version. The bless function tells Perl that $out, the hash reference returned by XML::Simple, now should be treated as an XML::RSS object. The only reason this does anything useful is that the two modules use nearly identical variable names internally, derived from the tag names of the incoming RSS file.

This bless carries over almost everything in the RSS file's header, but it doesn't bring over the item or encoding tags. So I then copied over each item in a foreach loop, added today's date as the build and publication date and added the just-captured file as a new item. This item has a Web page link that is copied from the header, today's date as publication date and the all-important enclosure tag. The enclosure has a URL, in this case a file:// reference, because we are doing everything on the local filesystem. It also has a file length and a MIME type, audio/mpeg.
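
To make the three enclosure attributes concrete, here is a rough shell sketch that prints the element for a given file. The enclosure_tag function is hypothetical and is not how the Perl script builds the feed. It assumes an absolute path with no characters that would need XML escaping:

```sh
#!/bin/sh
# Print an RSS <enclosure> element for the file at absolute path $1.
enclosure_tag() {
  # wc -c gives the length in bytes; tr strips the padding some systems add
  size=$(wc -c < "$1" | tr -d ' ')
  printf '<enclosure url="file://%s" length="%s" type="audio/mpeg"/>\n' \
      "$1" "$size"
}
```

Calling enclosure_tag /home/phil/wolfcaught.mp3 would print one line with the file:// URL, the size that stat reports in the Perl version and the audio/mpeg type.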

Shell variables replace all the quoted strings, and the super-sneaky shell variables $i, $o and $m get replaced by \$i, \$o and \$m. In other words, everywhere you see $i in the Perl script, the Perl interpreter actually gets the Perl variable name $i. Without that bit of substitution, the shell would replace each $i with a null string or, worse yet, whatever the shell variable i happened to hold before the script was executed. The reference to the actual MP3 file is a URL, file:///home/phil/wolf.2005-03-19.mp3, not merely a filename. When we enter the RSS feed file into Firefox or a feed aggregator program, we refer to it using URL notation as well, file:///home/phil/wolfrss.xml.
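
The effect is easy to see with plain strings. The sketch below reads the description above as meaning that the shell variable o holds the literal text $o. That reading is an assumption, and the paths are the ones used in this article:

```sh
#!/bin/sh
XML=/home/phil/wolfrss.xml
FILE=/home/phil/wolf.2005-03-19.mp3

unset o
broken="$o->save('$XML');"   # no shell variable o: Perl would get ->save(...)

o='$o'                       # shell variable o holds the literal text $o
fixed="$o->save('$XML');"    # Perl now gets its own variable name back

echo "$broken"
echo "$fixed"
echo "file://$FILE"          # absolute path after file:// gives three slashes
```

The second line printed is what the Perl interpreter should receive: the Perl variable name intact, with the shell's $XML already filled in.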

______________________

Comments

This is helpful - but doesn't work... possibly out of date?

ejoso

Does anyone follow this thread anymore? The most recent update being 2007, it's fast become a long tail article...

I've tried this and tinkered with it and the XML::RSS modules kick up pattern match errors for me and I've not been able to resolve them. It would be awesome if any of you fine folks who got this to work in the past could help me out with it...

The errors are:
Use of uninitialized value in pattern match (m//) at /usr/local/share/perl/5.10.1/XML/RSS.pm line 535.
Unregistered entity: Can't access modules field in object of class XML::RSS at /usr/local/share/perl/5.10.1/XML/RSS/Private/Output/Base.pm line 1131

Any thoughts? I pounded away at this for a loong time and wasn't able to get it working as of yet.

Thanks in advance,
Eric

something is broken...

Donovan Worrell

Hi, thanks for the article on this it has helped me automate some recordings I need on a daily basis. A couple of things I noticed...I included the use strict;use warnings on the script you cited and have come up with some odd errors. I have worked through most of them but cannot figure out how to "fix" this:

Unregistered entity: Can't access modules field in object of class XML::RSS at /usr/lib/perl5/site_perl/5.8.5/XML/RSS/Private/Output/Base.pm line 926

any ideas? It seems to be centered around this:


bless\$out,XML::RSS;

based on this error:

Bareword "XML::RSS" not allowed while "strict subs" in use at -e line 1.
Bareword "ARRAY" not allowed while "strict subs" in use at -e line 1.

What am I missing here?

FIXED!!!

Donovan Worrell

Hey,

Much thanks to the maintainers of XML::RSS specifically Shlomi Fish for this diff that fixes this script to work with XML::RSS current.

podcast.diff
--- podcast.sh.old 2007-10-11 20:06:41.077613900 +0200
+++ podcast.sh 2007-10-11 20:11:26.685655662 +0200
@@ -108,31 +108,19 @@

+use strict; \
+use warnings; \
use XML::RSS; \

-my \$in = XMLin('$XML'); \
-my \$out = \$in; \
-bless \$out, XML::RSS; \
-my \$item = \$in->{channel}{item}; \
-
-if ( ( ref \$item ) ne ARRAY ) { \
-
- \$out->add_item(%\$item); \
-} \
-else { \
- foreach \$item ( @{\$item} ) { \
- \$out->add_item(%\$item); \
- } \
-} \
+my \$out = XML::RSS->new(version => '2.0'); \
+\$out->parsefile('$XML'); \

\$out->channel( lastBuildDate => '$XDATE', pubDate => '$XDATE' ); \

... and the ugly version

Anonymous

The ugly compressed version of the perl code is the following:

/usr/bin/perl -e "use XML::RSS; use XML::Simple; \
    $i=XMLin('$XML');$o=$i;bless $o,XML::RSS; \
    $m=$i->{channel}{item};if((ref $m)ne ARRAY) \
    {$o->add_item(%$m);} else \
    {foreach $m (@{$m}) {$o->add_item(%$m);}} \
    $o->channel(lastBuildDate=>'$XDATE', \
    pubDate=>'$XDATE'); \
    $o->add_item(title=>'$XTITLE', \
    link=>$o->{'channel'}{'link'}, \
    pubDate=>'$XDATE', \
    enclosure=>{url=>'file://$FILE', \
    length=>(stat('$FILE'))[7], \
    type=>'audio/mpeg'}, mode=>'insert'); \
    pop(@{$o->{'items'}}) \
    while (@{$o->{'items'}}>$ITEMS); \
    $o->{encoding}='UTF-8'; $o->save('$XML');"

Damn...

Anonymous

I copied the wrong code block; sorry! It should be:

/usr/bin/perl -e "use XML::RSS; use XML::Simple; \
    $o=XML::RSS->new(version=>'2.0');
    $o->parsefile('$XML');
    $o->channel(lastBuildDate=>'$XDATE', \
    pubDate=>'$XDATE'); \
    $o->add_item(title=>'$XTITLE', \
    link=>$o->{'channel'}{'link'}, \
    pubDate=>'$XDATE', \
    enclosure=>{url=>'file://$FILE', \
    length=>(stat('$FILE'))[7], \
    type=>'audio/mpeg'}, mode=>'insert'); \
    pop(@{$o->{'items'}}) \
    while (@{$o->{'items'}}>$ITEMS); \
    $o->{encoding}='UTF-8'; $o->save('$XML');"

What does this achieve?

TRiG

I'm not enough of a geek to understand this. (I'm a trainee geek.) Nor do I yet use Linux. (I may switch over when Microsoft drops XP support. I don't like the way XP tries to organise my life for me, and I'm told Vista is worse.)

With the above code you record a predetermined section of an Internet radio station (a programme), yes? And then you produce RSS code which creates a podcast feed of that program, yes? And then, on another computer, you set up a program (iTunes or something similar) to download that podcast, yes?

Do I understand you right? On computer 1 (always on the Internet), you record the programme and produce the podcast, and then on computer 2 (and potentially many other computers) (occasionally on the internet) you subscribe to that podcast. It seems a long-winded way of going about it, but I can see some benefits.

For those of us without access to always-online computers, is there any way we can set up such podcasts? Can we, somewhere, enter the URL of a live radio station (say http://www.cbc.ca/listen/streams/r1_toronto_32.html) and some times (say 19:00-20:00 on what I think is Eastern Time), and be given a resultant podcast feed to subscribe to?

(I live in Ireland, and don't have much experience with Canadian and US time zones. If you can understand the time expressions on the Vinyl Tap page, please explain them to me.)

Re: What does this achieve?

Phil Salkie

You don't need two computers - the machine which you're saving the radio shows on makes an XML file so that podcast-aware programs can pick up the new radio shows as they're recorded, and automatically put them on to your music player at recharge/sync time. Maybe this will be an excuse to install Kubuntu on a spare hard drive partition and get your feet wet with Linux!

There's no service that I'm aware of that will make a podcast for you, basically for copyright reasons - perhaps someone in a less copyright-frenzied country will do that, and make a ton of money.

For Canadian times, look at: http://www.timetemperature.com/tzca/canada_time_zone.shtml
They seem to be assuming Eastern, AT is Atlantic, NT is Newfoundland.

For a possible way to do a similar thing on your XP machine, look at:
http://streamripper.sourceforge.net/
(You'll have to configure automatic dialing to the internet on your XP system for that to work.)

RDF, not RTF!

Evan Prodromou

You said, "RSS stands for RTF Site Summary." Actually, it stands for "RDF Site Summary" -- the original name from the My Netscape Network. RDF is a framework for making statements about resources (like Web sites); see http://www.w3.org/RDF/ .

RTF is the "Rich Text Format", the default word processing exchange format used in WordPad and other word processors. It has nothing to do with RSS.

Re: RDF, not RTF!

Phil Salkie

You are, of course, correct. Sadly, I've been caught perpetuating one of those errors like "to gild the lily" - so many people do it that it's become almost right. So, indeed, s/RTF/RDF/G9000

use fifo as intermediate wav file

Henk Postma

You may save a lot of intermediate disk space by using a fifo buffer for the wav file. I use this for my particular version of a podcast generator:

#!/bin/bash
# call with $1=url, $2=mp3 file
# to stop recording, kill the mplayer process (killall mplayer)
# create unique name based on md5 hash of stream url and output mp3 file
output=/tmp/`echo $1 $2 | md5sum | awk '{print $1}'`
# make the fifo buffer
mkfifo "$output"
# start mplayer, dump the video (if any) to /dev/null
mplayer "$1" -ao pcm:file="$output" -vo null -vc dummy &
# and start transcoding from the fifo -> mp3 file
lame -S "$output" "$2"
rm "$output"

Kudos and extending the functionality

woodside

I have hacked away on this to a point that it works pretty well for me. I am still having a few minor issues with the RSS document creation when I use the same script to record a few different programs throughout the week. It hasn't risen to the level of actually digging into it again, though. All in all, it works great.

I have been looking for a way to post-process the files to add bookmarks to the files, because one of the shows I record has 10 minute music breaks while they cut to local programming. I am looking for a way to extend this script by overlaying a bookmark. For example, 27 minutes into the show I want to insert a bookmark so that when they cut to music, I hit the forward button and advance to the next bookmark, which would be just before the cut back to the show. This assumes that the show is consistent with cuts, but that doesn't seem to be a problem.

Thanks for the excellent article.

I've brushed up the script

Rick

I've brushed up the script a bit so that you only need one
and everything is pretty much passed from the crontab.

See it here

Cheers,

Rick

Link -

Wong Seuol

The link is not working. I would like to see your script.

Thanks

Dead link

Kinney

The link you refer to above appears to be dead. Would love to see the revised code.

Re: I've brushed up the script

Phil Salkie

Nice job! Much cleaner than my hack-and-patch approach... Thanks!

Firefox Live Bookmarks and the enclosure tag

eric.john.miller

Firefox live bookmarks do not appear to support the enclosure tag. Do you have a workaround for this?

I've been thinking about putting together something like this for a while. Thanks for a great article!

Re: Firefox Live Bookmarks and the enclosure tag

Phil Salkie

Sure - it breaks the "link" feature on some news aggregators, but makes the live bookmarks work again. Sigh...

Change this bit of perl:
link=>$o->{'channel'}{'link'}, \

To this:
link=>'file://$FILE', \

With this change (assuming you have some useful plugin like Plugger or MPlayer Plugin) clicking on the live bookmark starts playing the captured file. This is really how it should have been in the article - it's much more useful than having the link to the homepage, plus that link is in the title section anyway.

Re: Firefox Live Bookmarks and the enclosure tag

eric.john.miller

Thanks, that did the trick!

Mplayer?

tpurl

Why not use streamripper instead? As long as you're writing your content to a file, streamripper works very well and requires fewer command switches.

Mplayer?

Sean Edwards

Mplayer, ecasound, sox . . . there are many command-line audio tools. I prefer ecasound myself.

Re: Mplayer?

Phil Salkie

(This comment got deleted because I'd managed to cross-up my user names... Here's my second try at it.)

Streamripper was one of the programs I looked at when I first tried to do time-based capture. I thought it would be the do-everything package that I wanted, but I found that it

1) Has almost no Linux documentation available from the website
2) Has a limited number of stream types that it can access
3) Has timed duration, but not timed start
4) Can't transcode (i.e. take a RealPlayer stream and save it as MP3)
5) Has ID3V2 file tagging, but it wasn't clear if you could tag with data that didn't come from the stream itself.

That being said, it seems to be a pretty capable package, and if it saves the shows you want in the format you want, great - you're absolutely right - it saves some messing around in the script, especially if you can get the MP3 tags to do what you want. But its main purpose seems to be to capture _songs_ from internet radio streams (like shoutcast), not whole _programs_.

So, it was sufficiently not what I was looking for that I opted to take the program which I was already using as a listener tool (MPlayer) and make it be what I wanted (a VCR for Internet Radio) by using a shell script. (Plus, that seemed to be a cool enough thing to do to write an article about.)

Thanks for the article

Joel

Thanks for the article. It should be a good reference, even if I don't need to do exactly what you're doing.
