Internet Radio to Podcast with Shell Tools
Once I had a script that did all I wanted it to do, I sent it in to LJ along with a first version of this article. LJ Editor in Chief Don Marti pointed out that I was missing one key component: my program was generating an RSS version 1.0 feed, but all the podcast-aware programs look for a version 2.0 feed—specifically for an XML tag named enclosure. Naturally, I assumed it would be a trivial change to my software, merely switching versions and adding the enclosure tag. I soon learned, however, that the XML::RSS Perl module can write RSS 2.0 but cannot read it. Several sleepless nights ensued, until I determined that Perl tools were available that could read RSS 2.0 but not write it. So, it was time to add some glue.
I started by adding two Perl modules to my system—you can install them (as root) with:
perl -e "install XML::RSS,XML::Simple" -MCPAN
You probably will be okay with answering any questions it asks with the default. If you haven't used the Comprehensive Perl Archive Network (CPAN) yet, it asks quite a few setup questions, such as choosing several mirror sites that are close to you. Otherwise, it simply asks about a dependency or two; say yes.
After the two modules and their required dependencies are installed, you need to create a new XML file with information about the show you want to capture. The great thing about XML is you can use any text editor to make a file that is readable by both humans and machines, making it easy to create, view, test and modify RSS feed files. Let's start with this skeleton, containing a basic title section:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
 <channel>
  <title>Hour of the Wolf</title>
  <link>http://www.hourwolf.com</link>
  <description>Science Fiction Talk Radio with Jim Freund</description>
  <generator>WBAI Stream Capture using Linux shell tools</generator>
 </channel>
</rss>
If you never have played with XML before, this is a good time to get your feet wet. A quick look at the file shows data items surrounded by HTML-like tags, where each <something> tag has a corresponding </something> to close the something section. This becomes more confusing later, though, when we add the alternate syntax, which looks like <tagname a="A" b="B" />.
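To make the two tag styles concrete, here is a sketch that writes the skeleton to a file from the shell, with one sample item added to show the self-closing form. The path /tmp/wolfrss-demo.xml and the item's title, URL and length are placeholders invented for illustration. The url, length and type attributes are the ones RSS 2.0 defines for enclosure.

```shell
#!/bin/bash
# Demo path only (an assumption); the article's script uses /home/phil/wolfrss.xml
XML_DEMO=/tmp/wolfrss-demo.xml

# Quoting 'EOF' keeps the shell from expanding anything inside the heredoc
cat > "$XML_DEMO" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
 <channel>
  <title>Hour of the Wolf</title>
  <link>http://www.hourwolf.com</link>
  <description>Science Fiction Talk Radio with Jim Freund</description>
  <generator>WBAI Stream Capture using Linux shell tools</generator>
  <item>
   <title>Sample episode (placeholder)</title>
   <enclosure url="file:///tmp/sample.mp3" length="12345" type="audio/mpeg" />
  </item>
 </channel>
</rss>
EOF

# Show the self-closing tag: it carries its data in attributes and has no </enclosure>
grep -n 'enclosure' "$XML_DEMO"
```

The real feed file starts out without any item section; the sample item is there only to show what an entry looks like once one has been added.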
Once I had gathered all the tools I needed, I added a few droplets of shell magic to arrive at this simple script:
#!/bin/bash
# catchthewolf - capture "Hour of the Wolf"
# For capturing the stream
DATE=`date +%F` # Save the date as YYYY-MM-DD
YEAR=`date +%Y` # Save just the year as YYYY
FILE=/home/phil/wolf.$DATE.mp3 # Where to save it
STREAM=http://www.2600.com/wbai/wbai.m3u
DURATION=2.1h # enough to catch the show, plus a bit
#DURATION=30s # a quick run, just for testing
# For the RSS syndication
XML="/home/phil/wolfrss.xml" # file for the RSS feed
ITEMS=15 # Maximum items in RSS list
XTITLE="Hour of the Wolf - $DATE Broadcast"
XDATE=`date -R` # Date in RFC 822 format for RSS
i=\$i;o=\$o;m=\$m # replace "$" in the perl script
# For the id3v2 Tags
AUTHOR="Jim Freund"
ALBUM="WBAI Stream Rip"
TITLE="Hour of the Wolf - $DATE"
# Use mplayer to capture the stream
# at $STREAM to the file $FILE
/usr/local/bin/mplayer -really-quiet -cache 128 \
-dumpfile $FILE -dumpaudio -playlist $STREAM &
# the & turns the capture into a background job
sleep $DURATION # wait for the show to be over
kill $! # end the stream capture
# Tag the resulting captured .mp3
id3v2 -a "$AUTHOR" -A "$ALBUM" \
-t "$TITLE" -y $YEAR -T 1/1 -g 255 \
--TCON "Radio" $FILE
# Add a new entry in the rss file,
# keep the file to a max of $ITEMS entries,
# and change the file's date to right now.
/usr/bin/perl -e "use XML::RSS; use XML::Simple; \
$i=XMLin('$XML');$o=$i;bless $o,XML::RSS; \
$m=$i->{channel}{item};if((ref $m)ne ARRAY) \
{$o->add_item(%$m);} else \
{foreach $m (@{$m}) {$o->add_item(%$m);}} \
$o->channel(lastBuildDate=>'$XDATE', \
pubDate=>'$XDATE'); \
$o->add_item(title=>'$XTITLE', \
link=>$o->{'channel'}{'link'}, \
pubDate=>'$XDATE', \
enclosure=>{url=>'file://$FILE', \
length=>(stat('$FILE'))[7], \
type=>'audio/mpeg'}, mode=>'insert'); \
pop(@{$o->{'items'}}) \
while (@{$o->{'items'}}>$ITEMS); \
$o->{encoding}='UTF-8'; $o->save('$XML');"
echo "Caught the wolf."
This doesn't look too simple, though. Let's dissect this script a bit to see how it all works. Notice the back-ticks (`) around the date commands. They take whatever is enclosed in the `` marks and run it as a command and then replace the entire `whatevercommand` with the output from that command. If I had needed the date only once, I could have written:
FILE=wolf.`/bin/date +%F`.mp3
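Bash also accepts $(...) for the same job, and that form nests more cleanly than back-ticks. The short sketch below is an aside rather than part of the original script. It builds the same filename both ways, and it assumes date is on the PATH instead of calling /bin/date.

```shell
#!/bin/bash
# Both forms run the command and splice its output into the string
OLD_STYLE=wolf.`date +%F`.mp3
NEW_STYLE=wolf.$(date +%F).mp3

echo "$OLD_STYLE"
echo "$NEW_STYLE"

# $(...) nests without extra escaping; nested back-ticks would need \` inside
NESTED=$(basename "$(pwd)")
echo "$NESTED"
```

Either style works in the capture script; the choice is mostly a matter of readability.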
Comments
This is helpful - but doesn't work... possibly out of date?
Does anyone follow this thread anymore? The most recent update being 2007, it's fast become a long tail article...
I've tried this and tinkered with it and the XML::RSS modules kick up pattern match errors for me and I've not been able to resolve them. It would be awesome if any of you fine folks who got this to work in the past could help me out with it...
The errors are:
Use of uninitialized value in pattern match (m//) at /usr/local/share/perl/5.10.1/XML/RSS.pm line 535.
Unregistered entity: Can't access modules field in object of class XML::RSS at /usr/local/share/perl/5.10.1/XML/RSS/Private/Output/Base.pm line 1131
Any thoughts? I pounded away at this for a loong time and wasn't able to get it working as of yet.
Thanks in advance,
Eric
something is broken...
Hi, thanks for the article; it has helped me automate some recordings I need on a daily basis. A couple of things I noticed: I added use strict; use warnings to the script you cited and came up with some odd errors. I have worked through most of them but cannot figure out how to "fix" this:
Unregistered entity: Can't access modules field in object of class XML::RSS at /usr/lib/perl5/site_perl/5.8.5/XML/RSS/Private/Output/Base.pm line 926
any ideas? It seems to be centered around this:
bless\$out,XML::RSS;
based on this error:
Bareword "XML::RSS" not allowed while "strict subs" in use at -e line 1.
Bareword "ARRAY" not allowed while "strict subs" in use at -e line 1.
What am I missing here?
FIXED!!!
Hey,
Many thanks to the maintainers of XML::RSS, specifically Shlomi Fish, for this diff, which fixes the script so that it works with the current XML::RSS.
podcast.diff
--- podcast.sh.old 2007-10-11 20:06:41.077613900 +0200
+++ podcast.sh 2007-10-11 20:11:26.685655662 +0200
@@ -108,31 +108,19 @@
+use strict; \
+use warnings; \
use XML::RSS; \
-my \$in = XMLin('$XML'); \
-my \$out = \$in; \
-bless \$out, XML::RSS; \
-my \$item = \$in->{channel}{item}; \
-
-if ( ( ref \$item ) ne ARRAY ) { \
-
- \$out->add_item(%\$item); \
-} \
-else { \
- foreach \$item ( @{\$item} ) { \
- \$out->add_item(%\$item); \
- } \
-} \
+my \$out = XML::RSS->new(version => '2.0'); \
+\$out->parsefile('$XML'); \
\$out->channel( lastBuildDate => '$XDATE', pubDate => '$XDATE' ); \
... and the ugly version
The ugly compressed version of the perl code is the following:
/usr/bin/perl -e "use XML::RSS; use XML::Simple; \
$i=XMLin('$XML');$o=$i;bless $o,XML::RSS; \
$m=$i->{channel}{item};if((ref $m)ne ARRAY) \
{$o->add_item(%$m);} else \
{foreach $m (@{$m}) {$o->add_item(%$m);}} \
$o->channel(lastBuildDate=>'$XDATE', \
pubDate=>'$XDATE'); \
$o->add_item(title=>'$XTITLE', \
link=>$o->{'channel'}{'link'}, \
pubDate=>'$XDATE', \
enclosure=>{url=>'file://$FILE', \
length=>(stat('$FILE'))[7], \
type=>'audio/mpeg'}, mode=>'insert'); \
pop(@{$o->{'items'}}) \
while (@{$o->{'items'}}>$ITEMS); \
$o->{encoding}='UTF-8'; $o->save('$XML');"
Damn...
I copied the wrong code block; sorry! It should be:
/usr/bin/perl -e "use XML::RSS; use XML::Simple; \
$o=XML::RSS->new(version=>'2.0'); \
$o->parsefile('$XML'); \
$o->channel(lastBuildDate=>'$XDATE', \
pubDate=>'$XDATE'); \
$o->add_item(title=>'$XTITLE', \
link=>$o->{'channel'}{'link'}, \
pubDate=>'$XDATE', \
enclosure=>{url=>'file://$FILE', \
length=>(stat('$FILE'))[7], \
type=>'audio/mpeg'}, mode=>'insert'); \
pop(@{$o->{'items'}}) \
while (@{$o->{'items'}}>$ITEMS); \
$o->{encoding}='UTF-8'; $o->save('$XML');"
What does this achieve?
I'm not enough of a geek to understand this. (I'm a trainee geek.) Nor do I yet use Linux. (I may switch over when Microsoft drops XP support. I don't like the way XP tries to organise my life for me, and I'm told Vista is worse.)
With the above code you record a predetermined section of an Internet radio station (a programme), yes? And then you produce RSS code which creates a podcast feed of that program, yes? And then, on another computer, you set up a program (iTunes or something similar) to download that podcast, yes?
Do I understand you right? On computer 1 (always on the Internet), you record the programme and produce the podcast, and then on computer 2 (and potentially many other computers) (occasionally on the internet) you subscribe to that podcast. It seems a long-winded way of going about it, but I can see some benefits.
For those of us without access to always-online computers, is there any way we can set up such podcasts? Can we, somewhere, enter the URL of a live radio station (say http://www.cbc.ca/listen/streams/r1_toronto_32.html) and some times (say 19:00-20:00 on what I think is Eastern Time), and be given a resultant podcast feed to subscribe to?
(I live in Ireland, and don't have much experience with Canadian and US time zones. If you can understand the time expressions on the Vinyl Tap page, please explain them to me.)
Re: What does this achieve?
You don't need two computers - the machine which you're saving the radio shows on makes an XML file so that podcast-aware programs can pick up the new radio shows as they're recorded, and automatically put them on to your music player at recharge/sync time. Maybe this will be an excuse to install Kubuntu on a spare hard drive partition and get your feet wet with Linux!
There's no service that I'm aware of that will make a podcast for you, basically for copyright reasons - perhaps someone in a less copyright-frenzied country will do that, and make a ton of money.
For Canadian times, look at: http://www.timetemperature.com/tzca/canada_time_zone.shtml
They seem to be assuming Eastern, AT is Atlantic, NT is Newfoundland.
For a possible way to do a similar thing on your XP machine, look at:
http://streamripper.sourceforge.net/
(You'll have to configure automatic dialing to the internet on your XP system for that to work.)
RDF, not RTF!
You said, "RSS stands for RTF Site Summary." Actually, it stands for "RDF Site Summary" -- the original name from the My Netscape Network. RDF is a framework for making statements about resources (like Web sites); see http://www.w3.org/RDF/ .
RTF is the "Rich Text Format", the default word processing exchange format used in WordPad and other word processors. It has nothing to do with RSS.
Re: RDF, not RTF!
You are, of course, correct. Sadly, I've been caught perpetuating one of those errors like "to gild the lily" - so many people do it that it's become almost right. So, indeed, s/RTF/RDF/G9000
use fifo as intermediate wav file
You may save a lot of intermediate disk space by using a fifo buffer for the wav file. I use this for my particular version of a podcast generator:
#!/bin/bash
# call with $1=url, $2=mp3 file
# to stop recording, kill the mplayer process (killall mplayer)
# create unique name based on md5 hash of stream url and output mp3 file
output=/tmp/`echo $1 $2 | md5sum | awk '{print $1}'`
# make the fifo buffer
mkfifo "$output"
# start mplayer, dump the video (if any) to /dev/null
mplayer "$1" -ao pcm:file="$output" -vo null -vc dummy &
# and start transcoding from the fifo -> mp3 file
lame -S "$output" "$2"
rm "$output"
Kudos and extending the functionality
I have hacked away on this to a point that it works pretty well for me. I am still having a few minor issues with the RSS document creation when I use the same script to record a few different programs throughout the week. It hasn't risen to the level of actually digging into it again, though. All in all, it works great.
I have been looking for a way to post-process the files to add bookmarks to the files, because one of the shows I record has 10 minute music breaks while they cut to local programming. I am looking for a way to extend this script by overlaying a bookmark. For example, 27 minutes into the show I want to insert a bookmark so that when they cut to music, I hit the forward button and advance to the next bookmark, which would be just before the cut back to the show. This assumes that the show is consistent with cuts, but that doesn't seem to be a problem.
Thanks for the excellent article.
I've brushed up the script
I've brushed up the script a bit so that you only need one
and everything is pretty much passed from the crontab.
See it here
Cheers,
Rick
Link -
The link is not working. I would like to see your script.
Thanks
Dead link
The link you refer to above appears to be dead. Would love to see the revised code.
Re: I've brushed up the script
Nice job! Much cleaner than my hack-and-patch approach... Thanks!
Firefox Live Bookmarks and the enclosure tag
Firefox live bookmarks do not appear to support the enclosure tag. Do you have a workaround for this?
I've been thinking about putting together something like this for a while. Thanks for a great article!
Re: Firefox Live Bookmarks and the enclosure tag
Sure - it breaks the "link" feature on some news aggregators, but makes the live bookmarks work again. Sigh...
Change this bit of perl:
link=>$o->{'channel'}{'link'}, \
To this:
link=>'file://$FILE', \
With this change (assuming you have some useful plugin like Plugger or MPlayer Plugin) clicking on the live bookmark starts playing the captured file. This is really how it should have been in the article - it's much more useful than having the link to the homepage, plus that link is in the title section anyway.
Re: Firefox Live Bookmarks and the enclosure tag
Thanks, that did the trick!
Mplayer?
Why not use streamripper instead? As long as you're writing your content to a file, streamripper works very well and requires fewer command switches.
Mplayer?
Mplayer, ecasound, sox . . . there are many command-line audio tools. I prefer ecasound myself.
Re: Mplayer?
(This comment got deleted because I'd managed to cross-up my user names... Here's my second try at it.)
Streamripper was one of the programs I looked at when I first tried to do time-based capture. I thought it would be the do-everything package that I wanted, but I found that it
1) Has almost no Linux documentation available from the website
2) Has a limited number of stream types that it can access
3) Has timed duration, but not timed start
4) Can't transcode (i.e. take a RealPlayer stream and save it as MP3)
5) Has ID3V2 file tagging, but it wasn't clear if you could tag with data that didn't come from the stream itself.
That being said, it seems to be a pretty capable package, and if it saves the shows you want in the format you want, great - you're absolutely right - it saves some messing around in the script, especially if you can get the MP3 tags to do what you want. But its main purpose seems to be to capture _songs_ from internet radio streams (like shoutcast), not whole _programs_.
So, it was sufficiently not what I was looking for that I opted to take the program which I was already using as a listener tool (MPlayer) and make it be what I wanted (a VCR for Internet Radio) by using a shell script. (Plus, that seemed to be a cool enough thing to do to write an article about.)
Thanks for the article
Thanks for the article. It should be a good reference, even if I don't need to do exactly what you're doing.