Work the Shell - Our Twitter Autoresponder Goes Live!
I can't believe it, but this is my 52nd column. That means I've been writing for Linux Journal for almost four and a half years. Hopefully, you've been reading my column just as long and enjoying our monthly forays into the world of shell script programming. On the tech side, quite a bit has changed in the last four and a half years. But on the Linux/shell side, it's surprisingly similar to how it was when I wrote my first column.
Last month, we continued to build a Twitter autoresponder script that could read and parse Twitter messages (aka tweets). We got it working and wrapped up the column by realizing we actually needed to capture the unique tweet ID in addition to name and message, so we could ensure that the script kept track of what it had or hadn't answered.
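One simple way to keep track of answered tweets is a plain text file with one tweet ID per line. The following is a minimal sketch of that idea, not the code from the actual script: the names seen, already_answered and mark_answered are hypothetical, and a real script would point seen at a persistent file rather than a fresh temp file so the list survives between runs.

```shell
#!/bin/sh
# Hypothetical sketch: remember which tweet IDs have been answered.
# The file and function names are assumptions for illustration only.
seen=$(mktemp)   # a real script would use a persistent path instead

already_answered() {
  # -q: quiet, -x: match the whole line, -F: treat the ID as a fixed string
  grep -qxF "$1" "$seen"
}

mark_answered() {
  printf '%s\n' "$1" >> "$seen"
}

# The third ID repeats the first, so it should be skipped.
for id in 7395281011 7395281516 7395281011; do
  if already_answered "$id"; then
    echo "skipping $id (already answered)"
  else
    echo "answering $id"
    mark_answered "$id"
  fi
done
```

Matching the whole line with -x matters here, because otherwise a short ID could match as a substring of a longer one.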
The script keeps track of tweets by ID and knows both how to parse the incoming Twitter stream and how to remember if it has seen a one-word tweet request or not. Run it once, and I see:
Twitter user @jlight asked for the time
@jlight the time on our server is LOCALTIME
The next time I run it, just a few minutes later, I see:
Twitter user @truss asked for the time
@truss the time on our server is LOCALTIME
Twitter user @tlady asked what our address in tweet 7395272164
@tlady we're located at 123 University Avenue, Anywhere USA
It looks good, but there's a problem in the script, because one of the output diagnostic lines is:
Twitter user @ asked for the time
@ the time on our server is LOCALTIME
Somehow it's not identifying the screen name for this particular user. After a quick analysis of the actual Twitter.com data, it appears that the first tweet comes out of the parser section without an associated screen name.
To debug this, first get a copy of the script to follow along (the script from last month is at ftp.linuxjournal.com/pub/lj/listings/issue191/10695.tgz). In the while loop, I'll add this line to aid in debugging:
echo "got name = $name, id = $id, and msg = $msg"
Now when I run the script, here's what I see:
got name = , id = 7395437583, and msg = VERY cool
got name = spin, id = 7395333666, and msg = time
got name = astrong, id = 7395281516, and msg = time
got name = truss, id = 7395281011, and msg = time
Clearly something's wrong, but what?
One reason I like to use temp files in scripts rather than having incredibly long and complicated pipes is for debugging this sort of problem.
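To illustrate the temp-file approach, here is a rough sketch that writes each stage of a small pipeline to its own file so that any stage can be inspected on its own. The sample XML is made up for the example, the variable names stage1 and stage2 are hypothetical, and the single sed command is a simplified stand-in rather than the expressions used in the real script.

```shell
#!/bin/sh
# Sketch: save each pipeline stage to a temp file for later inspection.
# The XML below is invented sample data, not real Twitter output.
stage1=$(mktemp)   # lines that survive the grep filter
stage2=$(mktemp)   # the same lines with the tags stripped

cat <<'EOF' | grep -E '(<screen_name>|<text>|<id>)' > "$stage1"
<status>
  <id>7395681235</id>
  <text>African or European?</text>
  <user>
    <id>14712874</id>
    <screen_name>jeffrey</screen_name>
  </user>
</status>
EOF

# Simplified tag stripping: drop leading spaces plus the opening tag,
# then drop the closing tag.
sed 's/ *<[a-z_]*>//;s/<\/[a-z_]*>//' "$stage1" > "$stage2"

echo "--- after grep:"; cat "$stage1"
echo "--- after sed:";  cat "$stage2"
```

With one long pipe, a surprise in the final output says little about which command caused it. With the intermediate files on disk, each stage can be checked separately.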
Recall that the main parsing work is done by curl feeding its output to grep, then a sequence of sed invocations and finally a quick call to awk:
$curl -u "davetaylor:$pw" $inurl | \
  grep -E '(<screen_name>|<text>|<id>)' | \
  sed 's/@DaveTaylor //;s/ <text>//;s/<\/text>//' | \
  sed 's/ *<screen_name>//;s/<\/screen_name>//' | \
  sed 's/ *<id>//;s/<\/id>//' | \
  awk '{ if (NR % 4 == 0) {
           printf ("name=%s; ", $0)
         }
         else if (NR % 4 == 1) {
           printf ("id=%s; ",$0)
         }
         else if (NR % 4 == 2) {
           print "msg=\"" $0 "\""
         }
       }' > $temp
Adding the command more $temp immediately after this means we can eyeball the data stream and see what's different about the first and second lines (as the second is parsed properly). Here's what I see:
id=7395681235; msg="African or European?"
name=jeffrey; id=7395672894; msg="North Hall IStage"
Note that there's no name= field on the first message. My theory? There's a logic error in the awk statement that's causing it to skip the first entry somehow.
To test that assumption, I'll temporarily replace the entire awk script with another that outputs the record number (mod 4) followed by the data line:
awk '{ print (NR % 4), $0 }' > $temp
The result is exactly what we were expecting, which is a bit confusing:
1 7395934047
2 we are at the MGM as well!
3 14171725
0 sideline
1 7395681235
2 African or European?
3 14712874
0 jeffrey
Here, Twitter user sideline has sent “we are at the MGM as well!”, and jeffrey sent the message “African or European?”.
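As an experiment, the eight data lines from that diagnostic output can be replayed through the original awk logic without touching Twitter at all. This is only a sketch: the input is typed in by hand from the listing above, and the variable name result is hypothetical.

```shell
#!/bin/sh
# Sketch: feed the eight sample lines through the original awk logic
# to see which output line each name= field lands on.
result=$(printf '%s\n' \
  '7395934047' 'we are at the MGM as well!' '14171725' 'sideline' \
  '7395681235' 'African or European?' '14712874' 'jeffrey' |
awk '{ if (NR % 4 == 0) {
         printf ("name=%s; ", $0)      # fourth line of each group
       }
       else if (NR % 4 == 1) {
         printf ("id=%s; ", $0)        # first line of each group
       }
       else if (NR % 4 == 2) {
         print "msg=\"" $0 "\""        # second line, ends the output line
       }
     }')
echo "$result"
```

The first output line starts with id= and has no name= field, which matches the symptom seen earlier. It also appears that each name= is printed after the msg= of its own tweet, so it ends up at the front of the next tweet's line, which suggests the order in which the fields arrive is worth a closer look.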
Dave Taylor is the author of the popular Work the Shell column in Linux Journal.