Work the Shell - Simple Scripts to Sophisticated HTML Forms
Let's look at a more interesting subset instead, by adding a -c flag that has the script output just a count of how many films match the criteria you've given the command.
To do that, we don't need to go page by page, but just identify and extract the value from the match count on the page. For the comedies with “funny” in the title, the line on the page looks like this: “< Prev | 1 - 20 of 37 | Next 17 >”.
What we need to do is crack the HTML and look at the source to the link to “next 17” and see if it's extractable (is that a word?):
./findmovie.sh -d -g com funny | grep -i "next 17" | head -1
<td align=right><font face=arial size="-2"><nobr>
↪< Prev | <b>1 - 20</b>
↪ of <b>37</b> | <span
↪class="yperlink"><a href="/mv/search?p=funny&yr=all
↪&gen=com\&syn_match=all&adv=y&type=feature
↪&n=17&b=21&h=s">Next 17</a> >
↪ </nobr></span></span></font></td></tr>
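For what it's worth, here's a rough sketch of how the total might be pulled out of a line like that one. It assumes the total always shows up as "of <b>N</b>", which is true of this sample but not something I've checked across other searches. The extract_count function name and the trimmed-down sample line are made up for illustration:

```shell
#!/bin/sh
# Sketch only: pull the total match count out of the pager line.
# Assumption: the total always appears as "of <b>N</b>" on that line.

extract_count()
{
  # sed -n plus the p flag prints only when the substitution succeeds,
  # so a line without the pattern produces no output at all
  sed -n 's/.*of <b>\([0-9][0-9]*\)<\/b>.*/\1/p' | head -1
}

# A shortened, hypothetical version of the line shown above
sample='<nobr>< Prev | <b>1 - 20</b> of <b>37</b> | <span class="yperlink"><a href="/mv/search?p=funny">Next 17</a> ></nobr>'

echo "$sample" | extract_count
```

Fed the sample line, this prints 37. Fed a page without that pattern, it prints nothing, which is something the calling script would need to test for.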
Well, that's ugly. You'd think Yahoo didn't want to make this easy or something! It turns out, though, that this is a pretty tricky task, because if there are no matches, the link doesn't show up, and instead you see “Sorry, no matches were found”. If there are fewer than 20 matches, you see “Next >”, but it's not a clickable link, so it's not going to be so easy!
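One possible way to tell those three situations apart, before worrying about the count itself, is sketched below. The classify_results name and its three labels are my own invention, and the test for a clickable link (a number followed by a closing anchor tag) is a guess based on the one sample line above, so treat it as a starting point rather than a solution:

```shell
#!/bin/sh
# Sketch only: decide which of the three kinds of results page we got.
# Reads the page source on stdin and prints none, multipage or singlepage.

classify_results()
{
  page=$(cat)

  if printf '%s\n' "$page" | grep -q "Sorry, no matches were found" ; then
    echo "none"                # no films matched at all
  elif printf '%s\n' "$page" | grep -q 'Next [0-9][0-9]*</a>' ; then
    echo "multipage"           # "Next N" is a real link, so more pages exist
  else
    echo "singlepage"          # assumed: everything fits on this one page
  fi
}
```

The idea would be to pipe the same page source used above into it, as in ./findmovie.sh -d -g com funny | classify_results, and then pick a counting strategy based on the answer.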
Given that I'm out of space, let's defer this topic until next month. Meanwhile, look at the source to various searches yourself and see if anything comes to mind. Otherwise, it'll be brute force!
Dave Taylor has been hacking shell scripts for a really long time, 30 years. He's the author of the popular Wicked Cool Shell Scripts and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.