Work the Shell - Simple Scripts to Sophisticated HTML Forms
Last month, we looked at how to convert an HTML form on a page into a shell script with command flags and variables that let you have access to all the features of the search box. We tapped into Yahoo Movies and are building a script that offers up the key capabilities on the search form at movies.yahoo.com/mv/advsearch.
The script we built ended up with this usage statement:
USAGE: findmovie -g genre -k keywords -nrst title
So, that gives you an idea of what we're trying to do. Last month, we stopped with a script that offered the capabilities above and could open a Web browser with the result of the search using the open command.
Now, let's start with a caveat: open is a Mac OS X command-line utility that launches a GUI app. Just about every other Linux/UNIX flavor has a similar feature, including those running the X Window System. In fact, with most of them, it's even easier. A typical Linux version of “open a Web browser with this URL loaded” might be as simple as:
firefox http://www.linuxjournal.com/ &
That's easily done, even in a shell script.
Actually, if you're going to end a script by invoking a specific command, the best way to do it is to “exec” the command, which basically replaces the script with the app you've specified, so it's not still running and doesn't even need to exit. So in that case, it might look like exec firefox "$url" as the last line of the script.
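Here's a minimal sketch of how the two ideas might be combined, so the same script ends by exec'ing whichever launcher suits the platform. The function names opener_for and launch are hypothetical, and the Linux branch assumes the xdg-open command from xdg-utils is installed, which isn't something the script above relies on:

```shell
#!/bin/sh
# Hypothetical helper: choose a URL-opening command from the OS name.
opener_for() {
  case "$1" in
    Darwin ) echo "open" ;;       # Mac OS X
    * )      echo "xdg-open" ;;   # assumption: xdg-utils is installed
  esac
}

# Hypothetical helper: replace the running script with the chosen opener.
# Nothing after the exec runs, because the shell process itself is replaced.
launch() {
  exec "$(opener_for "$(uname -s)")" "$1"
}

# As the last line of a script:  launch "$url"
```

Keeping the choice in its own function means it can be checked without actually launching a browser.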
This month, I want to go back and make our script do more interesting things. For now, an invocation like:
./findmovie.sh -g act evil
produces a command from the last few lines in the script:
echo "$baseurl${params}&p=$pattern"
exec open -a safari "$baseurl${params}&p=$pattern"
that ends up pushing out this:
http://movies.yahoo.com/mv/search?yr=all&syn_match=all&adv=y&type=feature&gen=act&p=evil
It's pretty sophisticated!
What if the user wants the option of dumping the data to the command line instead of launching a browser? We can address that by adding a -d dump command flag into the getopts block:
while getopts "dg:k:nrst" arg
do
  case "$arg" in
    d ) dump=1 ;;
    g ) params="${params:+$params&}gen=$OPTARG" ;;
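The excerpt above is only the top of the loop. To see the pieces working together, here's a self-contained sketch that handles just the -d and -g flags and prints what it collected. The function name parse_flags and its output format are hypothetical, and the other flags are left out rather than guessed at:

```shell
#!/bin/sh
# Hypothetical helper, not the article's full script: parse -d and -g only,
# then print what was collected. The body runs in a subshell, via ( ), so
# the variables it sets do not leak into the caller.
parse_flags() (
  dump=0
  params=""
  OPTIND=1
  while getopts "dg:" arg
  do
    case "$arg" in
      d ) dump=1 ;;
      # ${params:+$params&} expands to "$params&" only when params is
      # non-empty, so the first parameter gets no leading ampersand.
      g ) params="${params:+$params&}gen=$OPTARG" ;;
      * ) echo "USAGE: parse_flags [-d] [-g genre] title" >&2 ; exit 1 ;;
    esac
  done
  shift $((OPTIND - 1))
  echo "dump=$dump params=$params pattern=$*"
)

parse_flags -d -g act evil
# prints: dump=1 params=gen=act pattern=evil
```

Note that dump starts at 0 here, so a later numeric test on it is safe even when -d isn't given.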
To dump the data, we'll enlist the powerful curl command, as we've done in the past. The program has zillions of options, but as we're just interested in the raw output, we can ignore them all (fortunately) except for --silent, which hides status updates, leaving the conditional:
# dump is unset unless -d was given, so default it to 0 for the test
if [ "${dump:-0}" -eq 1 ] ; then
  exec /usr/bin/curl --silent "$baseurl${params}&p=$pattern"
else
  exec open -a safari "$baseurl${params}&p=$pattern"
fi
But, that generates a huge amount of data, including all the HTML needed to produce the page in question. Let's spend just a minute looking closely at that output and see if there's a way to trim things at least a bit.
It turns out that every movie title that's matched includes a link to the movie's information on the Yahoo Movies site. Those look like:
<a href="http://movies.yahoo.com/movie/1809697875/info">Resident Evil
So, that's easy to detect. Better, we can use a regular expression with grep and skip a lot of superfluous data too:
cmd | grep '/movie/.*info'
That comes close to having only the lines that match individual movies, but to take this one step further, let's remove the false matches for dvdinfo, because we're not interested in the links to DVD release info. That's a grep -v:
cmd | grep '/movie/.*info' | grep -v dvdinfo
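As an alternative to the two-step filter, a tighter pattern can do the job in one pass. This is only a sketch: the function name movie_links is hypothetical, and it assumes every movie ID is all digits and that the link ends right after "info", as in the sample link shown earlier:

```shell
#!/bin/sh
# Hypothetical helper: keep only lines linking to a movie's info page.
# Assumption: movie IDs are all digits and the href ends with /info".
# A link ending in /dvdinfo" fails the match, so no grep -v is needed.
movie_links() {
  grep -E '/movie/[0-9]+/info"'
}

# Usage:  cmd | movie_links
```

If Yahoo ever uses non-numeric IDs, this would miss them, which is why the looser two-grep version is the safer default.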
Now, let's have a quick peek at comedies that have the word “funny” in their titles:
./findmovie.sh -d -g com funny | grep '/movie/.*info' | grep -v dvdinfo | head -3
<td><a href="http://movies.yahoo.com/movie/1810041785/info"> <b>Funny</b> People (2009)</a><br>
<td><a href="http://movies.yahoo.com/movie/1809406735/info">What's So <b>Funny</b> About Me? (1997)</a><br>
<td><a href="http://movies.yahoo.com/movie/1808565885/info">That <b>Funny</b> Feeling (1965)</a><br>
Okay, so the first three films in that jumble of HTML are Funny People, What's So Funny About Me? and That Funny Feeling.
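Pulling those titles out of the HTML doesn't take much. Here's a rough sketch using sed; the function name strip_tags is hypothetical, and it assumes each match sits on a single line, as in the output above:

```shell
#!/bin/sh
# Hypothetical helper: reduce matched HTML lines to bare titles.
strip_tags() {
  # Delete anything that looks like <...>, then trim leading and
  # trailing spaces left behind by the removed tags.
  sed -e 's/<[^>]*>//g' -e 's/^ *//' -e 's/ *$//'
}

# Usage:
#   ./findmovie.sh -d -g com funny | grep '/movie/.*info' | \
#     grep -v dvdinfo | strip_tags
```

Fed the first line of the output above, it leaves just Funny People (2009). It doesn't decode HTML entities, so a title containing an encoded ampersand would still need cleaning up.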
From this point, you definitely can poke around and write some better filters to extract the specific information you want. The wrinkle? Like most other sites, Yahoo Movies chops the results into multiple pages, so what you'd really want to do is identify how many pages of results there are going to be and then grab the results from each, one by one. It's tedious, but doable.
Dave Taylor has been hacking shell scripts for over thirty years. Really. He's the author of the popular "Wicked Cool Shell Scripts" and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.