Work the Shell - Converting HTML Forms into Complex Shell Variables

Web browser? We don't need no stinkin' Web browser for submitting HTML forms, that's what the shell is for.
Building the Full URL

The code in its current state has a hiccup waiting to bite us, though: what if the user specifies two words in the keywords field or, worse, in the title field (remember, the last word or words are the title pattern, the core search for the Yahoo Movies system)?

The answer is that we need to convert spaces into a symbol that's acceptable in a URL. That's easily done, fortunately:

params="$(echo "$params" | sed 's/ /+/g')"

It's not the most elegant solution, but it's certainly functional!
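To see the substitution in isolation, here's a minimal sketch that wraps it in a function. The name encode_spaces is made up for illustration and isn't part of the script. It handles only spaces; any other characters that are special in a URL, such as & or ?, would need proper percent-encoding, which this doesn't attempt.

```shell
#!/bin/sh
# encode_spaces: hypothetical helper (not in the original script) that
# replaces every space in its first argument with "+".
encode_spaces() {
  printf '%s\n' "$1" | sed 's/ /+/g'
}

encode_spaces "gen=Science Fiction"   # prints gen=Science+Fiction
```

In Bash specifically, the expansion ${params// /+} should give the same result without calling sed, at the cost of portability to a plain POSIX sh.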

The bigger problem here is that Yahoo requires that certain parameters actually be present to do a search. Choose a genre on the Web interface and click Search, and you'll see that's not sufficient for it to proceed.

As a result, our base URL for searches is going to be a bit more complicated:


baseurl="http://movies.yahoo.com/mv/search"
baseurl="${baseurl}?yr=all&syn_match=all&"

Try that, and you'll find it doesn't work. Why? Because there are some hidden parameters that Yahoo has slipped into the form that are required to send to the search program. Without them, it just stops.

In fact, here's the baseurl value we need:


baseurl="http://movies.yahoo.com/mv/search"
baseurl="${baseurl}?yr=all&syn_match=all&adv=y&type=feature&"

Now, how do we put this all together? It's not so easy, because we still need to grab whatever's on the end of the invocation (the title pattern), then mask the spaces:

shift $(( $OPTIND - 1 ))

Hang on, let me explain this line before we go further. OPTIND holds the index of the next positional parameter that getopts would examine, so once option processing is done, it points at the first parameter that wasn't absorbed by getopts. Positional parameters are numbered from 1, which means exactly OPTIND - 1 of them have been consumed as options. Shifting by that amount discards them, so that $* contains only what's left, the title pattern:


params="$(echo "$params" | sed 's/ /+/g')"

pattern="$(echo "$*" | sed 's/ /+/g')"
echo "URL: ${baseurl}${params}&p=${pattern}"
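Here's a rough sketch of how getopts, the shift and the space masking fit together, packed into a function so it can be tried on its own. The function name build_query is invented for this example, and only the -r flag (which adds revs=1) is taken from the script; the real findmovie.sh handles more options than this.

```shell
#!/bin/sh
# build_query: illustrative only. Parses -r, then treats everything left
# on the command line as the title pattern.
build_query() {
  OPTIND=1                  # reset so the function can be called repeatedly
  params=""
  while getopts "r" opt; do
    case "$opt" in
      r) params="${params:+$params&}revs=1" ;;   # -r: only films with reviews
      *) return 1 ;;                             # unknown option
    esac
  done
  shift $(( OPTIND - 1 ))   # discard the arguments getopts consumed
  pattern="$(echo "$*" | sed 's/ /+/g')"         # mask spaces in the title
  echo "${params}&p=${pattern}"
}

build_query -r true love    # prints revs=1&p=true+love
```

After the shift, $* holds just "true love", which is why the pattern comes out as true+love rather than including the -r.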

Now, finally, armed with that, we can search for films that contain the word “love” and have reviews:


$ findmovie.sh -r love

URL: ...BASEURL...revs=1&p=love

Type that in, and you'll find it works fine, showing 80 films where “love” appears in the title and Yahoo Movies is aware of on-line reviews of the films.

Most Linuxes and other flavors of UNIX have a way to launch a Web browser from the command line and have it open a specified URL. On Mac OS X, that's the open command, so that's what we'll use here:


echo "${baseurl}${params}&p=${pattern}"
open -a safari "${baseurl}${params}&p=${pattern}"
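The open command is specific to Mac OS X. As a hedged sketch of how the launch step might be made a little more portable, the helper below looks for xdg-open (common on Linux desktops) and falls back to open. Both function names are made up for illustration, and which launcher is actually installed on a given system is an assumption you'd want to check.

```shell
#!/bin/sh
# pick_opener: hypothetical helper that prints the name of the first
# URL launcher found on the PATH, or fails if there isn't one.
pick_opener() {
  if command -v xdg-open >/dev/null 2>&1; then
    echo xdg-open
  elif command -v open >/dev/null 2>&1; then
    echo open
  else
    return 1
  fi
}

# open_url: hands the URL to the launcher, or prints it if none exists.
open_url() {
  opener="$(pick_opener)" || {
    echo "No launcher found; visit: $1" >&2
    return 1
  }
  "$opener" "$1"
}
```

In the script, the last line would then become something like open_url "${baseurl}${params}&p=${pattern}", which opens the URL in the default browser rather than in Safari specifically.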

There are other things we can do now that we've converted the Yahoo advanced search form into a shell script, but we'll leave those for next month!

Dave Taylor has been hacking shell scripts for a really long time, 30 years. He's the author of the popular Wicked Cool Shell Scripts and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.


Comments


A couple of points

ciotog

Nice work, but I have a couple of quibbles:

1. vi vs Emacs detracts from the article - it's largely irrelevant

2. You use params="${params:+$params&}..." to keep off the leading '&' if params hasn't been defined yet, but then you stick params to a string (baseurl) that has a trailing '&', so it's unnecessary. Why not just use params=$params&... and leave the '&' off the end of baseurl? Or stick $params to the end, like so:

params="gen=$OPTARG&$params"
The url will have an extra '&' at the end but that's valid.
I recognize that the point is to teach these kinds of things, but to add them when not necessary is generally bad style.

3. Aside from that, in the paragraph where you explain the ${:+} notation you use a space character to illustrate it but it's probably not the best choice. Better to use a more visible character.
