Work the Shell - Generating Turn-by-Turn Driving Directions
I'm happy to report that this month, I'm answering a reader's question about how to script something. Dunno what's up with the rest of you readers, but apparently writing to me with your weird and challenging shell-scripting puzzles isn't making the short list right now. Reader Paul M. asks:
Is there a way to screen-scrape Google Maps direction results? I'm after the text (turn left at Ho-ho-kus Blvd), not the maps. When I look at a saved results page, all I can see is CSS and JavaScript code. If I do a manual copy and paste of the directions, however, the turn-by-turn directions appear. Got any suggestions on how to grab turn-by-turn driving directions automatically, Dave?
Ah, those tricky programmers over at Google Maps make this pretty darn difficult! Poke around in the source of the directions pages that maps.google.com generates, and it's clear that they're using method=post or some other mechanism to keep the starting and ending points out of the URL itself, along with some very fancy coding to make the Web pages highly interactive. So to heck with it!
After much digging around and looking at how the different mapping sites work, I settled on Expedia.com as the best place to get driving directions so that we'll be able to specify start and stop points via URL and also understand the output. To get started, check out Expedia's interactive driving directions in your Web browser at www.expedia.com/Directions.
On Expedia, enter a starting and ending address for directions, and you'll find that it's all stored in a scary-complex URL like this: www.expedia.com/pub/agent.dll?qscr=mrdr&rtyp=0&unit=0&lats1=38.89872&lons1=-77.036379&alts1=5&strt1=1600+Pennsylvania+Ave+NW&city1=Washington&stnm1=DC&zipc1=20006&regn1=0&labl1=1600+Pennsylvania+Ave+NW%2C%0AWashington%2C+DC+20006&lats2=28.393142483519902&lons2=-81.57198620078931&alts2=5&strt2=N+World+Dr&city2=Orlando&stnm2=FL&zipc2=32830&regn2=0&labl2=World+Dr%2C%0AOrlando%2C+FL+32830&. (Eagle-eyed readers will notice that I'm offering the Obama family driving directions to Disney World.)
You can strip some of the superfluous information out of the URL and create a simple command-line call to get the map and directions:
start="strt1=1600+Pennsylvania+Ave+NW&city1=Washington&stnm1=DC"
dest="strt2=N+World+Dr&city2=Orlando&stnm2=FL&zipc2=32830"
curl --silent "http://www.expedia.com/pub/agent.dll?$start&$dest&qscr=mrdr&rtyp=0&unit=0"
You can see that Expedia wants each address broken out into street address, city, state and zip code (though if it can figure out the location, it appears you can skip the zip code, as shown in start above).
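If you'd rather not type those plus signs by hand, here's a minimal sketch of a helper that builds one leg of the query string from plain address fields. The function name build_leg is my own invention, not anything Expedia defines, and it only converts spaces to plus signs, so it assumes the address contains no other characters that need URL encoding:

```shell
#!/bin/sh
# Hypothetical helper: build one leg of the Expedia query string.
# $1 = leg number (1 = start, 2 = destination)
# $2 = street address, $3 = city, $4 = state, $5 = zip code (optional)
# Assumption: spaces are the only characters that need encoding.
build_leg() {
  leg=$1
  street=$(printf '%s' "$2" | tr ' ' '+')
  city=$(printf '%s' "$3" | tr ' ' '+')
  out="strt${leg}=${street}&city${leg}=${city}&stnm${leg}=$4"
  # Append the zip code only when one was supplied.
  if [ -n "$5" ]; then
    out="${out}&zipc${leg}=$5"
  fi
  printf '%s\n' "$out"
}
```

With that in place, start=$(build_leg 1 "1600 Pennsylvania Ave NW" Washington DC) produces the same string as the hand-built start variable above.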
Now that we have that, let's use sed to extract just the table of results, without the other superfluous information. Reading through the page source by hand shows that the directions all sit in a table that starts with this HTML line:
<TABLE BORDER=1 BORDERCOLOR=#E4E4E4 CELLSPACING=0 CELLPADDING=4>
Not surprisingly, the line we seek that denotes the end of the table is </TABLE>. Here's the code that lets you slice things as desired:
sed -n '/BORDERCOLOR=#E4E4E4/,/<\/TABLE>/p'
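To see what that range expression does without hitting the network, you can try it on a tiny stand-in page. The sample HTML below is made up for illustration; only the opening TABLE line matches what the article describes:

```shell
#!/bin/sh
# Made-up sample page: only the <TABLE ...> line mirrors the real source.
sample='<HTML><BODY>
<P>Navigation junk</P>
<TABLE BORDER=1 BORDERCOLOR=#E4E4E4 CELLSPACING=0 CELLPADDING=4>
<TR><TD>1: Turn RIGHT</TD></TR>
</TABLE>
<P>Footer junk</P>
</BODY></HTML>'

# -n suppresses normal output; p prints only the lines from the first
# match of the start pattern through the next line matching </TABLE>.
extracted=$(printf '%s\n' "$sample" |
  sed -n '/BORDERCOLOR=#E4E4E4/,/<\/TABLE>/p')
printf '%s\n' "$extracted"
```

Only the three lines from the TABLE tag through the closing tag come out. Note that the range stops at the first closing tag it finds, so a table nested inside the directions table would cut the output short.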
Put them all together and save the output to a temp file. After that, the next challenge is to turn that HTML table into something you actually can read.
To do that, we're going to turn to a great open-source utility called Lynx. You might already have Lynx on your system, but if you don't, grab a copy of the Lynx text-based Web browser from lynx.isc.org. We'll use that to interpret and convert the HTML markup to raw text.
Fortunately, Lynx excels at this kind of challenge, as demonstrated by the working code:
curl --silent "http://www.expedia.com/pub/agent.dll?$start&$dest&qscr=mrdr&rtyp=0&unit=0" | \
  sed -n '/BORDERCOLOR=#E4E4E4/,/<\/TABLE>/p' | \
  lynx -dump -stdin
Yup, that's it. Specify a correct start and destination, make sure that the script knows where to find Lynx on your system, and the output will look like this:
Directions                                                      Distance  Time
Start: Depart Start on Local road(s) (East)                     0.1       < 1min
1: Turn RIGHT (South) onto E Executive Ave NW                   0.1       0:01
2: Turn LEFT (East) onto Alexander Hamilton Pl NW, then
   immediately turn RIGHT (South) onto 15th St NW               0.1       0:01
3: Turn LEFT (East) onto Pennsylvania Ave NW, then immediately
   turn RIGHT (South) onto 14th St NW                           0.3       0:02
4: Keep STRAIGHT onto US-1 [14th St NW]                         1.1       0:02
...
22: Take Ramp (LEFT) onto Western Way (Disney World)            1.9       0:02
23: Turn LEFT (North) onto Bear Island Rd                       2.1       0:03
24: Turn RIGHT (East) onto Floridian Way                        0.3       0:01
25: Keep STRAIGHT onto World Dr                                 0.4       0:01
End: Arrive End                                                 < 0.1     < 1min
Total Route                                                     881 mi    13 hrs 2 mins
I'll leave it as an exercise to you, dear reader, to create a wrapper that prompts people for starting and ending addresses and then uses the curl invocation to Expedia and subsequent invocation of Lynx to display turn-by-turn driving directions.
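If you want a head start on that exercise, here's one rough sketch, not the only way to do it. All four function names are hypothetical, the spaces-to-plus conversion assumes addresses have no other characters that need URL encoding, and the URL is the one used earlier in this column, which Expedia may have changed since:

```shell
#!/bin/sh
# Sketch of a wrapper. Function names are hypothetical; the Expedia URL
# and parameters come from the article and may no longer respond.

# Build one leg of the query string.
# $1 = leg number, $2 = street, $3 = city, $4 = state
leg() {
  printf 'strt%s=%s&city%s=%s&stnm%s=%s\n' \
    "$1" "$(printf '%s' "$2" | tr ' ' '+')" \
    "$1" "$(printf '%s' "$3" | tr ' ' '+')" \
    "$1" "$4"
}

# Assemble the full request URL. $1 = start leg, $2 = destination leg
directions_url() {
  printf 'http://www.expedia.com/pub/agent.dll?%s&%s&qscr=mrdr&rtyp=0&unit=0\n' \
    "$1" "$2"
}

# Prompt on stderr so the prompts stay out of the captured result.
# $1 = leg number, $2 = label shown in the prompts
ask_leg() {
  printf '%s street address: ' "$2" >&2; read -r street
  printf '%s city: ' "$2" >&2; read -r city
  printf '%s state (two letters): ' "$2" >&2; read -r state
  leg "$1" "$street" "$city" "$state"
}

# Prompt for both addresses, fetch the page and print the directions.
get_directions() {
  command -v lynx >/dev/null 2>&1 || {
    echo "lynx not found in PATH" >&2
    return 1
  }
  start=$(ask_leg 1 Starting)
  dest=$(ask_leg 2 Ending)
  curl --silent "$(directions_url "$start" "$dest")" |
    sed -n '/BORDERCOLOR=#E4E4E4/,/<\/TABLE>/p' |
    lynx -dump -stdin
}
```

Add a call to get_directions as the last line of the script to run it interactively. Keeping the URL building in its own functions means you can check the generated URL without touching the network.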
Dave Taylor has been hacking shell scripts for a really long time, 30 years. He's the author of the popular Wicked Cool Shell Scripts and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.
Comments
Expedia has changed to use POST method '-d' option with curl. You'd have to test if the same fields still work but I suspect they would.