Work the Shell - Mad Libs Generator, Part II
Last month, we dug into creating a Mad Libs generator: a program that takes a snippet of English prose, selects words at random and replaces them with their parts of speech, so friends or family can create their own amusing alternatives.
So, instead of “the quick brown fox jumps over the lazy dog”, it could be “the quick (( adjective )) fox jumps over the (( adjective )) dog”, for example.
The problem is that selecting random words from a sentence also can produce something far more boring, like “(( definite article )) quick brown fox jumps over (( definite article )) lazy dog”.
This month, I take that random word-selection tool and add some smarts so that it is biased toward longer words and words that are nouns or adjectives.
You'll recall that last month our script had a word-selection snippet that looked like this:
while read sentence ; do
  for word in $sentence ; do
    if [ $(( $RANDOM % $density )) -eq 1 ] ; then
      echo "(($word))"
    else
      echo $word
    fi
  done
done
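For readers who want to run that loop on its own, here is a minimal sketch that wraps it in a function. The name mark_words and the idea of passing the density as an argument are assumptions made for illustration, not part of last month's script. One thing the sketch makes visible: because the test compares the remainder to 1, a density of 1 never selects a word.

```bash
# mark_words is a hypothetical wrapper around the column's loop;
# the density is passed as the first argument and text is read from stdin.
mark_words() {
    local density=$1 sentence word
    while read -r sentence ; do
        for word in $sentence ; do
            # Same test as the column: it fires when the remainder equals 1,
            # so a density of 1 never selects anything.
            if [ $(( RANDOM % density )) -eq 1 ] ; then
                echo "(($word))"
            else
                echo "$word"
            fi
        done
    done
}

# Example run: roughly one word in three is wrapped, one word per output line.
echo "the quick brown fox jumps over the lazy dog" | mark_words 3
```

The output changes from run to run, since the selection depends on $RANDOM.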
Where we'll need to expand the code is within the conditional that currently just puts the word in parentheses. The first step is to analyze length: if the word is three or fewer letters long, we'll be much less likely to select it:
if [ $(( $RANDOM % $density )) -eq 1 ] ; then
  length=$(/bin/echo -n $word | wc -c | sed 's/ //g')
  if [ $length -lt 4 -a $(( $RANDOM % 2 )) -eq 1 ] ; then
    echo \{$word\} # too short
  else
    echo "(($word))"
  fi
else
This works pretty well. Every time a word is selected, its length is checked, and words shorter than four letters have a 50% chance of being ignored. With a simple input sample, here's what we get:
{the} ((quick)) brown fox jumped ((over)) the lazy black dog
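One possible refinement, sketched here under assumptions: bash can report a word's length directly with ${#word}, so the echo, wc and sed pipeline is not strictly needed. The helper name classify_by_length and its coin argument (standing in for $(( RANDOM % 2 ))) are hypothetical, added only so the result is repeatable. Note that ${#word} counts characters while wc -c counts bytes; for plain ASCII words the two agree.

```bash
# classify_by_length is a hypothetical helper, not part of the column's script.
# The second argument stands in for $(( RANDOM % 2 )) so outcomes can be reproduced.
classify_by_length() {
    local word=$1 coin=$2
    local length=${#word}          # character count, no wc or sed needed
    if [ "$length" -lt 4 ] && [ "$coin" -eq 1 ] ; then
        echo "{$word}"             # too short, skipped
    else
        echo "(($word))"           # kept as a substitution candidate
    fi
}

classify_by_length the 1      # prints {the}
classify_by_length the 0      # prints ((the))
classify_by_length quick 1    # prints ((quick))
```

In the real loop, the coin value would come from $(( RANDOM % 2 )) rather than being passed in.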
It's still not great, but at least it recognized that “the” wasn't interesting due to length. I'm still not entirely satisfied with which words it chooses to substitute, but let's move on to the second part of this project, testing part of speech, and come back to the selection criteria later.
The core code for this already was presented last month, utilizing Princeton's handy WordNet, so here it is:
pos="$(curl --silent "$dictionary$word" | grep '<h3>' | head -1 \
  | tr '[:upper:]' '[:lower:]' | sed 's/<h3>//;s/<\/h3>//')"
if [ ! -z "$(echo $pos | grep "not return any results")" ] ; then
  echo \[$word\] # failed to figure out part of speech
else
  echo "((${word}:$pos))"
fi
Notice that we have to worry about failed lookups. Some words just aren't found in the WordNet dictionary, and we need to be prepared. I'll tie these together, as written, and here's what we get as an output:
Note: {} = too short, [] = POS undefined
((I:noun)) {am} {by} ((birth:noun)) {a} Genovese, and
{my} family {is} one of the most ((distinguished:verb))
of that ((republic:noun))
As the header reminds us, at this point, we're denoting words selected but skipped because they're too short with {} and those that have an undefined part of speech with [].
I've also changed the word replacement density factor to have more words tested. As you can see, most of the words in our sample input are now evaluated one way or the other.
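To experiment with the part-of-speech parsing without hitting the network, the text-processing half of the lookup can be split out, roughly like this. The function name extract_pos and the sample markup are invented for illustration; the page actually returned by the dictionary URL may be structured differently.

```bash
# extract_pos is a hypothetical name. It applies the same grep, head, tr and sed
# steps as the lookup to whatever HTML arrives on stdin, instead of calling curl.
extract_pos() {
    grep '<h3>' | head -n 1 \
        | tr '[:upper:]' '[:lower:]' | sed 's/<h3>//;s/<\/h3>//'
}

# Made-up markup for illustration only; this is not real WordNet output.
sample='<html><body>
<h3>Noun</h3>
<ul><li>a political system</li></ul>
<h3>Verb</h3>
</body></html>'

printf '%s\n' "$sample" | extract_pos    # prints noun
```

Only the first heading is used, so a word with several parts of speech is reported under whichever one the page lists first.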
Now, let's add a test so that only nouns or adjectives are eligible for substitution too:
if [ ! -z "$(echo $pos | grep "not return")" ] ; then
  echo \[$word\] # failed to figure POS
else
  if [ -n "$(echo $pos | grep -E '(noun|adjective)')" ] ; then
    echo "((${word}:$pos))"
  else
    echo "<${word}:$pos>"
  fi
fi
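The same decision can also be expressed as a single case statement, which may be easier to extend with more parts of speech. This is only a sketch: tag_by_pos is a hypothetical helper, and treating an empty result as a failed lookup is an added assumption rather than something the script above does.

```bash
# tag_by_pos is a hypothetical helper that mirrors the three outcomes:
# [] for a failed lookup, (( )) for an eligible word, <> for an uninteresting POS.
tag_by_pos() {
    local word=$1 pos=$2
    case "$pos" in
        ""|*"not return"*)  echo "[$word]" ;;           # lookup failed (empty is an assumption)
        *noun*|*adjective*) echo "((${word}:$pos))" ;;  # eligible for substitution
        *)                  echo "<${word}:$pos>" ;;    # uninteresting part of speech
    esac
}

tag_by_pos republic noun    # prints ((republic:noun))
tag_by_pos by adverb        # prints <by:adverb>
```

Adding another eligible part of speech is then a matter of extending the second pattern.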
I'll give it that same first sentence from Mary Shelley's Frankenstein, and let's see what transpires:
Note: {} = too short, [] = POS undefined, <> = uninteresting POS
I {am} <by:adverb> birth {a} Genovese, [and] my
family ((is:noun)) {one} {of} {the} ((most:adjective))
<distinguished:verb> {of} [that] ((republic:noun))
We're definitely getting there, but I think we still need to add something to the selection criteria—something that will help us produce more interesting Mad Libs.
But, let's leave that for next month as we've already dug through a lot of code in this column.
Dave Taylor has been hacking shell scripts for a really long time, 30 years. He's the author of the popular Wicked Cool Shell Scripts and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.