Still Searching
Water, water, every where, Nor any drop to drink. --Samuel Taylor Coleridge

Most of the information I crave is specific and textual; and since most specific text information involves more than one word in a row, I'm usually looking for alphanumeric strings. Sometimes, but not always, those strings are words.
In other words, I want a search engine to grep the world for me.
Since I'm sure this is no-hope-in-hell territory, I just went looking for a quote to express my frustration. I found it through Google. It's from the prolific Dmitry Kirsanov, writing in the “Advanced” chapters of HTML Unleashed, Professional Reference Edition, the full text of which is exposed online at http://www.webreference.com/dlab/books/html-pre/. He writes, “Those accustomed to grep-style regular expressions can't even dream of using something similar with search engines.”
We won't go into why. But we do have to ask if it's necessary for so many engines to be as bad as they are about this. It seems like every time I find a search engine that does The Job, somebody buys it and it goes to hell. Or it goes to hell and then somebody buys it. Whatever order, the bad news is always around the corner. And, without fail, it comes in the form of (here comes that dreaded word)...marketing.
Marketing can't seem to help trading utility for “reach”, ''exposure'', “targeting” or some other ''strategic'' abstraction intended to influence the largest possible population—ignoring the fact that today's common denominators aren't as low and wide as they used to be. Especially on the web.
The uncommon denominators (that's us again) are rather abundant, too. And it's not like marketing lives to ignore the connoisseur. Witness wars among automobile makers over handling characteristics only mechanics and race car drivers can fully understand. (How many of us careen our Chevy Tahoes down black diamond slopes or floor our Acuras to tire-melting speeds on 2-lane Nevada backroads?)
But dot-com marketing is a wacky breed that often lusts after consumers to a degree not seen outside Procter & Gamble, circa 1958. It was consumer-hungry marketing that killed Lycos (the original one that came from Carnegie Melon), then Infoseek, then Hotbot (as Inktomi created it), then Altavista (as DEC created it). Each of those was born to serve intellectual curiosity. Others—Looksmart, AskJeeves, Go, Yahoo! and DirectHit, to name a few—were born good for little other than flat-ass-dumb “consumer” searches for “favorites”, “portals” and whatnot.
Okay, I'll make an exception for Yahoo!, which has always used human beings to catalog the Web. And now it hired Google to do the heavy lifting, which is a good thing. We'll say more about Google after the Altavista autopsy.
I knew Altavista was terminal last fall, when its “advanced” search page was suddenly replaced by bragging about “improvements”. “Tips” were gone. So was a nice and easy way to search for inbound links to a given URL. If the function persisted, no clues to procedures remained (at least that I could find) amidst the fresh marketing poop.
Of course, there was a survey. So I filled it out. This came back by e-mail:
Thank you very much for recently filling out the survey on the AltaVista Advanced Search page. Your suggestions and comments will help us continue to make AltaVista Advanced Search the best way to location [sic] information on the Internet.
We've found that once users try Advanced Search, they realize the power of the tools that AltaVista has to make your searching more precise. So, we're always trying to come up with new ways to encourage regular searchers to try out Advanced Search. And, that's where we'd like your opinions.
As you'll recall, along with your survey responses you also submitted this email address. So if we had any follow-up questions we could contact you again. Well now we'd like your opinions about some promotions we'd like to use to convince users to give Advanced Search a try.
The six-question survey should take only a couple of minutes. Simply click on the hyperlink below. If you are using AOL or another email service that does not support hyperlinks, please copy the hyperlink and paste it into your browser window in the Address box.
The URL delivered me to a silly world whose gods believe that prizes might do what features won't. I wrote this in the Input box:
I don't want a prize. If your search is so damned “advanced” (and how can it be, now that you fail to even mention the wonderful “link:www.mysite.com -url:mysite.com” feature that only works in BASIC, fercrissake!), I'd be glad to help out for FREE. I'll give you my time if you'll give me your attention. I want truly advanced search functions. THAT's what tempts me. Not prizes.
At the bottom of page two, the system broke down and wouldn't advance me to page three. I gave up and didn't go back except to see how it compared with competition.
But hey, maybe they listened. The “cheat sheet” at doc.altavista.com/adv_search/syntax.shtml restores a lot of the good advice lost from the original Advanced page. But the sad fact is that Altavista isn't as good as it used to be. FAST, Google and even MSN Search yield better results. If you're looking for strings. (For more, see the feature in UpFront.)
I know because I test search engines pretty often. I go deep into a document somewhere in my own domain, http://www.searls.com/, and grab a string of text that appears both in my own site and in a number of others, such as a quote from literature. Then, I run a bunch of engines through the mill. That's how I knew when Altavista started to beat Infoseek, when Hotbot began to beat Altavista and when FAST began to beat Hotbot.
The last time I saved results was December 3, 1999. At that time, FAST, http://www.alltheweb.com/, won. They still win (see UpFront) on some tests but lose on others—usually to Google.
There's an awful lot to like about Google. First, they run their engine on Linux (in case you're asking, Google is a huge Red Hat customer). Second, their user interface is blessedly simple and devoid of hype for anything other than itself, and there's precious little of that. Design-wise, they make great use of white space. Third, they have nicely incorporated the DMOZ Open Directory project's catalog, which is essentially The People's Yahoo!. But fourth (and most importantly), they have done more than any other search company to allow trusting searches of both strings and collections of words at the same time.
Where FAST and Hotbot give you a pop-out menu choice of “the phrase”, “all the words” and “any of the words” (or the equivalents), Google does all of those at once, defaulting first to phrase mode. You can use quotes to narrow the results, but the difference is usually small. That means Google has done a good job of providing the largest possible narrow search. In fact, the one site users want comes out on top so often that Google confidently provides an “I'm Feeling Lucky” button that yields just one result.
Google does have special search functions, all well-explained. I wish they had more, but I can live without them. Google is, for me, by far the most useful and reliable search engine—at least for those hard-to-find word strings.
What's not to like about Google? Well, there's the Patent Issue. They're going after patents on their search methods (and who knows what else), which costs them good will in the Linux/Open Source community. At an event early this year, I talked briefly about patents with Google's co-founder, Larry Page. It was clear that Larry isn't crazy about them. Immediately afterwards, I talked with John Doerr. It was equally clear that John is crazy about patents—to the degree that he believes that patents are one of the things that “make America great.” John, of course, is a VC with Kleiner Perkins, which conspicuously funds Google.
There's also the risk that Google will pursue an advertising-driven business strategy. In current searches, ads show up as annotated, text-only links, posted above search results. These are certainly far less onerous than banners (also harder to block). But they quietly crept in a few months ago. What's the next step?
I don't know. In fact, I just tried to force Google's engine to give me an ad, and it wouldn't do it—not once in ten tries. So I suspect that the company is being cautious. As it should. Far as I know, Google is the only search engine created by and for people who live to search and search to live.
We need more of those. And we need the ones we've got to remember why we like them so much.

Doc Searls is Senior Editor of Linux Journal
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Sponsored by AMD
Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6
Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.
Learn more about catching the bad guy in this free white paper.
Sponsored by DLT Solutions
| Designing Electronics with Linux | May 22, 2013 |
| Dynamic DNS—an Object Lesson in Problem Solving | May 21, 2013 |
| Using Salt Stack and Vagrant for Drupal Development | May 20, 2013 |
| Making Linux and Android Get Along (It's Not as Hard as It Sounds) | May 16, 2013 |
| Drupal Is a Framework: Why Everyone Needs to Understand This | May 15, 2013 |
| Home, My Backup Data Center | May 13, 2013 |
- Linux Systems Administrator
- Senior Perl Developer
- Technical Support Rep
- New Products
- UX Designer
- Web & UI Developer (JavaScript & j Query)
- Designing Electronics with Linux
- Dynamic DNS—an Object Lesson in Problem Solving
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi

It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Featured Jobs
| Linux Systems Administrator | Houston and Austin, Texas | Host Gator |
| Senior Perl Developer | Austin, Texas | Host Gator |
| Technical Support Rep | Houston and Austin, Texas | Host Gator |
| UX Designer | Austin, Texas | Host Gator |
| Web & UI Developer (JavaScript & j Query) | Austin, Texas | Host Gator |
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?




4 hours 6 min ago
4 hours 7 min ago
6 hours 7 min ago
14 hours 52 min ago
15 hours 26 min ago
16 hours 25 min ago
17 hours 15 min ago
21 hours 17 min ago
1 day 1 hour ago
1 day 1 hour ago