Still Searching
Water, water, every where, Nor any drop to drink. --Samuel Taylor Coleridge

Most of the information I crave is specific and textual; and since most specific text information involves more than one word in a row, I'm usually looking for alphanumeric strings. Sometimes, but not always, those strings are words.
In other words, I want a search engine to grep the world for me.
Since I'm sure this is no-hope-in-hell territory, I just went looking for a quote to express my frustration. I found it through Google. It's from the prolific Dmitry Kirsanov, writing in the “Advanced” chapters of HTML Unleashed, Professional Reference Edition, the full text of which is exposed online at http://www.webreference.com/dlab/books/html-pre/. He writes, “Those accustomed to grep-style regular expressions can't even dream of using something similar with search engines.”
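To make "grep the world" concrete, here is a minimal sketch of what grep-style searching looks like when the "world" is small enough to hold in memory. Everything in it is an assumption for illustration: the `PAGES` dictionary, its example.com URLs and the `grep_pages` function are hypothetical, and no real search engine works by scanning every page at query time.

```python
import re

# Hypothetical in-memory "web": URL -> page text.
# A real engine would consult an index instead of scanning raw text.
PAGES = {
    "http://example.com/a": "Water, water, every where,\nNor any drop to drink.",
    "http://example.com/b": "Watering the garden every day.",
}

def grep_pages(pattern, pages, flags=re.IGNORECASE):
    """Return (url, line_number, line) for every line matching the regex."""
    rx = re.compile(pattern, flags)
    hits = []
    for url, text in pages.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if rx.search(line):
                hits.append((url, number, line))
    return hits

# Usage: find the exact string "water, water", allowing any whitespace.
for url, number, line in grep_pages(r"water,\s+water", PAGES):
    print(f"{url}:{number}: {line}")
```

The pattern is matched against the literal character sequence, punctuation and all, which is the part that word-oriented engines tend not to offer.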
We won't go into why. But we do have to ask if it's necessary for so many engines to be as bad as they are about this. It seems like every time I find a search engine that does The Job, somebody buys it and it goes to hell. Or it goes to hell and then somebody buys it. Whatever order, the bad news is always around the corner. And, without fail, it comes in the form of (here comes that dreaded word)...marketing.
Marketing can't seem to help trading utility for “reach”, “exposure”, “targeting” or some other “strategic” abstraction intended to influence the largest possible population, ignoring the fact that today's common denominators aren't as low and wide as they used to be. Especially on the web.
The uncommon denominators (that's us again) are rather abundant, too. And it's not like marketing lives to ignore the connoisseur. Witness wars among automobile makers over handling characteristics only mechanics and race car drivers can fully understand. (How many of us careen our Chevy Tahoes down black diamond slopes or floor our Acuras to tire-melting speeds on 2-lane Nevada backroads?)
But dot-com marketing is a wacky breed that often lusts after consumers to a degree not seen outside Procter & Gamble, circa 1958. It was consumer-hungry marketing that killed Lycos (the original one that came from Carnegie Mellon), then Infoseek, then Hotbot (as Inktomi created it), then Altavista (as DEC created it). Each of those was born to serve intellectual curiosity. Others (Looksmart, AskJeeves, Go, Yahoo! and DirectHit, to name a few) were born good for little other than flat-ass-dumb “consumer” searches for “favorites”, “portals” and whatnot.
Okay, I'll make an exception for Yahoo!, which has always used human beings to catalog the Web. And now it has hired Google to do the heavy lifting, which is a good thing. We'll say more about Google after the Altavista autopsy.
I knew Altavista was terminal last fall, when its “advanced” search page was suddenly replaced by bragging about “improvements”. “Tips” were gone. So was a nice and easy way to search for inbound links to a given URL. If the function persisted, no clues to procedures remained (at least that I could find) amidst the fresh marketing poop.
Of course, there was a survey. So I filled it out. This came back by e-mail:
Thank you very much for recently filling out the survey on the AltaVista Advanced Search page. Your suggestions and comments will help us continue to make AltaVista Advanced Search the best way to location [sic] information on the Internet.
We've found that once users try Advanced Search, they realize the power of the tools that AltaVista has to make your searching more precise. So, we're always trying to come up with new ways to encourage regular searchers to try out Advanced Search. And, that's where we'd like your opinions.
As you'll recall, along with your survey responses you also submitted this email address. So if we had any follow-up questions we could contact you again. Well now we'd like your opinions about some promotions we'd like to use to convince users to give Advanced Search a try.
The six-question survey should take only a couple of minutes. Simply click on the hyperlink below. If you are using AOL or another email service that does not support hyperlinks, please copy the hyperlink and paste it into your browser window in the Address box.
The URL delivered me to a silly world whose gods believe that prizes might do what features won't. I wrote this in the Input box:
I don't want a prize. If your search is so damned “advanced” (and how can it be, now that you fail to even mention the wonderful “link:www.mysite.com -url:mysite.com” feature that only works in BASIC, fercrissake!), I'd be glad to help out for FREE. I'll give you my time if you'll give me your attention. I want truly advanced search functions. THAT's what tempts me. Not prizes.
At the bottom of page two, the system broke down and wouldn't advance me to page three. I gave up and didn't go back, except to see how it compared with the competition.
But hey, maybe they listened. The “cheat sheet” at doc.altavista.com/adv_search/syntax.shtml restores a lot of the good advice lost from the original Advanced page. But the sad fact is that Altavista isn't as good as it used to be. FAST, Google and even MSN Search yield better results, at least if you're looking for strings. (For more, see the feature in UpFront.)
I know because I test search engines pretty often. I go deep into a document somewhere in my own domain, http://www.searls.com/, and grab a string of text that appears both in my own site and in a number of others, such as a quote from literature. Then, I run a bunch of engines through the mill. That's how I knew when Altavista started to beat Infoseek, when Hotbot began to beat Altavista and when FAST began to beat Hotbot.
The last time I saved results was December 3, 1999. At that time, FAST, http://www.alltheweb.com/, won. They still win (see UpFront) on some tests but lose on others—usually to Google.
There's an awful lot to like about Google. First, they run their engine on Linux (in case you're asking, Google is a huge Red Hat customer). Second, their user interface is blessedly simple and devoid of hype for anything other than itself, and there's precious little of that. Design-wise, they make great use of white space. Third, they have nicely incorporated the DMOZ Open Directory project's catalog, which is essentially The People's Yahoo!. But fourth (and most importantly), they have done more than any other search company to let you trust a single search to cover both strings and collections of words at the same time.
Where FAST and Hotbot give you a pop-out menu choice of “the phrase”, “all the words” and “any of the words” (or the equivalents), Google does all of those at once, defaulting first to phrase mode. You can use quotes to narrow the results, but the difference is usually small. That means Google has done a good job of providing the largest possible narrow search. In fact, the one site users want comes out on top so often that Google confidently provides an “I'm Feeling Lucky” button that yields just one result.
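The difference between the three menu choices, and the idea of running all of them at once with phrase matches first, can be sketched in a few lines. This is only an illustration of the behavior described above, not Google's actual ranking method; the function names `tokens`, `match_tier` and `search`, and the tiny document set in the usage example, are all hypothetical.

```python
import re

def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def match_tier(query, text):
    """Return 0 for a phrase match, 1 for all the words, 2 for any word, else None."""
    q, d = tokens(query), tokens(text)
    if not q:
        return None
    n = len(q)
    # Tier 0: the query words appear consecutively, in order ("the phrase").
    if any(d[i:i + n] == q for i in range(len(d) - n + 1)):
        return 0
    words = set(d)
    # Tier 1: every query word appears somewhere ("all the words").
    if all(w in words for w in q):
        return 1
    # Tier 2: at least one query word appears ("any of the words").
    if any(w in words for w in q):
        return 2
    return None

def search(query, docs):
    """Run all three modes at once and list phrase matches first."""
    ranked = []
    for name, text in docs.items():
        tier = match_tier(query, text)
        if tier is not None:
            ranked.append((tier, name))
    return [name for tier, name in sorted(ranked)]

# Usage with a hypothetical four-document collection.
docs = {
    "a": "Nor any drop to drink",
    "b": "to drink every drop",
    "c": "drop it",
    "d": "nothing relevant",
}
print(search("drop to drink", docs))
```

In this toy version, quoting the query would amount to keeping only tier 0, which is why quotes change little when the phrase matches already sit at the top.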
Google does have special search functions, all well-explained. I wish they had more, but I can live without them. Google is, for me, by far the most useful and reliable search engine—at least for those hard-to-find word strings.
What's not to like about Google? Well, there's the Patent Issue. They're going after patents on their search methods (and who knows what else), which costs them good will in the Linux/Open Source community. At an event early this year, I talked briefly about patents with Google's co-founder, Larry Page. It was clear that Larry isn't crazy about them. Immediately afterwards, I talked with John Doerr. It was equally clear that John is crazy about patents—to the degree that he believes that patents are one of the things that “make America great.” John, of course, is a VC with Kleiner Perkins, which conspicuously funds Google.
There's also the risk that Google will pursue an advertising-driven business strategy. In current searches, ads show up as annotated, text-only links, posted above search results. These are certainly far less onerous than banners (also harder to block). But they quietly crept in a few months ago. What's the next step?
I don't know. In fact, I just tried to force Google's engine to give me an ad, and it wouldn't do it—not once in ten tries. So I suspect that the company is being cautious. As it should. Far as I know, Google is the only search engine created by and for people who live to search and search to live.
We need more of those. And we need the ones we've got to remember why we like them so much.

Doc Searls is Senior Editor of Linux Journal.