upFRONT

They Said It and more.
THEY SAID IT

“There is none. Get over it.” —Scott McNealy on privacy

“At this stage in my life, the thing that really turns me on is competence.”

—Billy Joel

“Science would be superfluous if the outward appearance and the essence of things directly coincided.”

—Karl Marx

“Certum est, quia impossibile. (It is certain, because it is impossible.)”

—Tertullianus

“What the hell is content? Nobody buys content. Real people pay money for music because it means something to them. Being a ‘content provider’ is prostitution work that devalues our art and doesn't satisfy our spirits. Artistic expression has to be provocative. The problem with artists and the Internet: Once their art is reduced to content, they may never have the opportunity to retrieve their souls.”

—Courtney Love

“Today I want to talk about piracy and music. What is piracy? Piracy is the act of stealing an artist's work without any intention of paying for it. I'm not talking about Napster-type software. I'm talking about major label recording contracts.”

—Courtney Love

“Linux means never having to delete your love mail.”

—Don Marti

“Maturity is when you quit blaming other people for your problems.”

—Craig Burton

“Who is General Failure, and why is he reading my hard drive?”

—Bob (on Slashdot)

“I'm still freaked out by all this. I just wrote a (bleeping) anthropology paper.”

—Eric Raymond

“My hovercraft is full of eels.”

—Monty Python

Google Gains While FAST Keeps Pace

These days, search engines are praised for their educated guesswork. Each pile of results is presented with an implication: “Here's what we think you want.”

But most serious researchers (i.e., Yours Truly and all Linux Journal readers) often don't want an engine to guess. These users want to search for specific strings—or, as some search engines put it, phrases. These include names, text passages, lines of code, diseases and all other series of words.

On December 3, 1999 and June 27, 2000, I tested fifteen of the most familiar search engines by searching for a distinctive phrase: “He that by me spreads a wider breast than my own”, which is part of a familiar line from Walt Whitman's Song of Myself. The phrase occurs far from the beginning of the work and can be found in many documents on the Web, including one on my own site, http://www.searls.com/.

It's a tough test. Only those engines that deeply search an enormous range of sites will yield results. “Breast” is also a common and controversial word, which might invite spurious results.

Testing for strings also isn't easy on the tester, since search engines have different ways of recognizing phrases. Most require quotes. Others (Hotbot, FAST) have pop-out menu commands. One (Yahoo!) requires clicking a radio button in “advanced” search mode. Others (mostly at the bottom in the surveys) don't search phrases at all.
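To make the distinction concrete, here is a minimal Python sketch of the difference between a loose any-word match and a true phrase match. It is not how any of the engines tested actually work; the function names and the two sample documents are made up for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def matches_any_word(query, document):
    """Loose match: true if the document shares at least one word with the query."""
    return bool(set(tokenize(query)) & set(tokenize(document)))

def matches_phrase(query, document):
    """Phrase match: true only if the query words appear consecutively, in order."""
    q = tokenize(query)
    d = tokenize(document)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

PHRASE = "He that by me spreads a wider breast than my own"

# Hypothetical documents, invented for this example.
DOCS = {
    "whitman": "A page quoting Whitman: he that by me spreads a wider breast than my own.",
    "health": "New guidance on breast cancer screening was published this week.",
}

for name, text in DOCS.items():
    print(name, matches_any_word(PHRASE, text), matches_phrase(PHRASE, text))
```

The loose match accepts both documents, because the health page shares the word “breast” with the query. The phrase match accepts only the page that contains the words in sequence.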

I run this test quite often for my own purposes and rarely record the results. But last December I did record them, and I repeated the test on June 27, nearly seven months later. This time I added two more tests: one for an obscure blood disorder and the other for a Linux system call. Here are the results:

Figure 1. Test Phrase I

Figure 2. Test Phrase II

Figure 3. Test Phrase III

As you see, FAST won the first search, by a wide margin, just as it did in December. But Google (see Jason Schumaker's “Interview with Sergey Brin”, page ??) is the clear winner of the second and third searches, and didn't do too badly on the first, either.

Near as I could tell, none of the engines fell for the “breast” bait—at least not in phrase search mode. On that one, they all get a passing grade.

Some of the results, however, were amazing. Go.com found “29,024,074 matches” in the first search, but nothing I wanted. The first ten results all related to breastfeeding or breast cancer. No porn, of course. But I did get a banner ad featuring a happy-looking woman in a cleavage bra. “BREAST AUGMENTATION?” it asked. “Looking for breast augmentation from a doctor in your area? CLICK HERE.” Nice guess, guys.

As we can see, search is consolidating as a business category. By the time you read this, Yahoo! will be using Google (in a deal struck one day before this survey). Other partnerships and cross-investments are sure to follow (Lycos recently bought a 15% stake in FAST, for example). My guess is that there will be fewer search engines by the time you read this.

I just hope there aren't fewer good ones.

Doc Searls (doc@ssc.com) is Senior Editor of Linux Journal and co-author of The Cluetrain Manifesto.
