Google vs. AllTheWeb

Taking a look behind the recent hype over AllTheWeb.
______________________

Doc Searls is Senior Editor of Linux Journal

Comments


free seo analysis tool

instinctis's picture

Good article you've posted here, thanks. Just a quick note to anyone interested in doing their own SEO: once you're done submitting your website to the important search engines, you can do a quick check for your website at http://ministatus.com and see the exact number of indexed pages or the number of backlinks (according to the most important engines). You can also do a daily check to see the changes, or just download your SEO score as a PDF file and hand it to someone who knows what to make of it ;-)

Re: Google vs. AllTheWeb

Anonymous's picture

One thing to be said in favor of AllTheWeb: it is much better for specific-phrase searches that include common words such as "the" or "and". With Google, such words are not indexed, so you can't really use them to limit your search results.

Here's a good test. Suppose you're a fan of the band "The The". Enter the band name in Google, and you will get no results. Try it on AllTheWeb, and you'll get lots of results.

Re: Google vs. AllTheWeb

Anonymous's picture

f u

Re: Google vs. AllTheWeb

Anonymous's picture

If you want to search "The The" on Google you will have to do it this way "+The +The" to include the "stop" words.

(quotes used to emphasize query string only)
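To make the "+" trick above concrete, here is a minimal sketch of how a search front end might drop stop words unless they are prefixed with "+". This is a toy model only: the stop-word list and the function name `effective_terms` are assumptions made for illustration, and it does not claim to reproduce how Google or AllTheWeb actually parse queries.

```python
from typing import List

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def effective_terms(query: str) -> List[str]:
    """Return the terms a toy engine would actually search for.

    Plain stop words are dropped; a leading "+" forces a word to be kept.
    """
    terms = []
    for token in query.split():
        if token.startswith("+") and len(token) > 1:
            # "+word" means: keep this word even if it is a stop word.
            terms.append(token[1:].lower())
        elif token.lower() not in STOP_WORDS:
            terms.append(token.lower())
    return terms


print(effective_terms("The The"))      # every word is a stop word
print(effective_terms("+The +The"))    # both words are forced in
```

In this toy model the plain query is left with nothing to search for, while the "+" form keeps both words, which is the behaviour the comment above describes.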

Re: Google vs AlltheWeb

Anonymous's picture

Bear in mind that Google has just (last two days or so) started recognising "stop words" in phrases. So an a to z of computers will probably only recognise the word COMPUTERS, but "an a to z of computers" (note the quotation marks) should recognise the whole lot. You could get the first version to work by entering +an +a +to +z +of computers, but even sticking in the + sign doesn't work on THE, although Google did announce that they may even include that word in the future.

Google has also announced that their index should be reindexed more frequently: perhaps not as often as the 9-12 days claimed by some engines, including (I believe) AllTheWeb, but not to be sniffed at. And of course, there is Google's Image search and non-HTML file coverage, both of which put everyone else in the shade.

All of which makes me wonder: if Google is so good, how come I make extensive use of AllTheWeb? I love Google, but I still find AllTheWeb outperforms Google 35-40% of the time. It's not down to AllTheWeb's new query rewriting. I use that very sparingly since, as often as not, it completely wrecks the query I'm trying to post. Despite the enhanced News coverage at AllTheWeb, Moreover outperforms both of them for currency, and news.altavista.com offers by far the best archival news search. (Can't imagine that I would use AltaVista much for anything else, though.)

But when, oh when, will one of these great engines come up with the kind of flexibility that Northern Light has been offering for years? Full Boolean, end-truncation, internal single- and multi-character wildcards, nested parentheses, automatically re-running your search as an alert... fantastic! If Google or AllTheWeb starts offering that kind of functionality, that really will be the killer engine!

Re: Google vs. AllTheWeb

Anonymous's picture

"Geeks on the Half Shell":

Google: 1

AllTheWeb: 0

Moreover: 1 and it still crawls faster

http://www.moreover.com/cgi-local/page?o=portal&h=Search+results+for...+...

I took a look at a sample of FAST's stories (those returned by searching for 'afghanistan' in English at c. 3.30 pm GMT, 21 Nov 2001) and compared their pick-up times and relevance with Moreover's profession. Of the top 10 results, one was a duplicate, two were links to pages of links (not articles) on minor local US papers, and one was an interactive guide to daisy cutter bombs: not irrelevant, but also not a top-ten Afghan news story. Of the remaining six stories, Moreover picked up three of them, 15, 3 and 13 hours earlier than FAST, which also gave the BBC source name on one of these in Russian, not English. Of the three that Moreover did not pick up, FAST picked two of them up 5 and 23 hours after the site claimed that they had been posted (the third is not time-stamped on the site).

The top ten stories returned by a search for 'afghanistan' on Moreover were all news stories, all links went directly to the story, and the biggest gap between the site's claimed posting time and Moreover's pick-up time was 1 hr 55 min.

There was only one story that appeared in both. Like FAST, Moreover returned stories from 7 different sources (counting sections of CNN as one), but whereas 6 of Moreover's were original publications, only four of AllTheWeb's were.

So: AllTheWeb 6/10 v Moreover 10/10 for relevance to 'top ten Afghan stories'. And that's not factoring in the quality of sources.

Re: Google vs. AllTheWeb

Glennf's picture

I had the opportunity to write about both Google and Fast/Alltheweb.com for the New York Times in the last few weeks. Google's new document type indexing and HTML conversion of business docs (Word, PowerPoint, Excel, Lotus, etc., etc.) vastly expands the potential for a search engine to peer in the corners of the Web. Their count of 1.6 billion pages is probably too high, though: their duplicate removal isn't as aggressive as Fast's, and they count pages that they have just link text for: pages that they know exist only because of links on other sites.

Fast, on the other hand, turned its attention to beginners and news in the latest update a couple weeks ago. Their news engine is now superior to any other that I've found on the Web. They are spidering 3,000 sources several times an hour. They said that freshness was the focus of this latest update, but they hope to expand out to document types, too, and there's no technical reason that they can't.

Google started getting fresh this summer: try popular blog pages and see how recent the home page index is. Impressive, too.

Re: Google vs. AllTheWeb

Anonymous's picture

I've been using alltheweb for years, but I'm the only person I know who does so. Sometimes I find their results "better" than ones from google, at other times "worse".

One thing is sure, however: when it comes to "bringing people to my site", alltheweb is of absolutely no importance, while google always ranks among the "top referrers".

Deno (from mandrakeforum)

Re: Google vs. AllTheWeb

Anonymous's picture

I feel Google is quite a bit faster than AllTheWeb at searching. At the moment, Google seems to be a better and faster search engine than the others.

Filesearch

Anonymous's picture

When I started searching the net, there was only one service I was interested in: Archie via telnet. What else than an FTP-able file should I have looked for? Google is really lacking a 5th tab... Maybe in black? When the WWW thing started I switched to a web-based service provided by the University of Trondheim...
