
Making More Sense of the Web

What effect will blogs, syndication and their chronological approach to ordering have on the rest of the Web?



Editors' Note: The following is the text of the June 9 and June 23
editions of Doc Searls' SuitWatch newsletter. Sign up to be a subscriber
of this bi-weekly newsletter.

I think one reason the Net and the Web are so
successful is they came without much of an
organizing framework. The only directory was the
domain name system (DNS), and it specified
nothing that comes after the first single /. The
rest was free to become as big a haystack as
anybody wanted to make it.

Which we did, with the help of search engines.
The haystack nature of the Web required search
engines. In effect, Google and Yahoo say, "Right,
the Web is a haystack, and we can help you find a
needle in there." Now, using search engines is so
much a part of life on the Web that most of us
don't lament the lack of a directory structure to
the place. Or the relative failures of Yahoo and
DMOZ to create library-like directories of the
Web's contents.

But what if some of the Web actually gets organized? What then?

There aren't but a few ways to organize things:
categorically, alphabetically, numerically,
chronologically, spatially, geographically...

We do have a crude sort of geographical
organization with country codes in DNS: .uk, .cz,
.jp and so on. Except in the US, where almost
nobody other than del.icio.us bothers to use the
.us code. But still, there hasn't been much to
compromise the haystack nature of the Web.

Until two phenomena came together: blogs and
syndication. Together they're creating a corner
of the Web--call it the syndisphere--that is
organized chronologically.

Blogs have a virtual directory path--http://[blogname]/year/month/day/date/post--where
the last item has its own permalink. This
is the directory nature if not structure
comprehended by the new search engines and
related services--Bloglines, Blogpulse,
Feedster, IceRocket, Pubsub and Technorati--that
look only at the part of the Web that's
syndicated through RSS feeds. Services such as
Technorati archive every post from every blog
with an RSS feed. To them every permalink is
actually permanent. (Disclosure: Technorati was
born while its founder, David Sifry, and I worked
together on a story about blogging for Linux
Journal, and I'm on the company's advisory board.)
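To make the chronological path idea concrete, here is a minimal sketch that reads a date out of a year/month/day style permalink and sorts posts by it. The URLs and the function name permalink_date are invented for illustration, and the sketch assumes numeric path segments; real blogs vary, and this is not how any particular service does it.

```python
from datetime import date
from urllib.parse import urlparse


def permalink_date(url):
    """Return the date encoded in a /year/month/day/ style path, or None."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Look for three consecutive numeric segments that form a valid date.
    for i in range(len(parts) - 2):
        y, m, d = parts[i:i + 3]
        if y.isdigit() and m.isdigit() and d.isdigit():
            try:
                return date(int(y), int(m), int(d))
            except ValueError:
                continue  # numeric, but not a real calendar date
    return None


# Hypothetical permalinks, made up for this example.
posts = [
    "http://blog.example.com/2005/06/23/tags-and-trees",
    "http://blog.example.com/2005/06/09/haystack",
    "http://other.example.org/archive/2004/12/31/year-end",
]

# Order the posts oldest first, using only what the URL itself says.
chronological = sorted(posts, key=permalink_date)
for url in chronological:
    print(permalink_date(url), url)
```

The point of the sketch is only that the path carries enough structure to order posts in time without looking at the page itself.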

The same isn't true for Google or Yahoo. The
indexes of those search engines are inventories
of what's on the Web right now. The perspective
isn't chronological or any similarly structured
context. It's all haystack. Which is fine. They
do a miraculous job. And on open-source
infrastructure, no less.

But the emergence of a chronological corner of
the Web is an interesting phenomenon, one that
grew naturally, from the bottom up. No big
company said This Will Be So. Instead, this new
subsphere emerged on its own, naturally.

Now, another interesting natural phenomenon is showing up: tagging.

As with RSS, tagging brings a new organizing
principle to bear, at least on its own corner of
the Web. Tags are labels individuals can apply to
anything, through HTML. The first places tags
appeared were del.icio.us and Flickr. The former
is a social bookmarks manager, and the latter is
a photo archiving service, though neither label
does either service justice. What matters about
both is their value comes primarily from
user contributions. Just about everything you see
on both services is what users put there.

And tags are a big part of it. At Flickr, you
are asked to tag, essentially to label with
membership in a user-defined category, every photo
you put up there.

The practice has spread to blogging. Many
blogs now add the rel="tag" attribute to their
links or append "tags" or "Technorati tags" to
their posts.

The rel="tag" spec is described at the Microformats wiki.
The editor and author of the wiki is Tantek Çelik. Derek
Powazek and Kevin Marks are credited under the
Concept heading. All three work at Technorati.
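As a rough illustration of what a rel="tag" link looks like to a program reading a blog post, the sketch below pulls tag names out of a snippet of HTML using only Python's standard library. It takes the last path segment of each tagged link's href as the tag name, which is my reading of the spec; ignoring a trailing slash is an assumption of this sketch. The sample markup and the names RelTagParser and extract_rel_tags are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class RelTagParser(HTMLParser):
    """Collect tag names from <a rel="tag" href="..."> links."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        # rel holds a space-separated list of values.
        rel_values = (attrs.get("rel") or "").lower().split()
        href = attrs.get("href")
        if "tag" in rel_values and href:
            # Assumption: the tag name is the last path segment of the href.
            path = urlparse(href).path.rstrip("/")
            name = unquote(path.rsplit("/", 1)[-1])
            if name:
                self.tags.append(name)


def extract_rel_tags(html):
    parser = RelTagParser()
    parser.feed(html)
    return parser.tags


# Hypothetical post footer, invented for illustration.
sample = """
<p>Posted in <a href="http://technorati.com/tag/syndication" rel="tag">syndication</a>
and <a href="http://example.com/tags/linux/" rel="tag nofollow">Linux</a>.
See also <a href="http://example.com/about">about</a>.</p>
"""

print(extract_rel_tags(sample))
```

Note that the tag comes from the link's destination, not from the visible link text, so the second link yields "linux" even though the reader sees "Linux".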

I recently did an IM interview with Tantek, to
deepen my own understanding of what tags are
about:

Doc Searls: Technical question: Who started the whole
tagging thing. Delicious? Flickr? both? I know
T'rati was the first to search it. Then when/how
did rel="tag" come along?

Tantek Çelik: Technorati invented rel="tag" and distributed/decentralized tagging.

DS: So tagging was internal to the Delicious and Flickr silos before that?

TÇ: Yes.

DS: What's the difference between a tag and a
Technorati tag? The latter isn't proprietary
except...that's how it sounds.

TÇ: We've been calling them "rel tags" for
exactly that reason: to make it clear.

DS: rel means what, exactly?

TÇ: rel means the relationship between the
current document (or large portion thereof) and
the href that the hyperlink points to. The way it
labels this relationship is in terms of a noun
describing the resource at the href. The best
illustrative example of this is rel="stylesheet"
that's used to indicate that the href over there
is a stylesheet for the current document. Another
great example of this is rel="license", specified
here: http://microformats.org/wiki/rel-license.
rel="license" means this href over here (e.g. a
link to a CC or Apache or GPL license page) is a
license for the current page.

DS: and rel is standard HTML?

TÇ: rel is a standard HTML4 attribute as defined
by the W3C in the HTML4 specification, which
*also* states that authors may use their own rel
values and may define them using a profile.

DS: How does Technorati search Flickr and Delicious? Are "posts" there rss-fed?

TÇ: That's where XMDP (XHTML Meta Data Profiles)
comes in. XMDP is a format for defining such
profiles. See this explanation.
Technorati shows tag results from Flickr and
Delicious using their RSS feeds.

DS: So it's still in the framework, or practice, of RSS-activated search.

TÇ: Yes.
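Tantek's description of rel in the interview above, as a noun naming what the linked resource is to the current page, can be sketched in a few lines. This example groups a page's links by their rel values, covering the stylesheet, license and tag cases he mentions. The markup, the class RelCollector and the function links_by_rel are hypothetical, and the sketch makes no attempt to check values against a profile.

```python
from collections import defaultdict
from html.parser import HTMLParser


class RelCollector(HTMLParser):
    """Group the hrefs of <a> and <link> elements by rel value."""

    def __init__(self):
        super().__init__()
        self.by_rel = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        # One link may carry several space-separated rel values.
        for rel in (attrs.get("rel") or "").lower().split():
            self.by_rel[rel].append(href)


def links_by_rel(html):
    collector = RelCollector()
    collector.feed(html)
    return dict(collector.by_rel)


# Hypothetical page, invented for illustration.
page = """
<html><head><link rel="stylesheet" href="/style.css"></head>
<body>
<a rel="license" href="http://creativecommons.org/licenses/by/2.0/">Some rights reserved</a>
<a rel="tag" href="http://technorati.com/tag/rss">rss</a>
<a href="http://example.com/">no rel here</a>
</body></html>
"""

for rel, hrefs in links_by_rel(page).items():
    print(rel, "->", hrefs)
```

Each key reads as a sentence about the page: the stylesheet for this document is over there, the license for this document is over there, and so on.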

Hugh MacLeod, the marketing iconoclast whose
Gapingvoid cartoons I once described as "Dilbert
for people whose jobs don't suck", has lately
taken an interest in tags as well. He and his
friend Sig have invented a "tree-structure-free
gizmo" called Thingamy, which they say is
"basically a different approach to organising
data, finding data, and transferring knowledge".
Valuing free association and imprecision, they
call it an "anataxonomy".

Of course, it's positioned as an alternative to
tree-like, or any kind of, structures. But, as
Valdis Krebs, a guru of data visualization, said
in comments to one of Hugh's posts, "It is not an
OR problem... hierarchy OR something else -- like
network. It is an AND situation... hierarchy AND
network -- prescribed AND emergent...".

I think it's categorical. That makes it a tree of
a very short sort, perhaps the height of moss.

What matters most is who is coming up with it. As
usual with cool things that happen naturally on
the Web, it's not the big vendors or other usual
suspects. It's individuals, trying to make sense
of the world.

Dollars, of course, will come later.

Doc Searls is Senior Editor of Linux Journal. He
also presides over Doc Searls' IT Garage, which
is published by SSC, the publisher of Linux
Journal.


Comments


Technorati indexes HTML, RSS, and Atom

Niall Kennedy

Services such as Technorati archive every post from every blog with an RSS feed.

Technorati indexes and archives every post from every blog even if that blog does not have a syndication feed. Technorati indexes a blog's HTML with assistance from RSS and Atom feeds. There are many blogs in existence with no associated feeds, or feeds only available as a paid feature. Technorati includes these feedless blogs in its database with each new ping we receive.

What if there's no ping?

Anonymous

How about the blogs that offer RSS or Atom but don't ping Technorati?
