AP Launches Open Source Ascribenation Project

by Doc Searls

What sounds like DRM is really a cool open source journalism tool.

That's my take-away from "Associated Press to build news registry to protect content," a press release that went up on 23 July. After you get past the opening paragraphs, which are pure paranoidese...

NEW YORK – The Associated Press Board of Directors today directed The Associated Press to create a news registry that will tag and track all AP content online to assure compliance with terms of use. The system will register key identifying information about each piece of content that AP distributes as well as the terms of use of that content, and employ a built-in beacon to notify AP about how the content is used.

"What we are building here is a way for good journalism to survive and thrive," said Dean Singleton, chairman of the AP Board of Directors and vice chairman and CEO of MediaNews Group Inc. "The AP news registry will allow our industry to protect its content online, and will assure that we can continue to provide original, independent and authoritative journalism at a time when the world needs it more than ever."

... you get down into the cool stuff:

The registry will employ a microformat for news developed by AP and which was endorsed two weeks ago by the Media Standards Trust, a London-based nonprofit research and development organization that has called on news organizations to adopt consistent news formats for online content. The microformat will essentially encapsulate AP and member content in an informational “wrapper” that includes a digital permissions framework that lets publishers specify how their content is to be used online and which also supplies the critical information needed to track and monitor its usage.

Microformats are open source tools. As the microformats.org About page explains,

Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards. Instead of throwing away what works today, microformats intend to solve simpler problems first by adapting to current behaviors and usage patterns (e.g. XHTML, blogging).

Ars Technica does an excellent job explaining what AP's announcement (plus this diagram) really means:

The "wrappers" and "digital permissions frameworks" that it hopes to implement sound like tough encryption, but they aren't. Instead, the AP is simply relying on a newly-developed microformat called hNews. It's a simple HTML-based tagging scheme for marking up news content and making headlines, author names, and permitted uses machine-readable and search-engine friendly. (See an example.) hNews is funded by major foundations, and all of its tools and specs will be released as open-source software.

In what way does this scheme "wrap" and "protect" the news? It doesn't; it simply marks it up, and adding tags expressing a content creator's wishes on reuse has no bearing on someone's rights under US copyright law. What it does do is provide organizations that use hNews a way to release more rights than are granted under copyright, in essence a sort of "Creative Commons" news license. In fact, hNews' "rights field" uses ccREL, the Creative Commons Rights Expression Language.

The AP's news registry will use hNews to embed some kind of Web beacon in news content as well, making it possible to track some uses of the story across the Web. Users who simply copy and paste parts of the story, or those who retype bits of it to use as quotes, or those who simply strip out the tags will of course not end up being tracked.
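To make "it simply marks it up" concrete, here is a minimal sketch of reading that kind of markup with nothing but Python's standard library. The sample HTML is invented for illustration, and the class names (entry-title, fn, source-org) and the rel value (item-license) are my reading of the hAtom and hNews drafts, so treat them as assumptions rather than a statement of what the AP actually ships.

```python
from html.parser import HTMLParser

# Invented sample; class and rel names are assumptions based on the hNews draft.
SAMPLE = """
<div class="hnews hentry">
  <h1 class="entry-title">Registry to tag news content</h1>
  <span class="author vcard"><span class="fn">Jane Reporter</span></span>
  <span class="source-org">Example Wire Service</span>
  <a rel="item-license" href="http://creativecommons.org/licenses/by/3.0/">Some rights reserved</a>
  <div class="entry-content">Body text goes here.</div>
</div>
"""

# Map microformat class names to the field names we report.
TEXT_FIELDS = {"entry-title": "headline", "fn": "author", "source-org": "source"}


class NewsTagReader(HTMLParser):
    """Collects tagged fields. Assumes well-formed markup with no void elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._open = []  # one entry per open element: a field name or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        rels = (attrs.get("rel") or "").split()
        # The permitted-use statement is just a link with a rel value.
        if "item-license" in rels and attrs.get("href"):
            self.fields["license"] = attrs["href"]
        field = next((TEXT_FIELDS[c] for c in classes if c in TEXT_FIELDS), None)
        self._open.append(field)

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        # Keep text only when the innermost open element is a tagged field.
        if self._open and self._open[-1] and data.strip():
            name = self._open[-1]
            self.fields[name] = (self.fields.get(name, "") + data).strip()


def read_news_tags(html):
    reader = NewsTagReader()
    reader.feed(html)
    return reader.fields


fields = read_news_tags(SAMPLE)
```

Nothing here is encrypted or locked. The same few lines that read the tags could just as easily drop them, which is the point Ars is making.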

Let's go back to the AP release, which bears this subhead: "Registry will provide tools to monitor use of AP and member content online while also enabling new business opportunities." The italics are theirs. The bold-face is mine. Because what's cool here is more than just the fact that the AP has borrowed heavily from open source work, and can contribute to it as well. It's that the AP is laying foundations of a new business model for journalism, starting with what we in the development community have been calling ascribenation:

Ascribenation is the act of ascribing credit to a source or set of sources, for a piece of media work. That work might be a newspaper story, a blog post, or a song mashed from multiple sources or samples. It is derived from the verb ascribe.

Ascribenation is not possible without sources carrying ascribenation metadata that can be concatenated forward as each source's work is included in (or referenced by) other works. There are several means for doing this, including microformats and XDIs.

Ascribenation makes possible crediting sources for work, and getting those sources paid as well.

For example, here's a Houston Chronicle story on climate change lobbying that sources an analysis of climate change lobbying by the Center for Public Integrity. Right now the Chronicle doesn't have any links in that story. But if it did, that link might contain concatenated metadata that could be gathered by the on the reader's device (phone, laptop, whatever). Using , the reader could pay both the Chronicle and the Center for Public Integrity.
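Here is a rough sketch of what "concatenated forward" could mean in practice, using the Chronicle example. Everything in it is hypothetical: the Work record, the ascribe and split_payment helpers, and especially the equal split, which is only an assumption to keep the arithmetic simple. It is not a description of any existing tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Work:
    """A piece of media work plus the works it draws on (hypothetical record)."""
    title: str
    creator: str
    sources: List["Work"] = field(default_factory=list)


def ascribe(work):
    """Return everyone credited by `work`: its creator first, then the
    creators of its sources, walked depth-first, with no duplicates."""
    credited = []

    def walk(w):
        if w.creator not in credited:
            credited.append(w.creator)
        for source in w.sources:
            walk(source)

    walk(work)
    return credited


def split_payment(work, cents):
    """Divide a reader's payment among the credited parties.
    Equal shares are an assumption; any leftover cent goes to the
    creator of the work the reader actually paid for."""
    creators = ascribe(work)
    share, extra = divmod(cents, len(creators))
    payout = {creator: share for creator in creators}
    payout[creators[0]] += extra
    return payout


analysis = Work("Analysis of climate change lobbying", "Center for Public Integrity")
story = Work("Story on climate change lobbying", "Houston Chronicle", [analysis])

credits = ascribe(story)
payout = split_payment(story, 101)
```

The design choice that matters is that each work carries its sources with it, so the credit list for the Chronicle story includes the Center for Public Integrity without anybody having to look it up later.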

The AP has two routes it can take here:

  1. The paranoid route, looking toward their new system as a way to lock up content and enforce compliance.
  2. The engagement route, by which they recognize that they've just helped lay the foundation for the next generation of journalism, and a business model for it. That generation is one in which all journalists and sources get credit for their work throughout the networked world — and where readers, listeners and viewers can easily recognize (and cite) those responsible for the media goods they consume. The business model is one in which anybody consuming media "content" (a word I hate, but there it is) can pay whatever they want for anything they like, on their own terms and not just those of the seller.

Making money is a key motivation for the AP (and, of course, for its 1500 member newspapers). On the same day the AP's press release went out, it ran "AP setting up tracking system for Web content," by Michael Liedtke, an AP Business Writer. It begins,

The Associated Press is moving ahead with plans for a system to detect unlicensed use of its content and potentially create new ways for the 163-year-old news cooperative and other media to make more money on the Internet.

As part of a strategy approved Thursday by the AP's board, the cooperative will start by bundling its text stories in an "informational wrapper" that will include a built-in beacon to monitor where stories go on the Internet.

The beacon is meant to be a policing device aimed at deterring Web sites from posting AP content without paying licensing fees. The AP and its member newspapers contend unlicensed use of their material is costing them tens of millions of dollars in potential ad revenue.

Liedtke adds,

Just the word "beacon" could raise alarms, partly because Facebook Inc. used the same term to describe a 2007 program that automatically monitored its users' activities at other Web sites. After an uproar, Facebook decided to leave it up to its users to decide whether to turn its beacon on.

The AP will be able to determine what's being read on individual computers, but AP executives stressed the monitoring system won't collect personal information. Cookies — computer coding planted into Web browsers to determine their users' interests — won't be part of the AP's tracking system, Seagrave said.

"We want to know where stories are going and what is being read, not who is reading it," Seagrave said.

Right. But who besides the AP trusts what they're saying here? Once all this reading/tracking/crediting stuff starts getting exposed up and down the value chains, other journalists -- as well as readers -- have an incentive to avoid stories contaminated with AP's tracking beacons. Liedtke continues,

It could even be enough to discourage people from reading AP stories and instead lead them to other news sources, said John Palfrey, a law professor and co-director of Harvard's Berkman Center for Internet & Society.

"If people think that there's a greater likelihood that on an AP story, people could track down what they are reading, they are less likely to make the choice of that particular story rather than another story," Palfrey said. "This seems to me ... a potential third rail."

In other words, the beacon risks subtracting value rather than adding it. That's why AP needs to have its lawyers grab that third rail. Journalists and business folk can stand back.

With their lawyers out of the way, maybe the AP can see that what they have here are means for bringing in more money, but through attraction rather than coercion. To attract, the AP needs to open up and engage, not to send out beacons, put up walls, and sue people.

Alas, engagement isn't where they're coming from — yet. Adds Liedtke,

The AP board of directors argues something has to be done to protect its content because the cooperative's revenue is falling for the first time in years. Revenue is expected to be around $700 million this year, down from $748 million in 2008, in part because of reductions in the fees it charges newspapers and broadcasters, whose advertising revenue has been shriveling as more marketers shift to less expensive options online.

Advertising will continue to shrink. There are too many other ways for supply and demand to find each other and engage. Those ways will only increase in number and efficiency. Rather than fight the new systems that are only beginning to get built, the AP and its member newspapers should embrace those systems. Though I doubt they knew it, that's exactly what they did by adopting microformats and other open source tools. (Well, probably their engineers knew. Yoz Grahame has a post visiting that.)

The real challenge for the AP isn't to "protect its content." It's to make that content more valuable to more people and in more ways. It's to help create the 21st century ecosystem for journalism, rather than to protect its 19th century model. (The AP was founded in 1846.) A lot of us would like to help the AP, along with other journalistic organizations. But we can't do it through legal departments. We can't do it through CEOs and spokespeople either. We need to do it on a geek-to-geek level. Our geeks need to be talking to their geeks.

If anybody from the AP is reading this, let's talk.

Meanwhile, a small suggestion to the AP: Get your legal department's footprints off your home page (where the top item is "Protecting AP's Intellectual Property"). In fact, push them to the nether regions of the website. It's not friendly stuff.

One more: start linking. AP stories are typically bare and linkless. If you want others to link to your stuff, start linking to theirs. In "AP proposes new article formatting for the Web," Andrew Vanacore writes, "The Associated Press is proposing that publishers attach descriptive tags to news articles online in hopes of taming the free-for-all of news and information on the Web and generating more traffic for established media brands." In a subsequent paragraph, he writes, "'As things stand, an awful lot of information on a news article is completely invisible,' said Martin Moore, director of Media Standards Trust, which jointly developed the new rules with the AP. 'A search engine is not able to tell a byline from someone who is referred to in an article.'" Sure, but there's no link to anything explaining tags (here's Wikipedia's), nor is there one to .

Oh, and quit trying to tame that free-for-all. That's how the Net and the Web got here, and it's where you need to place your next bets. Let us help.

Otherwise you'll just see more of this. (You probably will anyway, but it'll be easier to laugh.)

[Later...] has the best summary I've seen yet of What's Going On Here. He concludes,

So we have on the floor a proposal. From the perspective of building a better Internet, it's a good idea. From the perspective of stopping bad people from stealing, it's utterly ineffective. We should understand what it really does, and adopt it for what it really is, and drop the silly posturing about how it's going to make all our financial troubles vanish. Because it's not that, not at all. What it is, is a good thing.

Like I said.
