Andamooka: Open Support for Open Content
A little over a year ago I began writing a book about developing KDE applications. The KDE project was introducing many new APIs and subsystems into the development code that would eventually become KDE 2.0, and developers would need documentation to keep pace with all of this new technology. But how could a book, which needs to be written, edited, reviewed and printed (with some of these steps occurring several times), be made available before the software about which it is written changes? Half of the solution was to enlist as contributing authors several talented developers who were designing and building the new technologies and to keep us all working in parallel from the then-current code base (the “bleeding edge” code). The other half, use of the Open Publication License, has led to the development of Andamooka and the concept of open support.
The goal of Andamooka (a web-based reader support system, http://www.andamooka.org/) is to keep a book current and correct through the use of coarse-grained community annotation of the on-line text. Annotation is public and is itself open content, and so can be redistributed along with the original text in, ideally, more accurate and up-to-date annotated versions of the text.
Open content, using the term generally, is any content (typically textual, but not necessarily excluding sound, images and other types of “content” you might wish to consider) that can be freely modified and redistributed. If this reminds you of open-source software, it should, because that was its inspiration.
Reasons for attempting to apply the open-source software development model to open content are described in “Open Source Content Development” by David Wiley (opencontent.org/bazaar.shtml), “Why You Should Use the GNU FDL” by Eric Raymond (www.gnu.org/philosophy/why-gfdl.html) and “Do Open-Source Books Work?” by Benjamin Crowell (www.lightandmatter.com/article/article.html), but allow me to try to summarize. First of all, the open-source software development model:
Puts software in the hands of, more or less, whoever wants it because the software is free.
Creates more niche software because open-source software may be freely modified.
Produces more robust software because the software gets tested by many different users in many different situations.
Produces software with features that are in demand because lots of feedback is received from users and, often enough, users contribute by doing some development.
These four items can potentially translate well into open-content development.
Writers can reach more readers by giving their work away—and, indeed, writers want to be read. In the special case where the content is documentation for open-source software, it becomes imperative that the software documentation allow itself to be modified and redistributed so that new documentation can be derived from old documentation when new software is derived from old software.
We can consider content to be robust if it is accurate and comprehensible. The levels of accuracy and comprehensibility of content increase when reader feedback is incorporated into new versions. In other words, an open-content development model presents the image of many readers-as-editors, each focusing on the aspects of the work that interest them.
Finally, if there are many readers of a work, there will be lots of feedback that points out what is missing. This gives the author (or authors) the information they need to plan new versions of the work. If readers are welcome to participate in the authoring, they might write and submit corrected or new portions of the work, thus distributing the workload and potentially creating something better than one person (or a small group of people) could have.
Open content licenses include the Open Publication License (www.opencontent.org/openpub), the GNU Free Documentation License (www.gnu.org/copyleft/fdl.html) and the Linux Documentation Project boilerplate (www.linuxdoc.org/manifesto.html). Each of these allows the work to which it is applied to be modified and redistributed. The first of these has provisions for placing restrictions on the distribution of the work in printed format. Such a restriction could help a publishing company recoup the cost of creating the book (a process that involves many person-hours of labor and, of course, a lot of paper). I personally recommend that companies that include this restriction when a book is first published remove it later, for example, after development and other costs are recouped or after a predetermined period of time has elapsed. Allowing the creation of paper versions of a work (i.e., excluding the optional restriction) might help it to reach people in markets where it would not normally be distributed.
It should be clear by now that open-content licensing has the potential to produce high-quality content. Today there are already many open-content books available on the Web, such as Grokking the GIMP by Carey Bunks; GTK+/GNOME Application Development by Havoc Pennington; DocBook: The Definitive Guide by Norman Walsh and Leonard Muellner; and, of course, KDE 2.0 Development by David Sweet, et al. But authors could do much more than simply post the text of their books if only the proper tools were available.
The Andamooka web site offers some of these tools, for free, to authors. Authors can post their book for on-line reading and annotating and for download in various formats. They can run announcements, news, questions, etc., and offer other support materials, such as source code or related documents, to users. Using these tools, an author can offer support to his/her readers by interacting directly with them and by providing a forum in which readers with a common interest (after all, they're all reading the same book!) can interact. This situation differs from a less-structured public forum (like a Usenet newsgroup, for example) in that the book can facilitate interaction by providing the readers with a common language in which to discuss the subject as well as a framework around which to organize their discussion.
The heart of Andamooka is annotation. The full text of each book is displayed with annotation at the end of each section. This is how the structure just discussed is realized. Annotation may be added and viewed by anyone visiting the site. The annotation is submitted under the Open Publication License so that it may be redistributed along with the book. Readers can read an annotated version of the book on-line or download it to read on their computer or to print out at home.
When collecting a large amount of information, as the community annotation for a book might become, one has to consider ways to maintain and encourage quality. Andamooka has an experimental moderation and points (or Karma, à la Slashdot) system based on the SlashCode system. A score is assigned to each piece of annotation based on random, occasional polling of readers. When polled, a reader can choose one of three ratings (let's call the rating r): good (r=3), neutral (r=2) or bad (r=1). For the purpose of ranking the annotations for viewing, attempting to put the “best” at the top, a score (call it S) is assigned to each annotation. If we call N the number of times an annotation has been rated, then the score is: S=SQRT(N)*AVG(r)/STDDEV(r). In plain English, this means that an annotation is rated higher if:
More people have rated (and, presumably, read) it (SQRT(N))
People have, on average, given it a better rating (AVG(r))
People are in greater agreement about the quality of the annotation (1/STDDEV(r))
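The scoring rule above can be sketched in a few lines of Python. This is only an illustration of the formula as stated, not Andamooka's actual code. The function name annotation_score, the choice of the population standard deviation, and the min_stddev floor are all assumptions: the formula as written is undefined when every reader gives the same rating (STDDEV(r) is zero), so some such guard is needed, and the value 0.5 is arbitrary.

```python
import math
import statistics

# Ratings a polled reader may choose, as given in the text.
GOOD, NEUTRAL, BAD = 3, 2, 1


def annotation_score(ratings, min_stddev=0.5):
    """Return S = sqrt(N) * avg(r) / stddev(r) for a list of ratings.

    min_stddev is a hypothetical floor (not from the article) that keeps
    the score finite when all raters agree and the spread is zero.
    """
    n = len(ratings)
    if n == 0:
        return 0.0  # assumption: an unrated annotation scores zero
    avg = statistics.fmean(ratings)
    # Assumption: population standard deviation; the article does not say
    # whether the population or the sample form is intended.
    spread = max(statistics.pstdev(ratings), min_stddev)
    return math.sqrt(n) * avg / spread


# Hypothetical annotations and the ratings each has received so far.
annotations = {
    "a": [GOOD, GOOD, NEUTRAL, GOOD],  # well liked, mostly in agreement
    "b": [GOOD, BAD, GOOD, BAD],       # readers disagree sharply
    "c": [NEUTRAL, NEUTRAL],           # few ratings, full agreement
}

# Sort annotation ids so the highest-scoring one comes first.
ranked = sorted(
    annotations,
    key=lambda name: annotation_score(annotations[name]),
    reverse=True,
)
print(ranked)
```

Note how strongly the ranking depends on the floor: with a very small min_stddev, a handful of identical ratings would outrank a widely read annotation, so the value chosen is a design decision rather than a detail.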
To encourage quality submissions, users are given points (or “Karma”) based on how the annotation they write is rated by the community and, possibly, on how much they contribute and how recently they have contributed. These features have yet to be fully implemented.
The hope is that this system will serve as a form of community editing, offering a way for readers to sort out the best annotation. As Andamooka is in its infancy, I cannot say whether this system is successful or not.
The next step is the creation of an open-development system. The model, as it is discussed today, derives from open-source software development (as mentioned above) but has some different issues to contend with. Those interested in open development would like to create a book, or other work, by having many contributors plan, write, edit, format, etc., in hopes that the workload will be distributed, the work will be of high quality and everyone will enjoy what they are doing (since it is volunteer work). Similar goals have been achieved in the open-source software world, so what's different about open content?
Authors may not be as technically savvy as software developers (note that by “authors” I don't necessarily mean authors of technical books). So, unlike software developers, they can't necessarily make use of software-project management tools (for example, CVS) that are difficult to install, maintain and/or use to develop a book, however capable those tools might be. Thus, there is a need for user-friendly project management and collaboration tools. The TWiki web-based application (http://twiki.sourceforge.net/) and its Wiki Wiki kin are good examples of these kinds of tools, but they need to be expanded upon to be appropriate for open-content development. The members (myself included) of the FreeBooks Project (http://freebooks.myip.org/), founded by Benjamin Crowell, are exploring designs for the needed tools and methods for managing this type of project (for example, we are testing a method of incorporating DocBook document collaboration into TWiki). In fact, this exploration process has been organized around writing a free book about writing free books. (You are encouraged to participate. Please visit the web site for details.)
One also has to wonder whether an open, community-development model for content will attract contributors the way similar software projects do. So far, several people are contributing to the FreeBooks Project, and translations of KDE 2.0 Development into five other languages are underway. These numbers may not be as impressive as those used when discussing open-source software, but these projects have only just begun, and I consider the level of participation to be very encouraging.
Andamooka is a first step toward full open-content development. The books we see on the site now, and probably those we'll see in the near future, were written in a closed manner but are being made available under open licenses that allow modification and redistribution. By making use of Andamooka, authors can offer readers direct author support and the support of the rest of the readership. Together the authors and readers can create a base of knowledge that can be culled and transformed, by any community participants, into a better and better book.
I plan to continue work on KDE 2.0 Development using the Andamooka system. I will encourage users to offer feedback and contribute to development by writing, editing, translating, reporting errors and suggesting new topics. This work will be released later in free, electronic editions of the book. As such, KDE 2.0 Development serves as a test case for open-content development.
David Sweet (www.andamooka.org/~dsweet) received his PhD in Physics/Chaos Theory from the University of Maryland. His focus has since shifted from the chaos of Christmas ornaments to the sheer noise of the securities markets. His work has appeared in Nature, Linux Journal, Physical Review Letters and in a few bookstores.