OpenOffice.org: The Limits of Readability and Grammar Extensions
As a professional writer, my software needs are simple. Give me a text editor -- preferably Bluefish, but vim or OpenOffice.org Writer will do -- and I have all I need.
However, judging by the number of aids available for writers, I am obviously in the minority. Novel-plotting databases, daily word counters, character generators -- if you can imagine the software, you can probably find at least one example. I am fascinated by all the ingenuity, but most of the time I conclude that, if you know enough to use any of these tools without them leading you into greater difficulties, you can do without them. The OpenOffice.org extensions Readability Report and LanguageTool are two applications that illustrate my point perfectly.
Readability Report is an extension that reports on how your current document scores on standard tests for comprehension: the Flesch Reading Ease and Flesch-Kincaid Grade Level scores, the Gunning FOG Index, the SMOG Index, and the Automated Readability Index.
These tests differ in details, but all of them look at characteristics such as the number of words per sentence and the number of complex, multi-syllable words to estimate the level of schooling an audience would need to understand a passage. Readability Report also includes its own Weirdness Metric, which gives an average sentence score, as well as a report on the least and most readable sentences in the passage.
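To make the mechanical nature of these tests concrete, here is a minimal Python sketch of two of them, using the published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. It is not the extension's code: the function names, the sentence splitter, and especially the vowel-group syllable counter are my own rough assumptions, and real implementations count syllables more carefully.

```python
import re


def count_syllables(word):
    """Crude assumption: one syllable per run of vowels, minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def readability(text):
    """Return Flesch Reading Ease and Flesch-Kincaid Grade Level for text."""
    # Naive splitting: sentences end at ., ! or ?; words are runs of letters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    return {
        # Higher reading ease means easier text.
        "reading_ease": 206.835 - 1.015 * words_per_sentence
                        - 84.6 * syllables_per_word,
        # Lower grade level means easier text.
        "grade_level": 0.39 * words_per_sentence
                       + 11.8 * syllables_per_word - 15.59,
    }


if __name__ == "__main__":
    scores = readability("The cat sat. The dog ran.")
    print(round(scores["reading_ease"], 2), round(scores["grade_level"], 2))
    # The counter sees only length, not familiarity:
    print(count_syllables("impossibility"), count_syllables("gormless"))
```

Nothing in these formulas knows what a word means or how common it is. The only inputs are counts of sentences, words, and syllables.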
All these tests are available from a top-level menu, either in a brief report for the entire document, or a detailed report that gives test results for each paragraph. The reports open in separate documents, so that you can save or print them.
Aside from the fact that Readability Report is too minor a feature to place in its own top-level menu, nothing is wrong with the extension itself. But the tests it includes are all severely limited tools, suffering from an overly mechanical view of readability.
For one thing, none of the things the tests measure are, in themselves, strong indications of readability. A sentence that is three or four lines long can be highly readable if it is properly punctuated and makes use of basic rhetorical tricks such as parallelism. Similarly, the readability of words depends less on the number of syllables than on how common they are; "impossibility," for instance, should not be lumped in with "duodecahedron" and is more widely understood than a short word like "gormless."
Moreover, the grade-level tests assume that the lower the score, the more readable a document is. In practice, however, readability depends heavily on context and audience. Write about mounting drives, devices, or filesystems for a general audience, and you risk being incomprehensible, but use the same terms with an audience of free software users, and they will probably be understood by everyone.
Because of such limitations in the tests themselves, you could use Readability Report to pare down your word choice and sentence length until it was theoretically readable by a third grader and still not write adequately. Writing is a craft, not an art, so in the end such tests mean very little.
About the most you can say is that Readability Report's detailed analysis can tell you how your readability varies from paragraph to paragraph. I don't know about anyone else, but that seems too minor a benefit when you can get much the same results by constantly reminding yourself to write simply and clearly.
A grammar checker is one of the most requested features in OpenOffice.org, so LanguageTool's attempt to provide one makes it a popular extension. However, LanguageTool is not only incomplete in itself, but also fails to overcome the obstacle that makes every grammar checker I've ever seen inadequate -- the fact that in English, with its weak declensions and conjugations, knowing what part of speech a particular word might be is next to impossible except in context. And, unfortunately, LanguageTool is as blind to most context as Readability Report.
LanguageTool installs a sub-menu in the language section of the Tools menu. It runs with the spell-checker, underlining offending elements in blue if you have automatic checking turned on, or separately from its own sub-menu.
In the sub-menu, you can choose Configuration to see a list of the offenses that LanguageTool watches for. Many of the list items are not so much grammatical as stylistic, such as starting a sentence with a capital letter or avoiding slang. Other items are common typos that could go into AutoCorrect. Yet, even here, LanguageTool is largely style deaf. It ignores, for example, the possibility that you might want to use redundant phrases like "the reason why" or "each and every one" for emphasis -- a widespread and perfectly acceptable habit in a language like English that has Germanic roots.
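A checker that works from a fixed list of phrases behaves roughly like the Python sketch below. This is only an illustration of the general approach, not LanguageTool's actual rules or code; the phrase list and the function name are hypothetical.

```python
import re

# Hypothetical rule list for illustration; not taken from LanguageTool.
REDUNDANT_PHRASES = ["the reason why", "each and every one"]


def flag_redundant(text):
    """Return (phrase, start_offset) for each listed phrase found in text."""
    hits = []
    for phrase in REDUNDANT_PHRASES:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text, re.IGNORECASE):
            hits.append((phrase, match.start()))
    # Report hits in the order they appear in the text.
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Deliberate emphasis is flagged exactly like a careless redundancy.
    print(flag_redundant("I thank each and every one of you."))
```

The rule matches the words and nothing else, so it cannot tell a writer reaching for emphasis from one who is padding.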
However, even within pure grammar, LanguageTool is weak. It catches subject-verb agreement only in specified cases, and, when it catches instances of using the wrong form of the verb "to be," it suggests that you use "be" as an alternative, leading you into an error. Pronoun reference and agreement in number are similarly inconsistent, while other elements, such as faulty parallelism, are not mentioned at all. These lapses make LanguageTool, in its own way, as unreliable a feature as Readability Report.
Tempting, but not there yet
You may think that I am being too hard on these extensions. After all, readability and grammar are complex matters, and programming for them is difficult. Since spell-checkers are not infallible either, why should I be so negative about the effort to provide features that many readers want?
The answer is simple. A spell-checker does serve to catch the more obvious typos, and its limitations are well known. Most people who have spent any time around computers now know that after you run a spell-checker, you need to do additional proofreading.
By contrast, tools like Readability Report and LanguageTool present their findings with an air of objectivity. Users are likely to reason that, if a readability test tells them that their document is clear, or a grammar checker flags an error, the software must be right. Add a precise figure, the way that the readability tests do, and you can easily be seduced by the false sense of precision. The temptation to believe such things must be especially strong if English is not your first language or if you lack confidence in your writing ability.
However, any rule-based effort to improve your writing is going to be wrong a significant part of the time. Readability and grammar tools can be refined, but only to a limited extent. Both have been available in office suites for over twenty years, and neither is anywhere near as reliable as an experienced editor. By now, it seems likely that they never will be until we develop human-level artificial intelligence.
The idea of a shortcut is tempting, which is why such tools are so popular. Sadly, though, none of them is a substitute for skill and personal knowledge -- and certainly not Readability Report or LanguageTool.
Bruce Byfield (nanday)