Testing Godwin's Law

Last week I came across Godwin's Law. Many of you may already be familiar with it. For those who aren't, Godwin's Law, according to Wikipedia, states:

As an online discussion grows longer, the probability of a comparison involving Nazis or Hitler approaches one.

To test this law, I wrote a little program to look at the comments in the Drupal database that holds the Linux Journal web content. The script first gets all the distinct article IDs (nids in Drupal speak) from the comments table where the comment includes one of the keywords. Then it counts the comments for each of those articles, sorts the list, and outputs each count along with the article title.

Here's the script:

#!/bin/bash

source mysqlpwd.sh

echo "<table border=\"1\">"
echo "<thead>"
echo "<tr><th># Comments</th><th>Article Title</th></tr>"
echo "</thead>"
echo "<tbody>"
# -N (--skip-column-names) drops the header row from each result, so the
# output no longer needs to be filtered for the column name.
mysql -N -u "$MYSQL_USERNAME" -p"$MYSQL_PASSWORD" drupal_lj \
    -e "SELECT DISTINCT nid FROM comments WHERE comment LIKE '%hitler%' OR comment LIKE '%nazi%'" |
    while read -r nid
    do
        # Total number of comments on this article.
        count=$(mysql -N -u "$MYSQL_USERNAME" -p"$MYSQL_PASSWORD" drupal_lj \
            -e "SELECT COUNT(*) FROM comments WHERE nid = $nid")
        printf "%d\t%d\n" "$count" "$nid"
    done | sort --numeric-sort --reverse |
        while read -r count nid
        do
            # Look up the title for the table row.
            t=$(mysql -N -u "$MYSQL_USERNAME" -p"$MYSQL_PASSWORD" drupal_lj \
                -e "SELECT title FROM node WHERE nid = $nid")
            printf "<tr><td>%d</td><td><a href=\"/node/%d\">%s</a></td></tr>\n" "$count" "$nid" "$t"
        done
echo "</tbody>"
echo "</table>"
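
The loop above runs one extra query per article for the count and another for the title. The same list could probably come from a single query. What follows is only a sketch, not the script that produced the output below: it assumes the same comments and node tables and the nid, comment, and title columns used above, and keyword_query is a made-up helper name.

```shell
#!/bin/bash
# Sketch only: prints one SQL statement that lists every article with at
# least one comment mentioning a keyword, along with its total comment count.
# Table and column names come from the script above; keyword_query is a
# hypothetical helper name.
keyword_query() {
    cat <<'SQL'
SELECT COUNT(*) AS cnt, n.nid, n.title
FROM node n
JOIN comments c ON c.nid = n.nid
GROUP BY n.nid, n.title
HAVING SUM(c.comment LIKE '%hitler%' OR c.comment LIKE '%nazi%') > 0
ORDER BY cnt DESC;
SQL
}

# Print the statement so it can be inspected before it is run.
keyword_query
```

To run it for real, the statement could be handed to the client with something like mysql -N -u "$MYSQL_USERNAME" -p"$MYSQL_PASSWORD" drupal_lj -e "$(keyword_query)". Summing a LIKE expression relies on MySQL treating a true comparison as 1, so this form is MySQL-specific.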


Here's the output:

# Comments  Article Title
259         A five year deal with Microsoft to dump Novell/SUSE
205         Penguins for President?
192         My Visit to SCO
178         Saving the Net
165         Obsolete Microkernel Dooms Mac OS X to Lag Linux in Performance
123         Software Freedom for Macedonia?
59          Why I Don't Use the GPL
58          What Can't Open Source Achieve in the Next 10 Years?
56          A Penguin Angle on the Ox: Day One at Macworld
55          Miguel de Icaza plays fast and loose with the facts and history
37          Linux in Government: Linux Desktop Reviews, Part 6 - Ubuntu
32          Looking for Answers
31          Bit Prepared: A Missing Link?
13          Scientology Lawyer Promises to Continue "Appropriate Action"
12          Hey Microsoft, Sue Me First
7           Linux in Government: Understanding Federated Identity Management
5           Open Source Radio


For comparison, we need a list of articles that have a large number of comments, none of which contain the keywords. This time the script looks at every article and checks whether any of its comments contain the keywords. If none do, it counts the article's comments and outputs the article if the discussion was long. I arbitrarily classified a discussion as long if it had more than 100 comments.

Here's the script:

#!/bin/bash

source mysqlpwd.sh

echo "<table border=\"1\">"
echo "<thead>"
echo "<tr><th># Comments</th><th>Article Title</th></tr>"
echo "</thead>"
echo "<tbody>"
# -N (--skip-column-names) drops the header row from each result, so the
# output no longer needs to be filtered for the column name.
mysql -N -u "$MYSQL_USERNAME" -p"$MYSQL_PASSWORD" drupal_lj \
    -e "SELECT nid FROM node" |
    while read -r nid
    do
        # Number of comments on this article that mention a keyword.
        matches=$(mysql -N -u "$MYSQL_USERNAME" -p"$MYSQL_PASSWORD" drupal_lj \
                -e "SELECT COUNT(*) FROM comments WHERE nid = $nid AND
                    (comment LIKE '%hitler%' OR comment LIKE '%nazi%')")
        if [[ $matches -eq 0 ]]; then
            # Total number of comments on this article.
            count=$(mysql -N -u "$MYSQL_USERNAME" -p"$MYSQL_PASSWORD" drupal_lj \
                -e "SELECT COUNT(*) FROM comments WHERE nid = $nid")
            # Keep only long discussions: more than 100 comments.
            if [[ $count -gt 100 ]]; then
                printf "%d\t%d\n" "$count" "$nid"
            fi
        fi
    done | sort --numeric-sort --reverse |
        while read -r count nid
        do
            # Look up the title for the table row.
            t=$(mysql -N -u "$MYSQL_USERNAME" -p"$MYSQL_PASSWORD" drupal_lj \
                -e "SELECT title FROM node WHERE nid = $nid")
            printf "<tr><td>%d</td><td><a href=\"/node/%d\">%s</a></td></tr>\n" "$count" "$nid" "$t"
        done
echo "</tbody>"
echo "</table>"
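
This script queries the database once or twice for every article on the site, which adds up. A single grouped query could probably do the same filtering. Again, this is a sketch under the same assumptions about the comments and node tables, not the script that produced the output below; long_clean_query and its min_comments argument are made-up names.

```shell
#!/bin/bash
# Sketch only: prints one SQL statement that lists discussions with more
# than min_comments comments and no comment mentioning a keyword.
# Table and column names come from the script above; long_clean_query and
# min_comments are hypothetical names.
long_clean_query() {
    local min_comments=${1:-100}
    # Only accept a plain non-negative integer as the threshold.
    if ! [[ $min_comments =~ ^[0-9]+$ ]]; then
        echo "threshold must be a non-negative integer" >&2
        return 1
    fi
    cat <<SQL
SELECT COUNT(*) AS cnt, n.nid, n.title
FROM node n
JOIN comments c ON c.nid = n.nid
GROUP BY n.nid, n.title
HAVING COUNT(*) > ${min_comments}
   AND COALESCE(SUM(c.comment LIKE '%hitler%' OR c.comment LIKE '%nazi%'), 0) = 0
ORDER BY cnt DESC;
SQL
}

# Print the statement for the cutoff used in the script above.
long_clean_query 100
```

As with the first script, the statement could be passed to the mysql client with -e "$(long_clean_query 100)". Keeping the threshold as an argument makes it easy to see how sensitive the list is to the arbitrary cutoff.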


Here's the output:

# Comments  Article Title
423         Why Python?
244         Getting a Windows Refund in California Small Claims Court
222         Perceptions of the Linux OS Among Undergraduate System Administrators
192         GNU/Linux DVD Player Review
160         The Toshiba Standoff
153         What Application Do You Want Ported to Linux?
127         SCO to Reveal Allegedly Copied Code
127         Linux from Kindergarten to High School
124         Red Hat 7.3 beta: A Product Review
108         Gentoo for All the Unusual Reasons
105         Boot with GRUB
101         The Great Software Schism


Can we conclude anything about the validity of Godwin's Law from this? Ahhh, I dunno.
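
For what it's worth, the two lists can at least be tallied against each other. Here is a rough sketch that uses only the comment counts shown in the two outputs above and the second script's cutoff of more than 100 comments; the array and function names are made up for illustration.

```shell
#!/bin/bash
# Comment counts copied from the two output listings above.
with_keywords=(259 205 192 178 165 123 59 58 56 55 37 32 31 13 12 7 5)
without_keywords=(423 244 222 192 160 153 127 127 124 108 105 101)

# Print how many of the given counts exceed the threshold.
# Usage: count_over THRESHOLD COUNT...
count_over() {
    local threshold=$1 n=0 c
    shift
    for c in "$@"; do
        if (( c > threshold )); then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

long_with=$(count_over 100 "${with_keywords[@]}")
long_without=$(count_over 100 "${without_keywords[@]}")
total=$((long_with + long_without))

echo "long discussions with a keyword:    $long_with"
echo "long discussions without a keyword: $long_without"
# Integer arithmetic, so the percentage is rounded down.
echo "share with a keyword: $((100 * long_with / total))%"
```

On these numbers that comes to 6 long discussions with a keyword and 12 without, or about 33%. That is a small sample with an arbitrary cutoff, so it doesn't settle much either way.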

p.s. File this under blather #2.