Convert diff output to colorized HTML

August 27th, 2008 by Mitch Frazier

If you search the web you can find a number of references to programs/scripts that convert diff output to HTML. This is a bash version.

The script expects "unified" diff output (diff -u) on its standard input and produces a self-contained colorized HTML page on its standard output. Consider the two files:

#include <stdio.h>

// main
main()
{
    printf("Hello world\n");
}
and
#include <stdio.h>

main()
{
    printf("Hello World\n");
    printf("Goodbye cruel world\n");
}
Running diff on the two files and piping that through diff2html produces a colorized HTML page:
  $ diff -u v1.c v2.c | diff2html >v1-v2.html
The resulting page is:
--- v1.c	2008-08-27 13:04:40.000000000 -0500
+++ v2.c	2008-08-27 13:04:29.000000000 -0500
@@ -1,8 +1,8 @@
 #include <stdio.h>
 
-// main
 main()
 {
-	printf("Hello world\n");
+	printf("Hello World\n");
+	printf("Goodbye cruel world\n");
 }
 

The script follows:

#!/bin/bash
#
# Convert diff output to colorized HTML.

cat <<XX
<html>
<head>
<title>Colorized Diff</title>
<style>
.diffdiv  { border: solid 1px black;           }
.comment  { color: gray;                       }
.diff     { color: #8A2BE2;                    }
.minus3   { color: blue;                       }
.plus3    { color: maroon;                     }
.at2      { color: lime;                       }
.plus     { color: green; background: #E7E7E7; }
.minus    { color: red;   background: #D7D7D7; }
.only     { color: purple;                     }
</style>
</head>
<body>
<pre>
XX

echo -n '<span class="comment">'

first=1
diffseen=0
lastonly=0

OIFS=$IFS
IFS='
'

# The -r option keeps the backslash from being an escape char.
read -r s

while [[ $? -eq 0 ]]
do
    # Get beginning of line to determine what type
    # of diff line it is.
    t1=${s:0:1}
    t2=${s:0:2}
    t3=${s:0:3}
    t4=${s:0:4}
    t7=${s:0:7}

    # Determine HTML class to use.
    if    [[ "$t7" == 'Only in' ]]; then
        cls='only'
        if [[ $diffseen -eq 0 ]]; then
            diffseen=1
            echo -n '</span>'
        else
            if [[ $lastonly -eq 0 ]]; then
                echo "</div>"
            fi
        fi
        if [[ $lastonly -eq 0 ]]; then
            echo "<div class='diffdiv'>"
        fi
        lastonly=1
    elif [[ "$t4" == 'diff' ]]; then
        cls='diff'
        if [[ $diffseen -eq 0 ]]; then
            diffseen=1
            echo -n '</span>'
        else
            echo "</div>"
        fi
        echo "<div class='diffdiv'>"
        lastonly=0
    elif  [[ "$t3" == '+++'  ]]; then
        cls='plus3'
        lastonly=0
    elif  [[ "$t3" == '---'  ]]; then
        cls='minus3'
        lastonly=0
    elif  [[ "$t2" == '@@'   ]]; then
        cls='at2'
        lastonly=0
    elif  [[ "$t1" == '+'    ]]; then
        cls='plus'
        lastonly=0
    elif  [[ "$t1" == '-'    ]]; then
        cls='minus'
        lastonly=0
    else
        cls=
        lastonly=0
    fi

    # Convert &, <, > to HTML entities.
    s=$(sed -e 's/\&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' <<<"$s")
    if [[ $first -eq 1 ]]; then
        first=0
    else
        echo
    fi

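    # Output the line. The expansions are quoted so that the shell
    # does no globbing or word splitting on the diff text.
    if [[ "$cls" ]]; then
        printf '%s' "<span class=\"${cls}\">${s}</span>"
    else
        printf '%s' "${s}"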
    fi
    read -r s
done
IFS=$OIFS

# If no "diff" or "Only in" line was seen, the comment span is still open.
if [[ $diffseen -eq 0 ]]; then
    echo -n '</span>'
else
    echo "</div>"
fi
echo

cat <<XX
</pre>
</body>
</html>
XX

# vim: tabstop=4: shiftwidth=4: noexpandtab:
# kate: tab-width 4; indent-width 4; replace-tabs false;
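
The two core steps of the loop, picking a CSS class from the start of the line and escaping the HTML metacharacters, can also be pulled out into small functions. What follows is only a sketch of that idea and not part of the script above; the function names diff_class and html_escape are made up for illustration.

```shell
#!/bin/bash
# Sketch only: diff_class and html_escape are hypothetical helper names,
# not part of the original diff2html script.

# Print the CSS class for one raw diff line (prints nothing for context
# lines). Order matters: '+++' and '---' must be tested before '+' and '-'.
diff_class() {
    case "$1" in
        'Only in'*) echo only   ;;
        diff*)      echo diff   ;;
        +++*)       echo plus3  ;;
        ---*)       echo minus3 ;;
        @@*)        echo at2    ;;
        +*)         echo plus   ;;
        -*)         echo minus  ;;
    esac
}

# Escape &, < and > in one line of text.
# '&' has to be replaced first; otherwise the '&' of a freshly
# inserted '&lt;' would itself be turned into '&amp;lt;'.
html_escape() {
    printf '%s\n' "$1" |
        sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Classify the raw line, then escape it for output.
line='+    if (a < b && c > d)'
printf '<span class="%s">%s</span>\n' \
    "$(diff_class "$line")" "$(html_escape "$line")"
```

The class has to be chosen from the raw line, as the script does, although for unified diff output it would make little difference, since none of the tested prefixes contains a character that gets escaped.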

__________________________

Mitch Frazier is an Associate Editor for Linux Journal and the Web Editor for linuxjournal.com.


Thanks Richie

On November 24th, 2008 jim3e8 (not verified) says:

Your script was useful. The IFS should be set to newline though so that tabs are not collapsed.

It is also nice to additionally accept diff input on stdin as in Mitch's original script so one can colorize output from svn/hg/cvs diff via a pipe. Just replace your "diff -u $@" with a variable called $CMD, and set $CMD to "cat" when no args are provided.
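
A minimal sketch of that suggestion, applied to a cut-down version of Richie's colorizer, might look like the following. It is an illustration rather than a drop-in replacement: the function name colorize_diff is made up, only added and removed lines are colored, and an if/else takes the place of the $CMD variable so that file names containing spaces survive.

```shell
#!/bin/bash
# Sketch only: colorize_diff is a hypothetical name, and the coloring is
# reduced to '+' and '-' lines (so the '+++'/'---' headers are colored
# like ordinary added/removed lines).

colorize_diff() {
    local esc
    esc=$(printf '\033')    # the ESC character that starts an ANSI sequence

    # With arguments, run diff on them; without, pass stdin through.
    # IFS= keeps leading blanks and tabs; -r keeps backslashes literal.
    if [ "$#" -gt 0 ]; then
        diff -u "$@"
    else
        cat
    fi | while IFS= read -r s; do
        case "$s" in
            +*) printf '%s[32m%s%s[0m\n' "$esc" "$s" "$esc" ;;
            -*) printf '%s[31m%s%s[0m\n' "$esc" "$s" "$esc" ;;
            *)  printf '%s\n' "$s" ;;
        esac
    done
}
```

With this in place, both "colorize_diff v1.c v2.c" and "svn diff | colorize_diff" should work.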

Interesting post, thanks.

On September 4th, 2008 monitoring (not verified) says:

Interesting post, thanks. Good website.

Or use vim

On September 4th, 2008 Anonymous (not verified) says:

You can also open the diff in Vim, choose your preferred colorscheme, and then use :TOhtml.

Not another one!

On September 3rd, 2008 Richie (not verified) says:

I first thought this was going to be a pure output colorizer for diff. When I read on and realized it was yet another converter to generate HTML from diff output, I was disappointed. "Never mind", I thought, "I'll just write my own." Based heavily on the above program, this produces pretty much the same output, but in the console, using ANSI color control sequences.

#!/bin/bash
#
# colorize diff output for ANSI terminals
# based on "diff2html" 
# (http://www.linuxjournal.com/content/convert-diff-output-colorized-html)

# absolute color definitions
NOCOL="\e[0m"
BOLD="\e[1m"
RED="\e[31m"
GREEN="\e[32m"
PINK="\e[35m"
CYAN="\e[36m"
WHITE="\e[37m"

# style color definitions
C_COMMENT=$WHITE
C_DIFF=$RED
C_OLDFILE=$CYAN
C_NEWFILE=$RED
C_STATS=$GREEN
C_OLD=$BOLD$RED
C_NEW=$BOLD$GREEN
C_ONLY=$PINK

# check args
[[ $# -lt 2 ]] && echo "Usage: $0 file1 file2" && exit 1

# The -r option keeps the backslash from being an escape char.
diff -u "$@" | while read -r s ; do
	# determine line color
	if [[ "${s:0:7}" == 'Only in' ]]; then color=$C_ONLY
	elif  [[ "${s:0:4}" == 'diff' ]]; then color=$C_DIFF
	elif  [[ "${s:0:3}" == '---'  ]]; then color=$C_OLDFILE
	elif  [[ "${s:0:3}" == '+++'  ]]; then color=$C_NEWFILE
	elif  [[ "${s:0:2}" == '@@'   ]]; then color=$C_STATS
	elif  [[ "${s:0:1}" == '+'    ]]; then color=$C_NEW
	elif  [[ "${s:0:1}" == '-'    ]]; then color=$C_OLD
	else color=
	fi

	# Output the line.
	if [[ "$color" ]]; then
		printf "$color"
		echo -n $s
		printf "$NOCOL\n"
	else echo $s
	fi
done

Tinker with the "style color definitions" to change colors to suit your environment.

In the course of messing with this, I made a couple of improvements that could be rolled back into the original version: 1. only one call to "read", and 2. "diff -u" is now called internally.

I tried not changing IFS, and it seemed to work, so I left those lines out, too. The other thing the original could benefit from is properly named CSS classes! (What on earth do "plus3" and "at2" mean?) As you can see, I've given more meaningful names to my ANSI equivalents.
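
Whether leaving IFS alone "works" depends on how much the whitespace matters: the colors come out right either way, but the indentation does not. The sketch below, with made-up function names naive_copy and exact_copy, copies a context line both ways so the difference can be seen.

```shell
#!/bin/bash
# Sketch only: naive_copy and exact_copy are hypothetical names used to
# compare the two ways of reading and printing a line.

# Default IFS and unquoted $s, as in the script above: read strips the
# leading blanks, and the unquoted expansion collapses runs of blanks.
naive_copy() {
    while read -r s; do echo $s; done
}

# Empty IFS for read and a quoted "$s": the line is passed through as is.
exact_copy() {
    while IFS= read -r s; do printf '%s\n' "$s"; done
}

# A context line: one leading space, a tab, then text with three spaces.
printf ' \tint   x;\n' | naive_copy    # prints "int x;"
printf ' \tint   x;\n' | exact_copy    # prints the line unchanged
```

Setting IFS to a newline, as the original script does, has the same effect as the empty IFS here for single lines, because neither read nor the unquoted expansion then splits on spaces or tabs.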

colordiff

On April 22nd, 2009 gliks (not verified) says:

Hey Richie,

You could just use an existing tool, colordiff:

sudo apt-get install colordiff

Great script, though.

Cheers.

tkdiff

On August 27th, 2008 Anonymous (not verified) says:

Perhaps more interactively helpful is a graphical diff viewer such as tkdiff.
