Convert diff output to colorized HTML


If you search the web you can find a number of references to programs/scripts that convert diff output to HTML. This is a bash version.

The script expects "unified" diff output (diff -u) on its standard input and produces a self-contained colorized HTML page on its standard output. Consider the two files:

#include <stdio.h>

// main
main()
{
    printf("Hello world\n");
}
and
#include <stdio.h>

main()
{
    printf("Hello World\n");
    printf("Goodbye cruel world\n");
}
Running diff on the two files and piping that through diff2html produces a colorized HTML page:
  $ diff -u v1.c v2.c | diff2html >v1-v2.html
The resulting page is:
--- v1.c	2008-08-27 13:04:40.000000000 -0500
+++ v2.c	2008-08-27 13:04:29.000000000 -0500
@@ -1,8 +1,8 @@
 #include <stdio.h>
 
-// main
 main()
 {
-	printf("Hello world\n");
+	printf("Hello World\n");
+	printf("Goodbye cruel world\n");
 }
 

The script follows:

#!/bin/bash
#
# Convert diff output to colorized HTML.

cat <<XX
<html>
<head>
<title>Colorized Diff</title>
<style>
.diffdiv  { border: solid 1px black;           }
.comment  { color: gray;                       }
.diff     { color: #8A2BE2;                    }
.minus3   { color: blue;                       }
.plus3    { color: maroon;                     }
.at2      { color: lime;                       }
.plus     { color: green; background: #E7E7E7; }
.minus    { color: red;   background: #D7D7D7; }
.only     { color: purple;                     }
</style>
</head>
<body>
<pre>
XX

echo -n '<span class="comment">'

first=1
diffseen=0
lastonly=0

OIFS=$IFS
IFS='
'

# The -r option keeps the backslash from being an escape char.
read -r s

while [[ $? -eq 0 ]]
do
    # Get beginning of line to determine what type
    # of diff line it is.
    t1=${s:0:1}
    t2=${s:0:2}
    t3=${s:0:3}
    t4=${s:0:4}
    t7=${s:0:7}

    # Determine HTML class to use.
    if    [[ "$t7" == 'Only in' ]]; then
        cls='only'
        if [[ $diffseen -eq 0 ]]; then
            diffseen=1
            echo -n '</span>'
        else
            if [[ $lastonly -eq 0 ]]; then
                echo "</div>"
            fi
        fi
        if [[ $lastonly -eq 0 ]]; then
            echo "<div class='diffdiv'>"
        fi
        lastonly=1
    elif [[ "$t4" == 'diff' ]]; then
        cls='diff'
        if [[ $diffseen -eq 0 ]]; then
            diffseen=1
            echo -n '</span>'
        else
            echo "</div>"
        fi
        echo "<div class='diffdiv'>"
        lastonly=0
    elif  [[ "$t3" == '+++'  ]]; then
        cls='plus3'
        lastonly=0
    elif  [[ "$t3" == '---'  ]]; then
        cls='minus3'
        lastonly=0
    elif  [[ "$t2" == '@@'   ]]; then
        cls='at2'
        lastonly=0
    elif  [[ "$t1" == '+'    ]]; then
        cls='plus'
        lastonly=0
    elif  [[ "$t1" == '-'    ]]; then
        cls='minus'
        lastonly=0
    else
        cls=
        lastonly=0
    fi

    # Convert &, <, > to HTML entities.
    s=$(sed -e 's/\&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' <<<"$s")
    if [[ $first -eq 1 ]]; then
        first=0
    else
        echo
    fi

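    # Output the line.  Quoting "$s" keeps the shell from expanding
    # glob characters such as * that appear in the diff text.
    if [[ "$cls" ]]; then
        echo -n '<span class="'${cls}'">'"${s}"'</span>'
    else
        echo -n "${s}"
    fi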
    read -r s
done
IFS=$OIFS

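# diffseen is set for both "diff" and "Only in" lines, so it is the
# only flag needed to tell whether the comment span is still open.
if [[ $diffseen -eq 0 ]]; then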
    echo -n '</span>'
else
    echo "</div>"
fi
echo

cat <<XX
</pre>
</body>
</html>
XX

# vim: tabstop=4: shiftwidth=4: noexpandtab:
# kate: tab-width 4; indent-width 4; replace-tabs false;
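
The two per-line steps in the loop above, picking a CSS class and escaping HTML, can also be sketched as a pair of functions. This is only a sketch under a few assumptions: the names classify_line and render_lines are made up for illustration, the div and comment-span bookkeeping of the full script is left out, and the escaping is moved in front of the loop so that a single sed process handles the whole stream. Escaping first is safe here because none of the prefixes being tested contain &, < or >.

```shell
#!/bin/bash
# Sketch only: classify_line and render_lines are hypothetical names,
# not part of the script above.

# Set the global "cls" to the CSS class for one diff line.
# cls is left empty for context lines.
classify_line() {
    case "$1" in
        'Only in'*) cls=only   ;;
        diff*)      cls=diff   ;;
        '+++'*)     cls=plus3  ;;
        '---'*)     cls=minus3 ;;
        '@@'*)      cls=at2    ;;
        '+'*)       cls=plus   ;;
        '-'*)       cls=minus  ;;
        *)          cls=       ;;
    esac
}

# Read diff text on stdin and write one escaped line per input line,
# wrapped in a span when the line has a class.
render_lines() {
    local s
    # One sed process converts &, <, > to HTML entities for every line.
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' |
    while IFS= read -r s; do
        classify_line "$s"
        if [[ -n $cls ]]; then
            printf '<span class="%s">%s</span>\n' "$cls" "$s"
        else
            printf '%s\n' "$s"
        fi
    done
}
```

With these defined, something like diff -u v1.c v2.c | render_lines prints the body lines only; the HTML header, the divs and the footer would still have to be written around it as in the full script.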

______________________

Mitch Frazier is an Associate Editor for Linux Journal.

Comments


Thanks Richie

jim3e8's picture

Your script was useful. The IFS should be set to newline though so that tabs are not collapsed.

It is also nice to additionally accept diff input on stdin as in Mitch's original script so one can colorize output from svn/hg/cvs diff via a pipe. Just replace your "diff -u $@" with a variable called $CMD, and set $CMD to "cat" when no args are provided.
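
A minimal sketch of both suggestions, assuming they are applied to Richie's ANSI script further down. The helper names diff_source and copy_lines are made up for illustration, and a function stands in for the $CMD variable. The read loop uses an empty IFS and quotes the variable on output, which is another way of keeping tabs and leading blanks intact.

```shell
#!/bin/bash
# Sketch only: diff_source and copy_lines are hypothetical helper names.

# With arguments, run diff -u on them; with none, pass stdin through,
# so output from svn/hg/cvs diff can be piped in.
diff_source() {
    if [[ $# -gt 0 ]]; then
        diff -u "$@"
    else
        cat
    fi
}

# Copy stdin to stdout line by line without altering whitespace.
# IFS= keeps leading blanks, -r keeps backslashes, and quoting "$s"
# stops runs of tabs and spaces from collapsing.
copy_lines() {
    local s
    while IFS= read -r s; do
        printf '%s\n' "$s"
    done
}
```

In the script itself the loop body would pick a color instead of just copying, for example diff_source "$@" | while IFS= read -r s; do ...; done.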


Or use vim

Anonymous's picture

You can also open the diff in Vim, choose your preferred colorscheme, and then use :TOhtml.

Not another one!

Richie's picture

I first thought this was going to be a pure output colorizer for diff. When I read on and realized it was yet another converter to generate HTML from diff output, I was disappointed. "Never mind", I thought, "I'll just write my own." Based heavily on the above program this produces pretty much the same output, but in the console, using ANSI color control sequences.

#!/bin/bash
#
# colorize diff output for ANSI terminals
# based on "diff2html" 
# (http://www.linuxjournal.com/content/convert-diff-output-colorized-html)

# absolute color definitions
NOCOL="\e[0m"
BOLD="\e[1m"
RED="\e[31m"
GREEN="\e[32m"
PINK="\e[35m"
CYAN="\e[36m"
WHITE="\e[37m"

# style color definitions
C_COMMENT=$WHITE
C_DIFF=$RED
C_OLDFILE=$CYAN
C_NEWFILE=$RED
C_STATS=$GREEN
C_OLD=$BOLD$RED
C_NEW=$BOLD$GREEN
C_ONLY=$PINK

# check args
[[ $# -lt 2 ]] && echo "Usage: $0 FILE1 FILE2" && exit 1

# The -r option keeps the backslash from being an escape char.
diff -u "$@" | while read -r s ; do
	# determine line color
	if [[ "${s:0:7}" == 'Only in' ]]; then color=$C_ONLY
	elif  [[ "${s:0:4}" == 'diff' ]]; then color=$C_DIFF
	elif  [[ "${s:0:3}" == '---'  ]]; then color=$C_OLDFILE
	elif  [[ "${s:0:3}" == '+++'  ]]; then color=$C_NEWFILE
	elif  [[ "${s:0:2}" == '@@'   ]]; then color=$C_STATS
	elif  [[ "${s:0:1}" == '+'    ]]; then color=$C_NEW
	elif  [[ "${s:0:1}" == '-'    ]]; then color=$C_OLD
	else color=
	fi

	# Output the line.
	if [[ "$color" ]]; then
		printf "$color"
		echo -n $s
		printf "$NOCOL\n"
	else echo $s
	fi
done

Tinker with the "style color definitions" to change colors to suit your environment.
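
When trying out new colors, it can help to have the output step in one place. The following is a sketch, not part of the script above: print_colored is a made-up helper name, and it assumes bash's printf, whose %b conversion expands the \e escapes stored in the color variables. Because the text itself goes through %s, runs of spaces and tabs are printed as they are.

```shell
#!/bin/bash
# Sketch only: print_colored is a hypothetical helper.
NOCOL="\e[0m"
RED="\e[31m"

# Print one line of text in the given color, or plain if no color is given.
# %b expands the escape sequences in the color variables;
# %s prints the text verbatim, whitespace included.
print_colored() {
    local color=$1 text=$2
    if [[ -n $color ]]; then
        printf '%b%s%b\n' "$color" "$text" "$NOCOL"
    else
        printf '%s\n' "$text"
    fi
}
```

For example, print_colored "$RED" "-	old line" prints the line in red with its tab preserved.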

In the course of messing with this, I made a couple of improvements that could be rolled back into the original version: 1. only one call to "read", and 2. "diff -u" is now called internally.

I tried not changing IFS, and it seemed to work, so I left those lines out, too. The other thing the original could benefit from is properly named CSS classes! (What on earth do "plus3" and "at2" mean?) As you can see, I've given more meaningful names to my ANSI equivalents.

colordiff

gliks's picture

Hey Richie,

you could just use an existing tool, colordiff:

sudo apt-get install colordiff

great script though.

Cheers.

tkdiff

Anonymous's picture

Perhaps more interactively helpful is a graphical diff viewer such as tkdiff.
