Convert diff output to colorized HTML

If you search the web you can find a number of references to programs/scripts that convert diff output to HTML. This is a bash version.

The script expects "unified" diff output (diff -u) on its standard input and produces a self-contained colorized HTML page on its standard output. Consider the two files:

#include <stdio.h>

// main
main()
{
    printf("Hello world\n");
}
and
#include <stdio.h>

main()
{
    printf("Hello World\n");
    printf("Goodbye cruel world\n");
}
Running diff on the two files and piping that through diff2html produces a colorized HTML page:
  $ diff -u v1.c v2.c | diff2html >v1-v2.html
The resulting page is:
--- v1.c	2008-08-27 13:04:40.000000000 -0500
+++ v2.c	2008-08-27 13:04:29.000000000 -0500
@@ -1,8 +1,8 @@
 #include <stdio.h>
 
-// main
 main()
 {
-	printf("Hello world\n");
+	printf("Hello World\n");
+	printf("Goodbye cruel world\n");
 }
 

The script follows:

#!/bin/bash
#
# Convert diff output to colorized HTML.

cat <<XX
<html>
<head>
<title>Colorized Diff</title>
<style>
.diffdiv  { border: solid 1px black;           }
.comment  { color: gray;                       }
.diff     { color: #8A2BE2;                    }
.minus3   { color: blue;                       }
.plus3    { color: maroon;                     }
.at2      { color: lime;                       }
.plus     { color: green; background: #E7E7E7; }
.minus    { color: red;   background: #D7D7D7; }
.only     { color: purple;                     }
</style>
</head>
<body>
<pre>
XX

echo -n '<span class="comment">'

first=1
diffseen=0
lastonly=0

OIFS=$IFS
IFS='
'

# The -r option keeps the backslash from being an escape char.
read -r s

while [[ $? -eq 0 ]]
do
    # Get beginning of line to determine what type
    # of diff line it is.
    t1=${s:0:1}
    t2=${s:0:2}
    t3=${s:0:3}
    t4=${s:0:4}
    t7=${s:0:7}

    # Determine HTML class to use.
    if    [[ "$t7" == 'Only in' ]]; then
        cls='only'
        if [[ $diffseen -eq 0 ]]; then
            diffseen=1
            echo -n '</span>'
        else
            if [[ $lastonly -eq 0 ]]; then
                echo "</div>"
            fi
        fi
        if [[ $lastonly -eq 0 ]]; then
            echo "<div class='diffdiv'>"
        fi
        lastonly=1
    elif [[ "$t4" == 'diff' ]]; then
        cls='diff'
        if [[ $diffseen -eq 0 ]]; then
            diffseen=1
            echo -n '</span>'
        else
            echo "</div>"
        fi
        echo "<div class='diffdiv'>"
        lastonly=0
    elif  [[ "$t3" == '+++'  ]]; then
        cls='plus3'
        lastonly=0
    elif  [[ "$t3" == '---'  ]]; then
        cls='minus3'
        lastonly=0
    elif  [[ "$t2" == '@@'   ]]; then
        cls='at2'
        lastonly=0
    elif  [[ "$t1" == '+'    ]]; then
        cls='plus'
        lastonly=0
    elif  [[ "$t1" == '-'    ]]; then
        cls='minus'
        lastonly=0
    else
        cls=
        lastonly=0
    fi

    # Convert &, <, > to HTML entities.
    s=$(sed -e 's/\&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' <<<"$s")
    if [[ $first -eq 1 ]]; then
        first=0
    else
        echo
    fi

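    # Output the line. The expansions are quoted so that the shell
    # does not glob-expand characters such as * in the diff text.
    if [[ "$cls" ]]; then
        printf '%s' "<span class=\"${cls}\">${s}</span>"
    else
        printf '%s' "${s}"
    fi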
    read -r s
done
IFS=$OIFS

if [[ $diffseen -eq 0 ]]; then
    echo -n '</span>'
else
    echo "</div>"
fi
echo

cat <<XX
</pre>
</body>
</html>
XX

# vim: tabstop=4: shiftwidth=4: noexpandtab:
# kate: tab-width 4; indent-width 4; replace-tabs false;
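
The entity conversion is the step most worth getting right, since an unescaped < in a diff line would be read by the browser as markup. As a rough sketch, the same three substitutions can be wrapped in a filter and tried on a sample line. The function name html_escape is made up for this example and is not part of the script. The ampersand is handled first so that the & introduced by the other two entities is not escaped a second time.

```shell
#!/bin/bash
# html_escape: hypothetical helper, not part of the original script.
# Reads lines on stdin and writes them with &, < and > converted
# to HTML entities. The & substitution must come first.
html_escape() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Sample: an added line of C such as one might find in a diff.
printf '%s\n' '+    if (a < b && c > d)' | html_escape
```

Used this way, sed runs once for the whole stream rather than once per line, which may matter for large diffs.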

______________________

Mitch Frazier is an Associate Editor for Linux Journal.

Comments

Thanks Richie

jim3e8's picture

Your script was useful. The IFS should be set to newline though so that tabs are not collapsed.

It is also nice to additionally accept diff input on stdin as in Mitch's original script so one can colorize output from svn/hg/cvs diff via a pipe. Just replace your "diff -u $@" with a variable called $CMD, and set $CMD to "cat" when no args are provided.
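
A minimal sketch of both suggestions, under some assumptions: the names diff_source and colorize are invented for this example, a small function stands in for the $CMD variable, and the color logic is cut down to added and removed lines only. Clearing IFS for the read keeps leading blanks and tabs intact, and diff_source falls back to cat when no arguments are given.

```shell
#!/bin/bash
# Hypothetical names; a reduced stand-in for the colorizer loop.
colorize() {
    # IFS= keeps leading blanks and tabs; -r keeps backslashes.
    while IFS= read -r s; do
        case $s in
            '+'*) printf '\033[32m%s\033[0m\n' "$s" ;;
            '-'*) printf '\033[31m%s\033[0m\n' "$s" ;;
            *)    printf '%s\n' "$s" ;;
        esac
    done
}

# With arguments, run diff on them; with none, pass stdin through.
diff_source() {
    if [ "$#" -gt 0 ]; then
        diff -u "$@"
    else
        cat
    fi
}

# Demo: feed two diff-like lines on stdin, as a pipe from svn/hg would.
printf ' \tint x;\n+\tint y;\n' | diff_source | colorize
```

In a full script the last line would read something like diff_source "$@" | colorize, so that both "script v1.c v2.c" and "svn diff | script" work.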

Interesting post, thanks. Good

monitoring's picture

Interesting post, thanks. Good website.

Or use vim

Anonymous's picture

You can also open the diff in Vim, choose your preferred colorscheme, and then use :TOhtml.

Not another one!

Richie's picture

I first thought this was going to be a pure output colorizer for diff. When I read on and realized it was yet another converter to generate HTML from diff output, I was disappointed. "Never mind", I thought, "I'll just write my own." Based heavily on the above program this produces pretty much the same output, but in the console, using ANSI color control sequences.

#!/bin/bash
#
# colorize diff output for ANSI terminals
# based on "diff2html" 
# (http://www.linuxjournal.com/content/convert-diff-output-colorized-html)

# absolute color definitions
NOCOL="\e[0m"
BOLD="\e[1m"
RED="\e[31m"
GREEN="\e[32m"
PINK="\e[35m"
CYAN="\e[36m"
WHITE="\e[37m"

# style color definitions
C_COMMENT=$WHITE
C_DIFF=$RED
C_OLDFILE=$CYAN
C_NEWFILE=$RED
C_STATS=$GREEN
C_OLD=$BOLD$RED
C_NEW=$BOLD$GREEN
C_ONLY=$PINK

# check args
[[ $# -lt 2 ]] && echo "Usage: $0 file1 file2" && exit 1

# The -r option keeps the backslash from being an escape char.
diff -u $@ | while read -r s ; do
	# determine line color
	if [[ "${s:0:7}" == 'Only in' ]]; then color=$C_ONLY
	elif  [[ "${s:0:4}" == 'diff' ]]; then color=$C_DIFF
	elif  [[ "${s:0:3}" == '---'  ]]; then color=$C_OLDFILE
	elif  [[ "${s:0:3}" == '+++'  ]]; then color=$C_NEWFILE
	elif  [[ "${s:0:2}" == '@@'   ]]; then color=$C_STATS
	elif  [[ "${s:0:1}" == '+'    ]]; then color=$C_NEW
	elif  [[ "${s:0:1}" == '-'    ]]; then color=$C_OLD
	else color=
	fi

	# Output the line.
	if [[ "$color" ]]; then
		printf "$color"
		echo -n $s
		printf "$NOCOL\n"
	else echo $s
	fi
done

Tinker with the "style color definitions" to change colors to suit your environment.

In the course of messing with this, I made a couple of improvements that could be rolled back into the original version: 1. only one call to "read", and 2. "diff -u" is now called internally.

I tried not changing IFS, and it seemed to work, so I left those lines out, too. The other thing the original could benefit from is properly named CSS classes! (What on earth do "plus3" and "at2" mean?) As you can see, I've given more meaningful names to my ANSI equivalents.
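
To illustrate the naming point, here is a sketch of how the line classification might look with descriptive class names. The function line_class and every class name in it are hypothetical, chosen for this example only. The tests run in the same order as in the original script, so that "+++" and "---" are recognized before the single-character "+" and "-" cases. Unlike the original, which leaves the class empty for unchanged lines, this version labels them "context".

```shell
#!/bin/bash
# line_class: hypothetical helper mapping a diff line to a CSS class.
# The class names are illustrative, not those of the original script.
line_class() {
    case $1 in
        'Only in'*) echo only ;;
        'diff'*)    echo command ;;
        '+++'*)     echo new-file ;;
        '---'*)     echo old-file ;;
        '@@'*)      echo hunk ;;
        '+'*)       echo added ;;
        '-'*)       echo removed ;;
        *)          echo context ;;
    esac
}

# Classify a few sample lines taken from the article's diff.
line_class '--- v1.c'
line_class '@@ -1,8 +1,8 @@'
line_class '-// main'
```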

colordiff

gliks's picture

Hey Richie,

You could just use an existing tool, colordiff:

sudo apt-get install colordiff

Great script, though.

Cheers.

tkdiff

Anonymous's picture

Perhaps more interactively helpful is a graphical diff viewer such as tkdiff.
