An Introduction to GraphViz
The following example presents a graphical representation of an RCS revision tree. Again, Perl is the programming language of choice. The Perl script is not perfect: it does not cover every possible revision tree structure. Only small revision trees are presented here, for illustration purposes. For complex revision trees, the script would have to take the peculiarities of the tree into account.
The basic idea behind the script is to first get the output we want from the rlog command, which is part of the RCS revision control system. Because rlog returns a lot of information about the change history of the file, we have to grep its output to extract the data we need: each revision number and the date line that follows it.
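The extraction step can be sketched in pure Perl, without the external grep. This is a minimal illustration, not the article's script: `parse_rlog` is a hypothetical helper, and the sample text only imitates the shape of rlog output (its revisions, dates and authors are invented, and the date layout is an assumption that may differ between RCS versions).

```perl
use strict;
use warnings;

# Sample text imitating the relevant part of rlog output.
# All values are invented for illustration.
my $sample = <<'RLOG';
----------------------------
revision 1.2
date: 2004/02/05 12:21:40;  author: mtsouk;  state: Exp;  lines: +3 -1
Second change
----------------------------
revision 1.1
date: 2004/02/04 09:10:00;  author: mtsouk;  state: Exp;
Initial revision
RLOG

# parse_rlog (hypothetical helper) returns a hash that maps each
# revision number to the date found on the line right after it.
sub parse_rlog {
    my ($text) = @_;
    my %found;
    my $current;
    for my $line (split /\n/, $text) {
        if ($line =~ /^revision\s+(\S+)/) {
            $current = $1;                 # remember the revision just seen
        }
        elsif (defined $current && $line =~ m!^date:\s+([\d/-]+)!) {
            $found{$current} = $1;         # keep only the date, not the time
            undef $current;
        }
    }
    return %found;
}

my %revision = parse_rlog($sample);
print "$_ => $revision{$_}\n" for sort keys %revision;
```

To try it on a real file, the text could be read from a pipe opened on rlog instead of from the sample string.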
The key point in the script is to separate the revision branches by the first part of the revision number, that is, the number before the first dot. This way, each revision branch gets its own chain of nodes in the graph.
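The grouping idea can be shown in isolation. The sketch below assumes a hard-coded revision list in place of the rlog output, and `group_by_major` is a hypothetical helper name; it is one way to bucket the revisions, not necessarily how a production script should do it.

```perl
use strict;
use warnings;

# Hypothetical revision list; in a real run these would come from rlog.
my @revisions = qw(1.1 1.2 1.3 2.1 2.2);

# group_by_major (illustrative helper) buckets revisions by the number
# before the first dot, so each bucket becomes one chain in the graph.
sub group_by_major {
    my @revs = @_;
    my %bucket;
    for my $rev (@revs) {
        my ($major) = split /\./, $rev;
        push @{ $bucket{$major} }, $rev;
    }
    return %bucket;
}

my %branch = group_by_major(@revisions);
for my $major (sort { $a <=> $b } keys %branch) {
    # Join the quoted node names with "->" to form one dot edge chain.
    print join(' -> ', map { qq("rev$_") } @{ $branch{$major} }), ";\n";
}
```

Each printed line is a dot edge statement that links the revisions of one branch in the order given.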
Notice that you have to put quotation marks around the revision names, because a name such as rev1.2 contains a dot and would not be read as a single node name without them. Otherwise, the output is not displayed properly. See Figure 7 for the output and Listing 7 for the Perl source code.
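A small quoting routine makes the rule explicit. In the dot language an unquoted ID may contain only letters, digits and underscores (or be a plain number), so anything else has to be a double-quoted string. `dot_quote` below is a hypothetical helper for illustration and is not part of Listing 7.

```perl
use strict;
use warnings;

# dot_quote (hypothetical helper) wraps a string in double quotes and
# escapes any embedded double quote, so that names such as rev1.2 are
# read by dot as a single node ID.
sub dot_quote {
    my ($id) = @_;
    $id =~ s/"/\\"/g;
    return qq("$id");
}

print dot_quote('RCS.pl'), ' -> ', dot_quote('rev1.2'), ";\n";
```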
Listing 7. Perl Source Code for Figure 7
#!/usr/bin/perl -w
# $Id: RCS.pl,v 1.2 2004/02/05 12:21:40 mtsouk Exp mtsouk $
#
# Command line arguments
# program_name.pl file_with_RCS_info
use strict;
my $filename="";
my $COMMAND="";
my %revision=();
# Change that according to your system
my $RLOG="/usr/bin/rlog";
my $rev=0;
my $count=0;
my %branch=();
my $date=0;
die <<Thanatos unless @ARGV;
usage:
$0 file_with_RCS_info
Thanatos
if ( @ARGV != 1 )
{
die <<Thanatos
usage info:
Please use exactly 1 argument!
Thanatos
}
# Get the file name
($filename) = @ARGV;
$COMMAND = "$RLOG $filename |";
#
# Do not forget to change the path
# of the grep command
#
$COMMAND = $COMMAND." /bin/grep ^revision -A 1 ";
open (RCSINFO, "$COMMAND |")
|| die "Cannot run the ".$COMMAND.": $!\n";
my $line = "";
my $connect="";
while ($line = <RCSINFO>)
{
if ( $line =~ /^revision/ )
{
$rev = (split " ", $line)[1];
$count = (split /\./, $rev)[0];
if ( defined $branch{$count} )
{
my $number = $branch{$count};
$number++;
$branch{$count} = $number;
}
else
{
$branch{$count}=1;
$connect.="\"$filename\" -> \"rev$rev\";";
}
$line = <RCSINFO>;
if ( $line =~ /^date:/ )
{
$date = (split / /, $line)[1];
$revision{$rev} = $date;
}
}
}
close(RCSINFO)
|| die "Cannot close RCSINFO: $!\n";
my $FILE = $filename.".dot";
open (OUTPUT, "> $FILE" )
|| die "Cannot create file: $!\n";
print OUTPUT <<START;
digraph G
{
"$filename" [shape=Msquare];
node [ style=filled, color=lightgray];
node [ height=.50, width=.65];
START
#
# The output of the rlog command has been processed.
# Now we emit one node per revision, labeled with
# the revision number and its date
#
my $k="";
foreach $k (sort keys %revision)
{
print "$k => $revision{$k}\n";
print OUTPUT <<DATA;
"rev$k" [shape=egg, label="$k\\n$revision{$k}"];
DATA
}
my $ll="";
foreach $ll (keys %branch)
{
print OUTPUT "\t";
foreach $k (sort keys %revision)
{
my $major = (split /\./, $k)[0];
if ( $ll == $major )
{
my $counter = $branch{$ll};
$counter--;
$branch{$ll} = $counter;
print OUTPUT "\"rev".$k."\"";
if ( $counter > 0 )
{
print OUTPUT " -> ";
}
else
{
print OUTPUT ";\n";
}
}
}
}
print OUTPUT <<END;
\t$connect
}
END
close (OUTPUT)
|| die "Cannot close file: $!\n";
exit 0;
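One limitation worth noting: Listing 7 orders revisions with a plain `sort keys`, which compares strings, so revision 1.10 would be placed before 1.2. A field-by-field numeric comparison avoids that. The sketch below is an assumption about how one might handle it; `cmp_rev` and `sort_revisions` are hypothetical names.

```perl
use strict;
use warnings;

# cmp_rev (hypothetical) compares two revision numbers field by field
# as integers and returns -1, 0 or 1, like the <=> operator.
sub cmp_rev {
    my @x = split /\./, $_[0];
    my @y = split /\./, $_[1];
    while (@x && @y) {
        my $cmp = shift(@x) <=> shift(@y);
        return $cmp if $cmp;
    }
    return @x <=> @y;    # all shared fields equal: fewer fields sorts first
}

# sort_revisions (hypothetical) returns the list in revision order.
sub sort_revisions {
    return sort { cmp_rev($a, $b) } @_;
}

my @sorted = sort_revisions(qw(1.10 1.2 1.1 2.1 1.2.1.1));
print "@sorted\n";
```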
Our next example is a graph that is more commonly needed: a graphical representation of an operating system directory structure. Keep in mind that page dimensions limit the number of directories that can be presented, and that this Perl script has been tested only with a small number of directories. The shape of the directory structure is the most important factor for the output quality. If we think of the directory structure as a tree, a tree with at most four levels produces a very different picture from a tree with up to ten levels in which most of the limbs have a depth of three.
Notice that each box displays only the last part of the directory name, not the full path name. You can find the full path name by following the links. Please see the source code to understand how the script works. Keep in mind that the source code was not tested fully; it is simply presented to show what the dot language can do with a little help from a scripting language. See Figure 8 for the output and Listing 8 for the Perl source code.
Listing 8. Perl Source Code for Figure 8
#!/usr/bin/perl -w
# $Id: DIR.pl,v 1.3 2004/02/06 15:18:13 mtsouk Exp mtsouk $
#
# Please note that this is alpha code
#
# Command line arguments
# program_name.pl directory
use strict;
my $directory="";
my $COMMAND="";
my %DIRECTORIES=();
die <<Thanatos unless @ARGV;
usage:
$0 directory
Thanatos
if ( @ARGV != 1 )
{
die <<Thanatos
usage info:
Please use exactly 1 argument!
Thanatos
}
# Get the directory name
($directory) = @ARGV;
$COMMAND = "/usr/bin/find $directory -type d | ";
open (INPUT, "$COMMAND")
|| die "Cannot run the ".$COMMAND.": $!\n";
#
# The reason for putting OUTPUT in front of the
# directory name is that we
# can have . as directory name
#
my $OUTPUT="OUTPUT$directory.dot";
$OUTPUT =~ s/\//-/g;
open (OUTPUT, "> $OUTPUT")
|| die "Cannot create output file $OUTPUT: $!\n";
print OUTPUT <<START;
digraph G
{
rotate=90;
nodesep=.05;
node[height=.05, shape=record, fontsize=5];
START
# Make nodes for the command line argument directory
my @split = split /\//, $directory;
my $key="";
my $prev=undef;
for $key (@split)
{
my $KEY=$key;
$key =~ s/[^a-zA-Z0-9]/_/g;
$key = $prev."_".$key;
$prev = $key;
print OUTPUT "\t".$prev;
print OUTPUT " [shape=box, label=\"$KEY\"];";
print OUTPUT "\n";
}
my $lastpart = "";
while (<INPUT>)
{
chomp;
my $orig=$_;
# Get the right label
my @split = split /\//, $_;
$lastpart = pop @split;
$_ =~ s/\//_/g;
#
# The _ is accepted as a valid node character
# . , + - are not accepted
#
$_ =~ s/[^a-zA-Z0-9]/_/g;
print OUTPUT "\t_".$_;
print OUTPUT " [shape=box,label=\"$lastpart\"];";
print OUTPUT "\n";
$DIRECTORIES{$orig}=0;
}
my $subdir="";
my %TEMP=();
foreach $key ( sort keys %DIRECTORIES )
{
print "KEY: $key\n";
my @split = split /\//, $key;
my $prev = undef;
for $subdir (@split)
{
$subdir =~ s/[^a-zA-Z0-9_]/_/g;
my $next = $prev."_".$subdir;
# print "NEXT: $next\n";
if ( !defined($prev) )
{
$prev = $next;
next;
}
my $val = "$prev->$next;\n";
# print "VAL: $val\n";
if ( !defined( $TEMP{$val} ))
{
print OUTPUT "$prev->$next;\n";
}
$prev .= "_".$subdir;
$TEMP{$val}=1;
}
}
close(INPUT)
|| die "Cannot close input file: $!\n";
print OUTPUT <<END;
}
END
close (OUTPUT)
|| die "Cannot close file: $!\n";
exit 0;
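The naming scheme used in Listing 8 can be isolated into a few small routines. The sketch below is a simplified illustration under stated assumptions: `node_id`, `node_line` and `edges_for` are hypothetical helper names, and the sample path is invented. As in the listing, every character that is not a letter or digit becomes an underscore, which means two different paths (for example a-b and a_b) can collide on the same node ID.

```perl
use strict;
use warnings;

# node_id (hypothetical) turns a path into a dot-safe node ID: every
# character that is not a letter or digit becomes an underscore, and a
# leading underscore keeps the ID from starting with a digit.
sub node_id {
    my ($path) = @_;
    (my $id = $path) =~ s/[^a-zA-Z0-9]/_/g;
    return "_$id";
}

# node_line (hypothetical) declares the node, labeled with only the
# last component of the path.
sub node_line {
    my ($path) = @_;
    my $label = (split m{/}, $path)[-1] // $path;
    return node_id($path) . qq( [shape=box, label="$label"];);
}

# edges_for (hypothetical) returns one "parent -> child;" statement
# for each level of the path below its first component.
sub edges_for {
    my ($path) = @_;
    my @parts = split m{/}, $path;
    my @edges;
    for my $i (1 .. $#parts) {
        my $parent = join '/', @parts[0 .. $i - 1];
        my $child  = join '/', @parts[0 .. $i];
        push @edges, node_id($parent) . ' -> ' . node_id($child) . ';';
    }
    return @edges;
}

print node_line('./src/lib-1.0'), "\n";
print "$_\n" for edges_for('./src/lib-1.0');
```

Feeding every line produced by find through these routines, and skipping edges that were already printed, gives the same kind of graph as the listing.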
Comments
Re: An Introduction to GraphViz
Readers might be wondering why dia doesn't include dot, or KDE, or...
And the answer's the license, which is kind of open source, but not quite. OSI don't seem to regard it as so, anyway, and it's in nonfree in Debian.
However, for those many cases where the license doesn't matter so long as you can get the tool, graphviz is a great set of tools to have around.
Re: An Introduction to GraphViz
Any reason you didn't mention Leon Brocard's excellent GraphViz Perl module, which makes this so much easier?
http://search.cpan.org/~lbrocard/GraphViz-2.00/
To say nothing of the other GraphViz related modules on CPAN.
Re: An Introduction to GraphViz
Great article!
I wrote a web log analyzer that outputs a graphviz
graph from the web server logs (www.hping.org/visitors)
and I think I can modify the program to produce a better
output thanks to this article.
Btw a note about the graphviz's license: it's not opensource
if I remember correctly (or at least not an OSI approved
license).
Graphviz (AT&T source code) license
Thank you for the nice articles about Graphviz.
We're aware of the problems about the AT&T Source
Code License, and expect to have some good news shortly.
Stephen
Graphviz is now under the CPL
We hope this will be beneficial to the open source community.
Please visit www.graphviz.org for downloads and other info.