Graphing with Gnuplot and Xmgr
Graphing data is one of the oldest uses for a computer, dating back to FORTRAN programs producing character graphics on line-printers. Fortunately, things have advanced somewhat, and modern computers are capable of producing much nicer graphs. Several graphing packages are available for Linux under X and SVGALIB. Two of the most prominent packages are gnuplot and xmgr (a.k.a. ACE/gr). Xmgr is oriented towards graphing and manipulation of externally produced data sets, while gnuplot is used more for plotting data and mathematical functions.
Gnuplot's primary authors are Thomas Williams and Colin Kelley, with many others contributing. Although gnuplot was written independently of the Free Software Foundation, the FSF does distribute it. Gnuplot was written with portability in mind, supporting about four dozen output devices and formats under a dozen operating systems. Under Linux, it will run under both X and SVGALIB. Modifying gnuplot to support a new device involves writing a few device-dependent subroutines that are linked in with the main program.
Xmgr, on the other hand, is tied to X. Developed by Paul Turner, it also runs on many platforms besides Linux, but it outputs only PostScript. In the latter stages of development, Linux was the primary development platform. Development has recently been spread around to a loose organization of interested people.
Gnuplot has a command-line interface with a mixture of emacs and Unix command line editing similar to the bash shell. Gnuplot may be run in batch mode, where the commands are taken from a file. The plot command causes a plot to be sent to the currently selected device. In the case of the Linux svgalib driver, a graphics mode is selected and a graph is drawn in the current virtual console. When a key is hit, the display changes back to text mode for an additional command. Under X, a new window is created for the graph, while commands are entered in the original shell window.
Gnuplot has a comprehensive on-line help facility that can be accessed by typing help. The basic help command lists arguments of the help command by topic. Some subjects, like the set command, have many sub-topics. The documentation itself is well written and has many valuable examples of working commands.
A data file containing points to plot is identified by its file name in single or double quotes. Each line has two or more space-separated numbers that correspond to a point to be plotted. For example, suppose we had a file named “hits”:
# Monthly hits on our web site
1 13
2 23
3 66
4 75
5 74
6 82
7 377
8 442
9 512
10 756
11 874
12 946
The command plot "hits" would plot a graph of the data in the file named hits. Lines in a data file beginning with a # character are treated as comment lines. Blank lines are not treated as comments. Instead, they indicate where a line should not be drawn between a pair of points.
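The effect of a blank line can be seen in a small sketch. It assumes a gnuplot version that accepts inline data through the special file name "-" (terminated by a line containing only e); the numbers are the first six rows of the hits file.

```gnuplot
# Inline data read from the special file "-" stands in for a data file here.
# The blank line after x=3 breaks the curve: no segment joins (3,66) to (4,75).
plot "-" title "two segments" with lines
1 13
2 23
3 66

4 75
5 74
6 82
e
```

The graph should show two separate line segments rather than one continuous curve.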
Although our example has the x data listed in the first column and the y in the second, gnuplot can handle cases where this is not so. The command:
plot "hits" using 2:1
would cause the x data to be read from the second column and the y data from the first column.
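As a rough illustration of column selection, the commands below assume the hits file shown earlier is in the current directory. The last command uses a parenthesized expression in place of a column number, which is a feature of current gnuplot releases rather than something described in this article, and the divisor of 30 is only an illustrative approximation.

```gnuplot
# Default column order: x from column 1, y from column 2.
plot "hits" using 1:2

# Swapped, as in the text: x from column 2, y from column 1.
plot "hits" using 2:1

# A parenthesized expression may stand in for a column; $2 is the value
# in column 2. Here the monthly count is scaled to a rough per-day figure.
plot "hits" using 1:($2/30.0) title "Hits per day (approx.)"
```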
Plots can be embellished in many ways. Each comma-separated file or mathematical expression (shown later) to plot has two attributes that can be specified by the user: a title and a style. A gnuplot “title” is a string that is displayed with an example of the plot style that labels that data; this is usually called a “legend” by other programs. The style of the plot is selected from several possible ones, including “points”, which displays a symbol at each data point, “lines”, which draws lines between the points, and “linespoints”, which draws both the lines and the symbols. The color of the line and symbol as well as the type of symbol (plus sign, cross, box) are normally assigned in series by gnuplot to make each distinct, but these can be overridden by the user.
For example, the command plot "hits" title 'Hits on Website' with linespoints 3 4 plots our data file using lines of type 3 and points of type 4. At the top right will be the string Hits on Website next to a short example of type 3 lines and type 4 points. What you actually see depends on the output device being used—lines that are colored on a color display can come out dashed and dotted on monochrome devices (like most PostScript printers).
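Recent gnuplot releases, to the best of our knowledge, expect the line and point types to be introduced by keywords rather than given as bare numbers. A sketch of the same command in that form, followed by two comma-separated plots with their own titles and styles (both assume the hits file from above):

```gnuplot
# Same plot with the line and point types named explicitly.
plot "hits" title 'Hits on Website' with linespoints linetype 3 pointtype 4

# Two comma-separated plots of the same file, each with its own title and style.
plot "hits" title 'Hits (lines)' with lines, \
     "hits" title 'Hits (points)' with points
```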
Our plot is looking better, but it is still not perfect. We want to put labels on the x and y axes to further clue the reader in on what we are looking at. Axis labels are settable parameters, as is the graph title:
set xlabel "Month"
set ylabel "Hits"
set title "Hits on the Website"
replot
Experimentation is easy to do in gnuplot by using the replot command, which repeats the previous plot command. Not only does this save keystrokes, but the author has a friend who likes to type replot repeatedly to display a file being appended to by another job he is running, which gives a running display of results as they are calculated.
Our graph is almost finished. Gnuplot's default algorithm for deciding where the x tick marks appear is showing only every other x point. We can make it show them all by:
set xtics 1, 1
replot
The first number causes the tick marks to start at x=1, and the second causes them to be spaced one unit apart. We could have included a third comma-separated parameter to indicate where the last tick mark should be plotted, but it is unnecessary in our example.
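For completeness, the three-parameter form might look like this for our twelve months of data; the end value of 12 is simply the last month in the hits file.

```gnuplot
# Start at x=1, place a tick every 1 unit, and stop after x=12.
set xtics 1, 1, 12
replot
```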
We can do better than month numbers:
set xtics ("Jan" 1, "Feb" 2, "Mar" 3, "Apr" 4, "May" 5, "Jun" 6, "Jul" 7, "Aug" 8, "Sep" 9, "Oct" 10, "Nov" 11, "Dec" 12)
replot
We have arrived at a graph worthy of being shown to the boss. The result is shown in Figure 1.
All that remains is to print it out. Gnuplot treats printers and plotters as just another output device. Executing the command:
set terminal postscript
tells gnuplot to generate PostScript of the graph instead of console graphics. It is not enough to set the type of terminal. Typing replot now will cause gnuplot to spew PostScript to the user terminal. The command:
set output "graph.ps"
replot
will cause PostScript to be sent to the file graph.ps. If the first character of the filename is a vertical bar, gnuplot interprets the rest of the string as a program that will accept gnuplot's output as its standard input. So a command like:
set output "|lp"
replot
sends the output to the system's default printer.
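Since gnuplot can take its commands from a file, the whole session can be collected into one batch script. The sketch below assumes the hits file is in the current directory; the script name hits.gp is hypothetical, and the final bare set output is added here to close the file rather than taken from the session above.

```gnuplot
# hits.gp (hypothetical name): run with "gnuplot hits.gp"
set title "Hits on the Website"
set xlabel "Month"
set ylabel "Hits"
set xtics ("Jan" 1, "Feb" 2, "Mar" 3, "Apr" 4, "May" 5, "Jun" 6, \
           "Jul" 7, "Aug" 8, "Sep" 9, "Oct" 10, "Nov" 11, "Dec" 12)

# Select the device and the destination before plotting.
set terminal postscript
set output "graph.ps"
plot "hits" title 'Hits on Website' with linespoints

# A bare "set output" closes graph.ps so it is completely written.
set output
```

Setting the terminal and output before the plot command means no replot is needed, which suits a script that runs unattended.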
Plots of mathematical functions are easy to produce: plot 2*x will produce a plot of a line with a slope of two over the default x range of -10.0 to +10.0. The y-axis is automatically scaled by default so that all points are visible. Multiple plots can be overlaid by separating the expressions with commas.
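A small sketch of overlaid functions; the particular expressions and titles are chosen only for illustration.

```gnuplot
# Three functions on one graph, drawn over the default x range.
# The y-axis is scaled automatically to fit all three curves.
plot 2*x title "2*x", \
     x**2 - 20 title "x**2 - 20", \
     10*sin(x) title "10*sin(x)"
```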
A wide variety of common mathematical functions can be used in expressions: trigonometric, exponential and logarithmic functions, as well as less common ones such as Bessel functions and error functions. Expressions are based mostly on C-style expressions, including the logical AND (&&) and OR (||) operators, with the notable addition of the FORTRAN power operator (**).
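A few expressions of this kind, as a sketch. The function names besj0 and erf and the C-style conditional operator are taken from gnuplot's own documentation rather than from this article, so check help functions on your installation if one of them is rejected.

```gnuplot
# Less common built-ins: Bessel function of order 0 and the error function.
plot besj0(x), erf(x)

# FORTRAN-style power operator inside a Gaussian.
plot exp(-x**2 / 2)

# Logical AND combined with the C-style conditional operator:
# the curve is 1 where 0 < x < 5 and 0 everywhere else.
plot (x > 0 && x < 5) ? 1 : 0
```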
Ranges are specified by a pair of numbers separated by a colon and enclosed in square brackets. The first number specifies where the range begins and the second where it ends. Either or both numbers may be omitted to leave the current default unchanged. If we wanted to look at several graphs with the same range, the default range can be changed with the command set xrange [1:2]. If we wanted to change the range in only one plot, a range can be specified before the first function being plotted.
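The range forms described above might be used as follows. The functions are arbitrary examples, and the second bracket pair in the last command (a y range) goes slightly beyond what this article covers.

```gnuplot
# Change the default x range for every plot that follows.
set xrange [1:2]
plot x**2

# Override the x range for this plot only.
plot [0:5] x**2

# Omit the lower bound: it keeps its current setting, only the upper bound changes.
plot [:5] x**2

# Not covered in the text: a second bracket pair sets the y range.
plot [-pi:pi] [-1.5:1.5] sin(x), cos(x)
```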