Scientific Graphing in Python
In my last few articles, I looked at several different Python modules that are useful for doing computations. But, what tools are available to help you analyze the results from those computations? Although you could do some statistical analysis, sometimes the best tool is a graphical representation of the results. The human mind is extremely good at spotting patterns and seeing trends in visual information. To this end, the standard Python module for this type of work is matplotlib. With matplotlib, you can create complex graphics of your data to help you discover relations.
You always can install matplotlib from source; however, it's easier to install it from your distribution's package manager. For example, in Debian-based distributions, you would install it with this:
sudo apt-get install python-matplotlib
The python-matplotlib-doc package provides extra documentation for matplotlib.
Like other large Python modules, matplotlib is broken down into several sub-modules. Let's start with pyplot. This sub-module contains most of the functions you will want to use to graph your data. Because of the long names involved, you likely will want to import it as something shorter. In the following examples, I'm using:
import matplotlib.pyplot as plt
The pyplot interface is modeled on the plotting commands in MATLAB. The graphical functions are broken down into two broad categories: high-level functions and low-level functions. These functions don't work directly with your screen. All of the graphic generation and manipulation happens via an abstract graphical display device, which matplotlib calls a back end. This means the functions behave the same way whatever the output, and all of the display details are handled by the back end. A back end may represent a display screen, a printer or a file storage format. The general work flow is to do all of your drawing in memory on the abstract device. You then push the final image out to the physical device in one go.
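As a rough sketch of that draw-in-memory-then-output work flow, the following builds a figure off-screen and writes it out as a PNG in one step. The Agg back end, the in-memory buffer and the sample numbers are all assumptions made for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: an off-screen back end, so no window is needed
import matplotlib.pyplot as plt

# Do all of the drawing in memory first.
fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])  # made-up sample values

# Then push the finished image out in one go, here into an in-memory PNG.
# Passing a file name instead of the buffer would write the image to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```

The same drawing code could be sent to a PDF or SVG file by changing only the format argument.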
The simplest example is to plot a series of numbers stored as a list. The code looks like this:
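A minimal sketch of such a two-command script follows. The list values are made up for illustration, and the Agg line is an assumption added so the sketch also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen back end; delete this line to get a window
import matplotlib.pyplot as plt

values = [1, 2, 3, 4, 3, 2, 1]  # illustrative y-values; x-values default to the list index

lines = plt.plot(values)  # first command: draws the data in memory
plt.show()                # second command: hands the drawing to the display
                          # (with Agg there is no window to open)
```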
The first command plots the data stored in the given list as a regular
line plot. If you have a single list of values, they are assumed to be
the y-values, with the list index giving the x-values. Because you did not
set up a specific graphics device, matplotlib assumes a default device
mapped to whatever physical display you are using. After executing the
first line, you won't see anything on your display. To see
something, you need to execute the second command, show(). This
pushes the graphics data out to the physical display (Figure 1).
You should notice that there are several control buttons along the
bottom of the window, allowing you to do things like save the image
to a file. You also will notice that the graph you generated is
rather plain. You can add labels with these commands:
plt.xlabel('Index')
plt.ylabel('Power Level')
Figure 1. A basic plot window includes controls on the bottom of the pane.
You then get a graph with a bit more context (Figure 2). You
can add a title for your plot with the title() command, and the
plot command is even more versatile than that. You can change the
marker being used, along with the color, by passing a format string.
For example, you can make green triangles by adding 'g^' or blue
circles with 'bo'. If you want more than
one plot in a single window, you simply add them as extra arguments to
plot(). So, you could plot squares and cubes on the same plot with
something like this:
t = [1.0,2.0,3.0,4.0]
plt.plot(t,[1.0,4.0,9.0,16.0],'bo',t,[1.0,8.0,27.0,64.0],'sr')
plt.show()
Figure 2. You can add labels with the xlabel and ylabel functions.
Now you should see both sets of data in the new plot window (Figure 3). If you import the numpy module and use arrays, you can simplify the plot command to:
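One way the array version might look is sketched below. With t held in a numpy array, the squares and cubes are computed inline instead of being typed out. The Agg line is an assumption so the sketch runs without a display.

```python
import numpy as np

import matplotlib
matplotlib.use("Agg")  # assumption: off-screen back end; delete this line to get a window
import matplotlib.pyplot as plt

t = np.array([1.0, 2.0, 3.0, 4.0])

# Blue circles for the squares, red squares for the cubes, in one call.
lines = plt.plot(t, t**2, 'bo', t, t**3, 'sr')
```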
Figure 3. You can draw multiple plots with a single command.
What if you want to add some more information to your plot, maybe a text
box? You can do that with the
text() command, and you can set the location
for your text box, along with its contents. For example, you could use:
plt.text(3,3,'This is my plot')
This will put a text area at x=3, y=3. A specialized form of text box is
an annotation. This is a text box linked to a specific point of data. You
can define the location of the text box with the
xytext parameter and
the location of the point of interest with the
xy parameter. You
even can set the details of the arrow connecting the two with the
arrowprops parameter. An example may look like this:
plt.annotate('Max value', xy=(2, 1), xytext=(3, 1.5), arrowprops=dict(facecolor='black', shrink=0.05))
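To tie the annotation to real data rather than fixed coordinates, you could look up the largest value first and point the arrow at it. This is only a sketch; the data values and the one-unit offset for the text are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen back end; delete this line to get a window
import matplotlib.pyplot as plt

values = [1, 3, 7, 4, 2]                 # illustrative data
peak_index = values.index(max(values))   # x-position of the largest value
peak_value = values[peak_index]

plt.plot(values, 'bo')
note = plt.annotate('Max value',
                    xy=(peak_index, peak_value),          # the point of interest
                    xytext=(peak_index + 1, peak_value),  # where the text sits
                    arrowprops=dict(facecolor='black', shrink=0.05))
```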
Several other high-level plotting commands are available. The
bar() command lets you draw a barplot of your data. You can change
the width, height and colors with various input parameters. You even
can add in error bars with the xerr and
yerr parameters. Similarly, you
can draw a horizontal bar plot with the
barh() command. Or, you can draw box and whisker
plots with the
boxplot() command. You can create plain contour
plots with the
contour() command. If you want
filled-in contour plots, use contourf(). The
hist() command will draw a histogram,
with options to control items like the bin size. There is even a command
called xkcd() that sets a number of parameters so all of the
subsequent drawings will be in the same style as the xkcd comics.
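A short sketch of a few of these commands side by side is shown here. It uses the Axes methods of the same names, which mirror the pyplot commands. The random samples, the bar heights and the surface being contoured are all made up for illustration.

```python
import numpy as np

import matplotlib
matplotlib.use("Agg")  # assumption: off-screen back end; delete this line to get a window
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)   # seeded so the sketch is repeatable
samples = rng.normal(size=200)   # made-up data

fig, axes = plt.subplots(2, 2)

# Vertical bars with error bars supplied through yerr.
bars = axes[0, 0].bar([1, 2, 3], [4.0, 6.0, 5.0], yerr=[0.5, 0.3, 0.4])

# Box-and-whisker plot of the samples.
axes[0, 1].boxplot(samples)

# Histogram with an explicit number of bins.
counts, edges, patches = axes[1, 0].hist(samples, bins=20)

# Filled contours of a simple surface.
x, y = np.meshgrid(np.linspace(-2, 2, 50), np.linspace(-2, 2, 50))
axes[1, 1].contourf(x, y, np.exp(-(x**2 + y**2)))
```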
Sometimes, you may want to be able to interact with your
graphics. matplotlib needs to interact with several different toolkits,
like GTK or Qt. But, you don't want to have to write code for every
possible toolkit. The pyplot sub-module includes the ability to add event
handlers in a GUI-agnostic way. The FigureCanvasBase class contains
a function called
mpl_connect(), which you can use to connect some
callback function to an event. For example, say you have a function
onClick(). You can attach it to the button press event with:
fig = plt.figure()
...
cid = fig.canvas.mpl_connect('button_press_event', onClick)
Now when your plot gets a mouse click, it will fire your callback
function. The mpl_connect() call returns a connection ID, stored in the
variable cid in this
example, that you can use to work with this callback function. When you
are done with the interaction, disconnect the callback function with:
fig.canvas.mpl_disconnect(cid)
If you just need to do basic interaction, you can use the ginput()
command. It will listen for a set amount of time and return a list of
all of the clicks that happen on your plot. You then can process those
clicks and do some kind of interactive work.
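The mpl_connect() work flow described above can be sketched end to end as follows. The body of onClick(), the clicks list and the stop_listening() helper are hypothetical names invented for this illustration; only mpl_connect() and mpl_disconnect() come from matplotlib itself.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen back end; delete this line to get a window
import matplotlib.pyplot as plt

clicks = []  # pixel positions of every click seen so far


def onClick(event):
    """Record where the mouse button was pressed."""
    clicks.append((event.x, event.y))


fig = plt.figure()
plt.plot([1, 2, 3, 4])

# Attach the callback and keep the connection ID for later.
cid = fig.canvas.mpl_connect('button_press_event', onClick)


def stop_listening():
    """Detach the callback once the interaction is finished."""
    fig.canvas.mpl_disconnect(cid)
```

On an interactive back end, you would call plt.show(), click on the plot and then inspect the clicks list.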
The last thing I want to cover here is animation. matplotlib includes a sub-module called animation that provides all the functionality that you need to generate MPEG videos of your data. These movies can be made up of frames of various file formats, including PNG, JPEG or TIFF. There is a base class, called Animation, that you can subclass and add extra functionality. If you aren't interested in doing too much work, there are included subclasses. One of them, FuncAnimation, can generate an animation by repeatedly applying a given function and generating the frames of your animation. Several other low-level functions are available to control creating, encoding and writing movie files. You should have all the control you require to generate any movie files you may need.
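A minimal FuncAnimation sketch, assuming a sine wave as the data, might look like this. The update() function, the frame count and the interval are illustrative choices rather than anything prescribed by matplotlib.

```python
import numpy as np

import matplotlib
matplotlib.use("Agg")  # assumption: off-screen back end; delete this line to get a window
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
line, = ax.plot(x, np.sin(x))


def update(frame):
    """Shift the sine wave a little further for each frame."""
    line.set_ydata(np.sin(x + 0.1 * frame))
    return line,


# FuncAnimation calls update(0), update(1), ... once per frame when rendered.
anim = FuncAnimation(fig, update, frames=50, interval=40)

# anim.save('wave.mp4') would encode a movie ('wave.mp4' is a hypothetical
# file name); saving needs an external encoder such as ffmpeg to be installed.
```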
Now that you have matplotlib under your belt, you can generate some really stunning visuals for your latest paper. Also, you will be able to find new and interesting relationships by graphing them. So, go check your data and see what might be hidden there.
Joey Bernard has a background in both physics and computer science. This serves him well in his day job as a computational research consultant at the University of New Brunswick. He also teaches computational physics and parallel programming.