Gri: A Language for Scientific Illustration
Whereas line graphs can be accomplished in as little as three commands, contour graphs are not much more complicated. Suppose we have a file called temp.dat that contains measurements of ocean temperature made once per decade, over the period from 1950 until 2000, at depths of 500 meters, 400 meters, 300 meters and so on, up to the surface at 0 meters. The natural way to store such a dataset is in a matrix or “grid” in Gri. There are three steps to drawing this in Gri.
# First, set up the x,y vertices of # the grid ... set x grid 1950 2000 10 set y grid 0 500 100 # then read the grid data ... open temp.dat read grid data # and plot contours: draw contour
As you can see, comments in Gri start with the pound sign, as in many other scripting languages. In this example, a uniform grid is used (i.e., time increasing by 10 years between samples), but non-uniform grids are also easy to handle. For many applications, grids are read in from data files instead of being specified in the command file. As for the drawing of the contours, that's fairly easy since Gri can determine, by scanning the grid, a reasonable default range of contours.
Images are not much harder to deal with. There are many image formats in this world, and Gri handles only a few. This causes few problems, since good conversion programs are available (e.g., ImageMagick). Let's suppose we have a raw (headerless) image file in 8-bit resolution. The 8-bit pixels have 256 possible values, in the range 0 to 255. Satellite-derived measurements of ocean temperature are typically in this format. A common scale for the data is that 4.9 Celsius gets a pixel value of 0 while 30.4 Celsius gets 255, with linear variation in between. Let's say that the image is 512 pixels by 512 pixels, and that the geographical coverage spans a box with the lower-left corner at x=0 and y=0 and the upper-right corner at x=20 and y=20. We want the image drawn with blue for cold water and red for warm water. Listing 1 shows how to handle this in Gri. As you can see, it is quite easy to make Gri create line graphs, contour graphs and image graphs.
A quick note about file names and other properties. In the examples given so far, we always specified the names of the data files in the script, but that means users have to modify the file for use with other data. For this reason, many scripts use a command called query to ask the user the name of the input file. This command stores a user's answers into variables that may later be used in open commands. The query command has an option to provide a list of permitted answers (e.g., “yes” and “no”) so that users cannot make mistakes. In many laboratories, query is used extensively, so that novices can use complicated Gri scripts without having to edit Gri scripts they don't yet understand. This lets students dive into their research instead of getting too distracted with the tools of the work.
Let's turn to a more complicated example, based on an illustration from one of our research papers. Here, and in the remainder of this article, we will show only fragments of the Gri code, to illustrate topics not already covered.
A key issue in understanding the climate system is the interaction between the ocean and the atmosphere. The heat capacity of the top three meters of the ocean is equal to that of the entire atmosphere, but everyone knows the ocean is more than three meters deep. In fact, the mean depth is more like three kilometers. Given this, it is not surprising that the ocean is important to the heat balance of the planet and thus to the climate system. Beyond the ability to soak up heat from the sun (or re-release it), the ocean has currents that transport heat from one location to another. The exact pathways of this transport are not fully mapped out yet, and we are still unsure of some fundamental dynamical aspects of the system. A great deal of evidence suggests that vertical mixing in the ocean is very important to these patterns of heat flow. Thus, ocean mixing is important not just to the ocean itself, but also to the whole climate system. This is a prime motivation for studies of ocean mixing, which is our research specialty.
One good way to observe mixing is to insert a tracer and see where it goes. This is not as easy as it sounds, so oceanographers also make great use of tracers that are naturally occurring or introduced without the specific intention of studying mixing. An example of the latter is a suite of radionuclides introduced into the atmosphere by atmospheric bomb testing carried out by the U.S. and the U.S.S.R. in the 1960s. These radionuclides have found their way into the ocean. One radionuclide, a hydrogen isotope named tritium, is especially useful for studying ocean mixing. Figure 2 shows a plot from a paper (see Resources) in which one of us, together with a geochemist colleague named Kim Van Scoy, tried to work out the rate of ocean mixing by tracing the movement of tritium over several decades.
Our approach was to do time derivatives of properties that are spatially averaged over a particular geographical area. The area is defined by the streamlines of the average ocean circulation pattern. Now, averaging gridded data is easy: just sum along the rows and columns of the grid. Unfortunately, however, our observations were not made on a grid. As Figure 2 shows, the observations are at manifestly non-uniform locations which correspond to ship tracks on cruises designed with other goals in mind. Our first step, then, is to cast these x,y,z data onto a uniform x,y grid. After reading in the columns and setting up an x,y grid (exactly as discussed above), we create the grid by using
convert columns to grid barnes
This is our first example of a convert command. Commands in this category perform conversions from one data type to another. Since contouring works on a grid, we must convert column data into the grid. In this case, we have chosen to use the so-called “barnes” method of gridding, which is but one of several gridding methods in Gri (see Resources). The method applies a Gaussian-weighted low-pass filtering (averaging) scheme that is run iteratively. Initial iterations map out the large-scale variation in the field, while later iterations fill in details in regions where the data sampling is intense enough to justify doing so.
In the previous example, we let Gri pick contours; in fact, that is how we started out with the working plots that led to the diagram shown in Figure 2. For publication, however, we ended up deciding to highlight certain contours by drawing them with thicker lines:
set line width rapidograph 00 draw contour 50 unlabelled set line width rapidograph 2 draw contour 100
where the “rapidograph” method of naming line thicknesses by the scheme used for Rapidograph technical fountain pens has been used. Line widths may also be given in units of printer points.
At this stage, we had the scientific result we wanted. We used the Gri command write grid to write the gridded data into a file, then we integrated the values with a Perl script (called from within Gri with a system command), then we were done with part of the work. For presentation, we wanted to draw this in a map format, so we started by customizing the axes a bit:
set x name "" set y name "" set x format %.0lf$\circ$ set y format %.0lf$\circ$
The first two commands remove the names x and y from the axes by setting the names to blank strings. The second two set the format of the axis numbers, using the notation of the C programming language. We also added a LaTeX-like part to the format (enclosed in dollar signs) to make Gri draw degree signs for longitude and latitude. Gri can draw Greek letters and mathematical symbols in this way, although the emulation of TeX is by no means complete, since it is a daunting task.
A map isn't too helpful without coastlines. Since coastline files are quite big, our system retains the data in a binary format for quick reading. We use the format called netCDF, which is quite popular in earth science (see http://www.unidata.ucar.edu/packages/netcdf/). Out of the many advantages to this format, one is quite relevant to Linux users: it is fully binary-compatible across big-endian and little-endian computers and across computers with different word lengths. It also permits naming of data entities, so you don't have to remember that the first column is latitude and the second longitude, or vice versa. Gri handles netCDF format easily:
open map_land.nc netCDF read columns x="lon" y="lat"
is all it takes to read the coastline data. We'll fill in the coastline with a light brown color, and then draw a black coastline:
read colornames from RGB "/usr/lib/X11/rgb.txt" set color burlywood draw curve filled set color black draw curveIn addition to X11-based colors and a dozen or so familiar colors above like black, Gri permits you to specify colors in either an RGB or an HSV framework.
|Updates from LinuxCon and ContainerCon, Toronto, August 2016||Aug 23, 2016|
|NVMe over Fabrics Support Coming to the Linux 4.8 Kernel||Aug 22, 2016|
|What I Wish I’d Known When I Was an Embedded Linux Newbie||Aug 18, 2016|
|Pandas||Aug 17, 2016|
|Juniper Systems' Geode||Aug 16, 2016|
|Analyzing Data||Aug 15, 2016|
- Updates from LinuxCon and ContainerCon, Toronto, August 2016
- What I Wish I’d Known When I Was an Embedded Linux Newbie
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- NVMe over Fabrics Support Coming to the Linux 4.8 Kernel
- All about printf
- New Version of GParted
- Tech Tip: Really Simple HTTP Server with Python
- A New Project for Linux at 25
- Better Cloud Storage with ownCloud 9.1
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide