OpenOffice.org Calc: Pivot tables by another name

DataPilots are OpenOffice.org Calc's equivalent of what MS Excel and other spreadsheets call pivot tables. Under any name, they are a tool for extracting and summarizing the information contained in spreadsheet cells in a more convenient form. Using a DataPilot, you can immediately see relationships between different pieces of data that would be difficult -- if not impossible -- to find using formulas, and tedious to extract manually. In effect, a DataPilot gives you something of the power of using a database without actually switching out of a spreadsheet. Small wonder, then, that over half of spreadsheet users are said to use datapilots or pivot tables.

To understand the usefulness of DataPilots, imagine that you are a manufacturer selling Ogg Vorbis music players in the North American market. Your product comes in two colors, beige and black, and in 80, 150 and 300 megabyte sizes, and is sold in sales packs. For each sale, you record the number of display packs sold, and the total price. On a spreadsheet, some of your data might look like this:

Country Color Size Quantity Price
Canada Beige 80 1 $500.00
USA Black 150 5 $5000.00
Mexico Beige 80 1 $500.00
Mexico Beige 80 2 $1000.00
US Black 150 3 $3000.00
US Beige 300 2 $4000.00
US Beige 300 7 $14,000.00
Canada Black 300 4 $8000.00
Total $36,000.00




Notice that the columns all have labels. These labels are a pre-requisite for working with DataPilots.

When you come to analyze this data, you might want to know such questions as how many units of each color you sold, or how many units in each country. You could find this information by setting up a combination of filters and formulas, but creating a DataPilot is much quicker.

For example, to quickly find out how many sales packs you sold in each country, you could create the following DataPilot:

Filter
Country -all-
Quantity
1 $1,000.00
2 $5000.00
3 $3000.00
4 $8000.00
5 $5000.00
7 $14,000.00




Sales for all countries are shown by default. However, if you use the Country filter at the top of the DataPilot, you can see the sales only for Canada:

Filter
Country  Canada
Quantity
1 $500.00
4 $8000.00




The gray cells in each DataPilot represent filters that you can use to modify it. By clicking the one marked Filter, you can change the information displayed in the DataPilot.

In addition, you can drag the other filters into new positions to change the information display. For example, if you drag the Country filter to the right of the Quantity column in the existing datapilot, it now displays the quantity sold broken down by country, with the first few rows reading:

Filter
Quantity Country
1 Canada $500.00
Mexico $500.00
2 Mexico $1000.00
US $4,000.00




As you can see, an datapilot is an ideal way of obtaining new perspectives on your data with a minimum of effort.


Creating a DataPilot

To begin creating an datapilot, highlight the range of cells you want to base it upon, then select Data -> DataPilot -> Start to open the DataPilot dialog window. Alternatively, choose the same menu item, then select a data source that you have already registered with OpenOffice.org using File -> New -> Database and a range of cells from it.

The DataPilot window gives you a diagram of the DataPilot that you are creating, and a list of columns from the data source. To create the general layout for the DataPilot, all you have to do is drag the columns to one of the blank spaces on the diagram. If you drag a column name to the Column fields or Row fields space, then it becomes the first cell in a row or column, just as you might expect from the name (in the first DataPilot above, Quantity was selected as the column, and no row was chosen). Similarly, if you drag a column name to the Data fields, it becomes the data in the DataPilot (in the first example above, the Price). The only potentially puzzling choice is the Page Fields, which is actually just the custom filter for changing the contents of the DataPilot on the fly (in the first example, the Country). If you make a mistake, you can drag the column back to the list of column building blocks on the right.

Once you have done the basic setup, you can also choose what function to use in the DataPilot. In the examples above, I simply used the default Sum function, which for many purposes is all that you need. However, you can also use another ten basic functions: Count, Average, Max, Min, Product, Count (Numbers Only), StDev (Sample), StDevP (Population), Var (Sample) and Var (Population). If necessary, you can find details about what these functions do in OpenOffice.org's online help.

By default, the DataPilot is written directly below the range it is based upon. However, if you select the More button in the dialog window, you can set the cells or sheet to which it is inserted. You can also set additional options, such as ignoring empty rows. However, for the most part, you can safely ignore these options, especially when just starting to work with DataPilots, because the defaults are the ones you are likely to want most of the time anyway.

After you create the DataPilot, selecting part of it enables Data ->DataPilot -> Refresh if you need to update it because of a change to the source information. The same sub-menu contains a Delete item that you can use when you no longer need the DataPilot.


Conclusion

Learning to work with DataPilots can take some time. New users need to learn that columns, not rows, should contain labels. They also have to keep track of which columns contain data and which do not: If those that do not are dragged to the Data fields in the dialog window, then the result will be nonsense. However, for most people, only a few trial and errors should be needed to master the basics of DataPilots.

Soon, you'll be reading your data in ways you never considered. After that, what you do with DataPilots is up to your imagination.


Bruce Byfield is a computer journalist who writes for Datamation, Linux.com, and Linux Journal

______________________

--
Bruce Byfield (nanday)

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Data Pilot

Sajan's picture

Hi,

I don't think Open Office allowa us to have the Price column to be used more than once in the Column field to depict the AVERAGE PRICE, MAX PRICE, MIN PRICE, MIN PRICE , STD DEVIATION PRICE, etc. I think this become a serious limitation in terms of analysis.

The workaround I supposed it to have as many Data Pilots which becomes a little cumbersome and time consuming.

Sajan

Using the same column more than once

BauMal's picture

You could just make another column right next to the original column, link to it by typing "=" and clicking the originating cell. Make sure to use a different header in the first row. This way you can actually use the very same data for several purposes, like both average and sum.

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix