OpenOffice.org Calc: Pivot tables by another name

DataPilots are OpenOffice.org Calc's equivalent of what MS Excel and other spreadsheets call pivot tables. Under any name, they are a tool for extracting and summarizing the information contained in spreadsheet cells in a more convenient form. Using a DataPilot, you can immediately see relationships between different pieces of data that would be difficult -- if not impossible -- to find using formulas, and tedious to extract manually. In effect, a DataPilot gives you something of the power of using a database without actually switching out of a spreadsheet. Small wonder, then, that over half of spreadsheet users are said to use datapilots or pivot tables.

To understand the usefulness of DataPilots, imagine that you are a manufacturer selling Ogg Vorbis music players in the North American market. Your product comes in two colors, beige and black, and in 80, 150 and 300 megabyte sizes, and is sold in sales packs. For each sale, you record the number of display packs sold, and the total price. On a spreadsheet, some of your data might look like this:

Country Color Size Quantity Price
Canada Beige 80 1 $500.00
USA Black 150 5 $5000.00
Mexico Beige 80 1 $500.00
Mexico Beige 80 2 $1000.00
US Black 150 3 $3000.00
US Beige 300 2 $4000.00
US Beige 300 7 $14,000.00
Canada Black 300 4 $8000.00
Total $36,000.00




Notice that the columns all have labels. These labels are a pre-requisite for working with DataPilots.

When you come to analyze this data, you might want to know such questions as how many units of each color you sold, or how many units in each country. You could find this information by setting up a combination of filters and formulas, but creating a DataPilot is much quicker.

For example, to quickly find out how many sales packs you sold in each country, you could create the following DataPilot:

Filter
Country -all-
Quantity
1 $1,000.00
2 $5000.00
3 $3000.00
4 $8000.00
5 $5000.00
7 $14,000.00




Sales for all countries are shown by default. However, if you use the Country filter at the top of the DataPilot, you can see the sales only for Canada:

Filter
Country  Canada
Quantity
1 $500.00
4 $8000.00




The gray cells in each DataPilot represent filters that you can use to modify it. By clicking the one marked Filter, you can change the information displayed in the DataPilot.

In addition, you can drag the other filters into new positions to change the information display. For example, if you drag the Country filter to the right of the Quantity column in the existing datapilot, it now displays the quantity sold broken down by country, with the first few rows reading:

Filter
Quantity Country
1 Canada $500.00
Mexico $500.00
2 Mexico $1000.00
US $4,000.00




As you can see, an datapilot is an ideal way of obtaining new perspectives on your data with a minimum of effort.


Creating a DataPilot

To begin creating an datapilot, highlight the range of cells you want to base it upon, then select Data -> DataPilot -> Start to open the DataPilot dialog window. Alternatively, choose the same menu item, then select a data source that you have already registered with OpenOffice.org using File -> New -> Database and a range of cells from it.

The DataPilot window gives you a diagram of the DataPilot that you are creating, and a list of columns from the data source. To create the general layout for the DataPilot, all you have to do is drag the columns to one of the blank spaces on the diagram. If you drag a column name to the Column fields or Row fields space, then it becomes the first cell in a row or column, just as you might expect from the name (in the first DataPilot above, Quantity was selected as the column, and no row was chosen). Similarly, if you drag a column name to the Data fields, it becomes the data in the DataPilot (in the first example above, the Price). The only potentially puzzling choice is the Page Fields, which is actually just the custom filter for changing the contents of the DataPilot on the fly (in the first example, the Country). If you make a mistake, you can drag the column back to the list of column building blocks on the right.

Once you have done the basic setup, you can also choose what function to use in the DataPilot. In the examples above, I simply used the default Sum function, which for many purposes is all that you need. However, you can also use another ten basic functions: Count, Average, Max, Min, Product, Count (Numbers Only), StDev (Sample), StDevP (Population), Var (Sample) and Var (Population). If necessary, you can find details about what these functions do in OpenOffice.org's online help.

By default, the DataPilot is written directly below the range it is based upon. However, if you select the More button in the dialog window, you can set the cells or sheet to which it is inserted. You can also set additional options, such as ignoring empty rows. However, for the most part, you can safely ignore these options, especially when just starting to work with DataPilots, because the defaults are the ones you are likely to want most of the time anyway.

After you create the DataPilot, selecting part of it enables Data ->DataPilot -> Refresh if you need to update it because of a change to the source information. The same sub-menu contains a Delete item that you can use when you no longer need the DataPilot.


Conclusion

Learning to work with DataPilots can take some time. New users need to learn that columns, not rows, should contain labels. They also have to keep track of which columns contain data and which do not: If those that do not are dragged to the Data fields in the dialog window, then the result will be nonsense. However, for most people, only a few trial and errors should be needed to master the basics of DataPilots.

Soon, you'll be reading your data in ways you never considered. After that, what you do with DataPilots is up to your imagination.


Bruce Byfield is a computer journalist who writes for Datamation, Linux.com, and Linux Journal

______________________

--
Bruce Byfield (nanday)

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Data Pilot

Sajan's picture

Hi,

I don't think Open Office allowa us to have the Price column to be used more than once in the Column field to depict the AVERAGE PRICE, MAX PRICE, MIN PRICE, MIN PRICE , STD DEVIATION PRICE, etc. I think this become a serious limitation in terms of analysis.

The workaround I supposed it to have as many Data Pilots which becomes a little cumbersome and time consuming.

Sajan

Using the same column more than once

BauMal's picture

You could just make another column right next to the original column, link to it by typing "=" and clicking the originating cell. Make sure to use a different header in the first row. This way you can actually use the very same data for several purposes, like both average and sum.

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState