OpenOffice.org Calc: Basic arithmetic and statistic functions

For novices, functions are one of the most intimidating features of OpenOffice.org's Calc. Newcomers quickly learn that functions are a main feature of spreadsheets, but there are almost four hundred, and many require input that assume specialized knowledge. Nor are the often circular definitions of the Function Wizard much help: You are no better off for reading that the SKEW function "returns the skewness of a distribution," for example. However, Calc includes dozens of functions that anyone can use, the most basic of which create formulas for basic arithmetic or for evaluating numbers in a range of cells.

But before plunging into the examples below, take the time to review the basic structure of functions. A function is the building block for a spreadsheet equation. Technically, it becomes a formula when arguments are entered into it, although the two terms are often used interchangeablely. Both functions and formulas begin with an equal sign, and take either numbers or cell addresses as arguments. In some cases, an argument takes a specific number, which has a particular meaning you probably have to look up.

Cell addresses consist of a letter for the column and a number for the row, so that A1 is the top left cell. You can enter a range of cells by specifying the first and the last cell in the range, separated by a colon; for example, A1:A4 includes cells A1 and A4, as well as A2 and A3.

You can enter formulas using either the formula bar or by selecting Insert -> Function, which opens the Function Wizard. The Function Wizard is the easiest method for beginners, since you can choose a range of cells from it with the mouse, the Wizard's window conveniently minimizing so that you can see the spreadsheet more easily. The Wizard also includes input fields and displays errors, both of which can become useful as you become more experienced.

With this quick recap in mind, you are now ready to try some useful simple functions:

Basic arithmetic

The most basic functions in Calc are simple arithmetic: Addition, subtraction, multiplication, and division (or, as Alice in Wonderland calls them, Ambition, Distraction, Uglification, and Derision).

Except for subtraction, each of these operations has its own function. You can use SUM for addition, PRODUCT for multiplication, and QUOTIENT for division. Traditionally, subtraction does not have a function, apparently on the grounds that it is too simple for one to be any of benefit -- although the same could be said, it seems to me, about the other basic functions.

SUM, PRODUCT and QUOTIENT are useful for entering ranges of cells in the same way as any other function, with arguments in brackets after the function name.

However, for basic equations, many users prefer the time-honored computer symbols for these operations, using the plus sign (+) for addition, the hyphen (-) for subtraction, the asterisk (*) for multiplication and the forward slash (/) for division. These symbols are quick to enter without requiring your hands to stray from the keyboard.

A similar choice is also available if you want to raise a number by the power of another. Instead of entering =POWER(A1;2), you can enter =A1^2.

Moreover, they have the advantage that you enter formulas with them in an order that more closely approximates human readable format than the spreadsheet-readable format used by the equivalent function. For instance, instead of entering =SUM (A1:A2), or possibly =SUM (A1;A2), you enter =A1+A2. This almost-human readable format is especially useful for compound operations, where writing =A1*(A2+A3) is briefer and easier to read than =PRODUCT(A1;SUM(A2:A3)).

The main disadvantage of using arithmetical operators is that you cannot directly use a range of cells. In other words, to enter the equivalent of =SUM (A1:A3), you would need to type =A1+A2+A3

.

Otherwise, whether you use a function or an operator is largely up to you -- except, of course, when you are subtracting. However, if you use spreadsheets regularly in a group setting such as a class or an office, you might want to standardize on an entry format so that everyone who handles a spreadsheet becomes accustomed to a standard input.

Simple statistics

Another common use for spreadsheet functions is to pull useful information out of a list, such as a series of test scores in a class, or a summary of earnings per quarter for a company.

You can, of course, scan a list of figures if you want basic information such as the highest or lowest entry or the average. The only trouble is, the longer the list, the more time you waste and the more likely you are to miss what you're looking for. Instead, it is usually quicker and more efficient to enter a function. Such reasons explain the existence of a function like COUNT, which does no more than give the total number of entries in the designated cell range.

Similarly, to find the highest or lowest entry, you can use MIN or MAX. For each of these formulas, all arguments are either a range of cells, or a series of cells entered individually.

Each also has a related function, MINA or MAXA, which performs the same function, but treats a cell formatted for text as having a value of 0 (The same treatment of text occurs in any variation of another function that adds an "A" to the end). Either function gives the same result, and could be useful if you used a text notation to indicate, for example, if any student were absent when a test was written, and you wanted to check whether you needed to schedule a makeup exam.

For more flexibility in similar operations, you could use LARGE or SMALL, both of which add a specialized argument of rank. If the rank is 1 used with LARGE, you get the same result as you would with MAX. However, if the rank is 2, then the result is the second largest result. Similarly, a rank of 2 used with SMALL gives you the second smallest number. Both LARGE and SMALL are handy as a permanent control, since, by changing the rank argument, you can quickly scan multiple results.

You would need to be an expert to want to find the Poisson Distribution of a sample, or to find the skew or negative binominal of a distribution (and, if you are, you will find functions in Calc for such things). However, for the rest of us, there are simpler statistical functions that you can quickly learn to use.

In particular, if you need an average, you have a number to choose from. You can find the arithmetical means -- that is, the result when you add all entries in a list then divided by the number of entries by enter a range of numbers when using AVERAGE, or AVERAGE A to include text entries and to give them a value of zero.

In addition, you can get several other forms of averages:

  • MEDIAN: The entry that is exactly half way between the highest and lowest number in a list.
  • MODE: The most common entry in a list of numbers.
  • QUARTILE:The entry at a set position in the array of numbers. Besides the cell range, you enter the type of Quartile: O for the lowest entry, 1 for the value of 25%, 2 for the value of 50%, 3 for 75%, and 4 for the highest entry. Note that the result for types 1 through 3 may not represent an actual item entered.
  • RANK: The position of a given entry in the entire list, measured either from top to bottom or bottom to top. You need to enter the cell address for the entry, the range of entries, and the type of rank (0 for the rank from the highest, or 1 for the rank from the bottom.
Some of these functions overlap; for example, MIN and MAX are both covered by QUARTILE. In other cases, a custom sort or filter might give much the same result. Which you use depends on your temperament and your needs. Some might prefer to use MIN and MAX because they are easy to remember, while others might prefer QUARTILE because it is more versatile.

Using these functions

In some cases, you may be able to get similar results to some of these functions by setting up a filter or a custom sort. However, in general, functions are more easily adjusted than filters or sorts, and provide a wide range of possibilities.

At times, you may just want to enter one or more formula temporarily in a convenient blank square, and delete it once you have satisfied your curiosity. However, if you find yourself using the same functions constantly, you should consider creating a template and including space for all the functions you use, with the cell to their left used as a label for them. Once you have created the template, you can easily update each formula as entries change, either automatically and on-the-fly or pressing the F9 key to update all selected cells. The choice, really, is yours.

No matter how you use these functions, you will probably find them simple to use and adaptable to many purposes. And, by the time you have mastered this handful, you will be ready to try more complex functions.

______________________

-- Bruce Byfield (nanday)

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

sort by date...

Anonymous's picture

my question is can open office(or any other) use a rank function to sort a range of dates and list only the dates that are before the current date? its been driving me crazy.

subtract function

Nedd Ludd's picture

Your reason for not including a subtract functions is, well, not very good. The reason there is no subtract function is because it isn't needed. Subtraction is handled by the sum().

In math "sum" has two meanings. The first is the amount obtained by adding. The other though is the 'whole amount, quantity, number or aggregate'. http://www.thefreedictionary.com/sum

The sum() function finds the 'quantity' not the 'result from adding'.

If one asks the mathematical question, "what is my quantity of apples if I start with 5, add 3 then taking away 4". Using 'sum' logic I can describe the equation as x = 5 + 3 - 4. If 'sum' literally meant 'to add' (and only add) then the equation would be x = 5 + 3 + (-4) and my question would be, "what is the quantity of apples when I start with 5, add 3 then add a deduction of 4".

Nice article.

Does OO Calc have a good way to handle significant digits? I've never know a spreadsheet that works well with significant digits so my question isn't really a slam against OO.

A scoop holds 356 grains of rice.
Each grain of rice weights .0217 grams
How much will 7 scoops weigh?
The correct answer is not 54.0764 when using significant digits.
I "think" the correct answer would be 54.077.

I was first introduced to significant digits in my chemistry class 20 years ago. I lost a grade because I never understood them. I've been on a quest ever since to figure out how to use a computer to help solve my significant digit shortcomings.

Correction to previous

Jukka Lind's picture

Should calculate anything without openoffice...
The sentence in my previous message should be:
"Thus the MEDIAN(1,2,3,100) would be 50.5 and this is not the case."

Bruce, the thing that is

Anonymous's picture

Bruce, the thing that is driving me crazy about OO Calc is how slow it is in doing something it calls 'adjusting row height'. Takes over a minute on an AMD 4000+ with 2G memory to open it. I have a workbook with around 15,000 rows of data in one sheet. This is a record of transactions. I then use an array formula to derive from this a total of items sold and value by month. So my array is about 6 monthly columns, of about 250 rows of items.

Maybe one should not do this in a spreadsheet at all, but at least if you do, a secretary can maintain and understand it.

I am looking at rewriting it in awk, it is so slow. Do you have any idea what is going on?

Definition of MEDIAN-function

Jukka Lind's picture

The definition of MEDIAN-fuction is far from real.
"MEDIAN: The entry that is exactly half way between the highest and lowest number in a list."
Thus the MEDIAN(1,2,3,100) would be 100.5 and this is not the case.

The correct definition of MEDIAN is that equal amout of values are smaller than greater then the MEDIAN value. And the MEDIAN(1,2,3,4,100) is 3.

The function can be used to estimate a set with some error spikes. The spikes cause the AVERAGE-function show value that is untypical to the set.

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState