# SciPy for Scientists

In my last article, I looked at NumPy and some of its uses in numerical simulations. Although NumPy does provide some really robust building blocks, it is a bit lacking in more sophisticated tools. SciPy is one of the many Python modules that build on NumPy. In fact, SciPy has become the de facto science package in Python programming. If you have a scientific problem you are trying to solve, you could do worse than starting with SciPy. Not only are there more advanced functions and objects available to do linear algebra, but there also are functions and objects to handle calculus, interpolation, signal processing and Fast Fourier Transforms, among others. So many functions are available that they are grouped together into sub-packages. In this article, I take a quick look at what sorts of functions are available and how to use them to get some serious work done.

To start, you need to import the main scipy module. You would do this with the usual:

```
import scipy
```

This imports the common set of functions and objects used in SciPy. Older SciPy releases also expose the most-used parts of NumPy through this namespace, because they are so fundamental to the work for which SciPy is used, but that behavior has since been deprecated. If you need anything from NumPy, import the NumPy module explicitly; in many cases, that is something you will want to do anyway. All of the extra functions in the individual sub-packages need to be imported explicitly. So, if you want to do some signal processing, you would need to use this:

```
from scipy import signal
```

The simplest package in SciPy probably is the constants sub-package. This package provides a basic set of the most-used physical constants, like pi or Avogadro's number. It also includes a much larger set of constants from the 2010 CODATA database. Each of these physical constants is stored as a tuple of value, unit and uncertainty, and they range from the alpha particle mass to the Wien wavelength displacement law constant.

The scipy.misc sub-package contains all of those bits and pieces that don't really fit anywhere else. Here, you can find functions like factorial (to calculate the factorial of a number) and imread (to read an image file into Python).
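Returning to the constants sub-package for a moment, the following is a minimal sketch of how those values can be looked up. It assumes the dictionary key `'alpha particle mass'` is present in the CODATA set bundled with your SciPy version; the exact digits printed depend on that release.

```python
from scipy import constants

# Commonly used constants are plain module attributes.
print(constants.pi)
print(constants.Avogadro)

# CODATA entries live in a dictionary of (value, unit, uncertainty) tuples.
value, unit, uncertainty = constants.physical_constants['alpha particle mass']
print(value, unit, uncertainty)
```

Unpacking the tuple this way keeps the unit and uncertainty next to the number, which is handy when you need to propagate errors.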

Linear algebra accounts for much of the heavy lifting in computational code. SciPy includes a sub-package called linalg, which is a wrapper for the package linalg within NumPy. All of the functionality from NumPy is included in scipy.linalg, along with several other functions. In the NumPy module, these linear algebra functions may or may not be handled by external libraries, depending on how NumPy was compiled. With SciPy, this is no longer an option. It needs to be compiled against LAPACK and BLAS libraries, such as those provided by ATLAS, to handle the actual numerical work in an optimized fashion. There are functions to handle things like finding an inverse, determinant or transpose of a matrix. If you need to solve a system of equations, you can do so with a single function call. If you start with a coefficient matrix, A, and a right-hand side vector, b, you can find the solution vector for your system with:

```
from scipy import linalg
# A is the coefficient matrix, b the right-hand side vector
x = linalg.solve(A, b)
```
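To make that call concrete, here is a small sketch with a made-up two-equation system (3x + y = 9 and x + 2y = 8). The matrix and vector values are assumptions chosen only so the answer is easy to verify by hand; the transpose shown is the NumPy array attribute rather than a scipy.linalg function.

```python
import numpy as np
from scipy import linalg

# Hypothetical system: 3x + y = 9 and x + 2y = 8
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = linalg.solve(A, b)
print(x)                # solution vector

print(linalg.det(A))    # determinant
print(linalg.inv(A))    # inverse
print(A.T)              # transpose, via the NumPy array attribute
```

Calling solve directly is generally preferable to computing the inverse and multiplying, since it avoids forming the inverse matrix at all.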

In many physics and engineering problems, you need to find eigenvalues and eigenvectors. The linalg sub-package provides very fast functions for doing that as well.
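As a rough sketch of what that looks like, the example below uses `linalg.eigh`, which is intended for symmetric (or Hermitian) matrices; for a general square matrix, `linalg.eig` is the usual choice. The 2x2 matrix is a hypothetical one picked so the eigenvalues can be checked by hand.

```python
import numpy as np
from scipy import linalg

# Hypothetical symmetric matrix with eigenvalues that are easy to check by hand.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns the eigenvalues in ascending order and the eigenvectors as columns.
eigenvalues, eigenvectors = linalg.eigh(A)
print(eigenvalues)

# Each column of `eigenvectors` pairs with the eigenvalue at the same index.
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    print(np.allclose(A @ v, eigenvalues[i] * v))
```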

Most people default to using R to do statistics, but you don't have to. SciPy includes a stats sub-package that provides many of the functions you will need in the majority of cases. The describe function will give you the basic statistical description of a vector of samples. This includes the mean, variance, skew and kurtosis. Once you have some basic statistics, you probably will want to run a t-test to see how well your data matches your model. You can do this with something like:

```
from scipy import stats
stats.ttest_1samp(x, m)
```

where x is your data and m is the population mean your model predicts. This will give you a t-statistic and a p-value. Just as in R, there are many more complicated statistical functions available to you.
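Putting describe and the t-test together, a minimal sketch might look like the following. The data here is synthetic, generated with a seeded NumPy random generator purely for illustration, and the expected mean of 5.0 is an assumed model value.

```python
import numpy as np
from scipy import stats

# Hypothetical data: 50 samples drawn from a normal distribution centered on 5.0
rng = np.random.default_rng(seed=42)
x = rng.normal(loc=5.0, scale=1.0, size=50)

# Basic statistical description of the sample
summary = stats.describe(x)
print(summary.mean, summary.variance, summary.skewness, summary.kurtosis)

# One-sample t-test against the mean the model predicts
m = 5.0
result = stats.ttest_1samp(x, m)
print(result.statistic, result.pvalue)
```

A large p-value here means the sample gives no strong evidence that its mean differs from m.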

A topic near and dear to my heart is solving differential equations. SciPy can help with that task too. The sub-package you need is named integrate. It contains two sets of integration functions: one that takes a function object as the input and one that takes a set of fixed samples. You can do single, double and triple integrations on a function object with the functions quad, dblquad and tplquad. If you have data from some experiment, you can integrate it with the trapezoidal rule, Simpson's rule or Romberg integration. If you are working with ordinary differential equations, some special functions are available. The function odeint will solve a set of ordinary differential equations with a given set of initial conditions.
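The sketch below walks through each of those cases on problems with known answers. It assumes a reasonably recent SciPy in which the sample-based routines are named `trapezoid` and `simpson` (older releases called them `trapz` and `simps`); the test function, sample grid and decay equation are all illustrative choices.

```python
import numpy as np
from scipy import integrate

# Function-object integration: sin(x) from 0 to pi (the exact answer is 2).
area, error_estimate = integrate.quad(np.sin, 0.0, np.pi)
print(area, error_estimate)

# Fixed-sample integration of the same curve, as if the points came from an experiment.
xs = np.linspace(0.0, np.pi, 101)
ys = np.sin(xs)
trap_area = integrate.trapezoid(ys, x=xs)
simp_area = integrate.simpson(ys, x=xs)
print(trap_area, simp_area)

# ODE: dy/dt = -y with y(0) = 1, whose exact solution is exp(-t).
def decay(y, t):
    return -y

t = np.linspace(0.0, 5.0, 51)
y = integrate.odeint(decay, [1.0], t)   # one row per time point, one column per equation
print(y[-1, 0], np.exp(-5.0))
```

Note that odeint expects the right-hand side function to take the state first and the time second.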

Last, but not least, let's look at the weave sub-package. Even though SciPy already is full-featured, it can't cover every eventuality. Although you always can write the code in pure Python for whatever piece is missing, sometimes you need to squeeze every last cycle out of your hardware. In those cases, you probably want to write some optimized C code to do the heavy lifting. Although you could write this and compile it as an external object file, that is far too much work for any self-respecting programmer. Enter the weave sub-package.

With weave, you can add C code from within your Python program in a number of ways. The most direct is the inline function. With this, you can write out your C or C++ code, compile it and run it directly within your Python program. All of your Python objects are available within the scope of your inlined code. The contents of any mutable objects are changeable from within your C/C++ code. If you want to return results to your Python program, these are available in a special variable called "return_val". A trivial example, from the SciPy documentation, uses printf to show how the inline function works:

```
import weave
a = 1
weave.inline('printf("%d\\n",a);',['a'])
```

The general form for the inline function is a string containing the code to compile and run, and a list of the Python variables to make available to the C/C++ code. If you have a larger fragment of code you want to inline, you can use triple quotes to define a code block and save it to a variable first. For example, you may have something like:

```
code = """
for (int i=0; i<a; i++) {
    printf("%d\\n", i);
}"""
weave.inline(code, ['a'])
```

Another way to speed up your code is to let Python do it for you with the blitz function. In this case, blitz takes a NumPy expression, generates C++ code from it and compiles that code into an external module. The first time you do this, it may take several minutes to generate the code and compile it. Once this is done, the compiled object file is stored so it can be reused the next time the expression is called, and you can see a speedup of 2 to 10 times over straight Python code. The compiled module also persists after Python closes, so you can reuse it the next time you run your Python code.

Now you have some tools available to do some real scientific computations. In my next article, I'll look at matplotlib, one of the ways available to visualize all of this computational work you have been doing. Until then, get some science done.

Joey Bernard has a background in both physics and computer science. This serves him well in his day job as a computational research consultant at the University of New Brunswick. He also teaches computational physics and parallel programming.
