Embedding Python in Multi-Threaded C/C++ Applications
Developers often make use of high-level scripting languages as a way of quickly writing flexible code. Various shell scripting languages have long been used to automate processes on UNIX systems. More recently, software applications have begun to provide scripting layers that allow the user to automate common tasks or even extend the feature set. Think of all the well-known applications you use: GIMP, Emacs, Word, Photoshop, etc. It seems as though all can be scripted in some way.
In this article, I will describe how you can embed the Python language within your C applications. There are many reasons you would want to do this. For instance, you may want to provide your more advanced users with the ability to alter or customize the program. Or maybe you want to take advantage of a Python capability, rather than implement it yourself. Python is a good choice for this task because it provides a clean, intuitive C API. Since many complex applications are written using threads, I will also show you how to create a thread-safe interface to the Python interpreter.
All the examples assume you are using Python version 1.5.2, which comes pre-installed on most recent Linux distributions. The API to access the Python interpreter is the same for both C and C++. There are no special C++ constructs used, and all functions are declared extern “C”. For this reason, the concepts described and the example code given here should work equally well when using either C or C++.
There are two ways that C and Python code can work together within the same process. Simply put, Python code can call C code or C code can call Python code. These two methods are called “extending” and “embedding”, respectively. When extending, you create a new Python module implemented in C/C++. This allows you to provide new functionality to the Python language that cannot be implemented in Python. For instance, several core Python modules such as “time” and “nis” are implemented as C extensions, while others are written in Python. You never notice the difference between C and Python modules, because the act of importing and using these modules is the same. If you look around in your /usr/lib/python1.5 directory, you may see some shared library files (extension .so). These are Python module extensions written in C. You will also see various Python files (extension .py) which are modules written in Python.
Typically, when you embed Python, you will develop a C/C++ application that has the ability to load and execute Python scripts. The application will be linked against the Python interpreter library, called libpython1.5.a, which provides all functionality related to evaluating Python code. There is no Python executable involved, only an API for your application to use.
Embedding Python is a relatively straightforward process. If your goal is merely to execute vanilla Python code from within a C program, it's actually quite easy. Listing 1 is the complete source to a program that embeds the Python interpreter. This illustrates one of the simplest programs you could write making use of the Python interpreter.
Listing 1 uses three Python-specific function calls. Py_Initialize starts up the Python interpreter library, causing it to allocate whatever internal resources it needs. You must call this function before calling most other functions in the Python API. PyRun_SimpleString provides a quick, no-frills way to execute arbitrary Python code. Interpretation of the code is immediate. In Listing 1, for instance, the import sys line causes Python to import the sys module before returning control to the C/C++ program. Each string passed to PyRun_SimpleString must be a complete Python statement of some kind. In other words, half statements are illegal, even if they are completed with another call to PyRun_SimpleString. For example, the following code will not work properly:
// Python will print first error here
PyRun_SimpleString("import ");
// Python will print second error here
PyRun_SimpleString("sys\n");
Py_Finalize is the last Python function that any application embedding Python must call. This function shuts down the interpreter and frees any resources it allocated during its lifetime. You should call it when you are completely finished using the Python library. When you call Py_Finalize, Python will unload all imported modules one by one. Many modules must execute their own clean-up code when they are unloaded in order to free any global resources they may have allocated. For this reason, calling Py_Finalize can have the side effect of causing quite a bit of other code to run.
PyRun_SimpleString is just one way to execute Python code from within your C applications. In fact, there is a whole collection of similar high-level functions. PyRun_SimpleFile is just like PyRun_SimpleString, except it reads its input from a FILE pointer rather than a character buffer. See the Python documentation at www.python.org/docs/api/veryhigh.html for complete documentation on these high-level functions.
In addition to evaluating Python scripts, you can also manipulate Python objects and call Python functions directly from your C code. While this involves more complex C code than using PyRun_SimpleString, it also allows access to more detailed information. For example, you can access objects returned from Python functions or determine whether an exception has been raised.