Embedding Python in Multi-Threaded C/C++ Applications

Python provides a clean, intuitive interface to complex, threaded applications.

Enabling Thread Support

Before your threaded C program is able to make use of the Python API, it must call some initialization routines. If the interpreter library is compiled with thread support enabled (as is usually the case), you have the runtime option of enabling threads or not. Do not enable runtime threading support unless you plan on using threads. If runtime support is not enabled, Python will be able to avoid the overhead associated with mutex locking its internal data structures. If you are using Python to extend a threaded application, you will need to enable thread support when you initialize the interpreter. I recommend initializing Python from within your main thread of execution, preferably during application startup, using the following two lines of code:

// initialize Python
Py_Initialize();
// initialize thread support
PyEval_InitThreads();

Both functions return void, so there are no error codes to check. You can now assume the Python interpreter is ready to execute Python code. Py_Initialize allocates global resources used by the interpreter library. Calling PyEval_InitThreads turns on the runtime thread support. This causes Python to enable its internal mutex lock mechanism, used to serialize access to critical sections of code within the interpreter. This function also has the side effect of locking the global interpreter lock. Once the function completes, you are responsible for releasing the lock. Before releasing the lock, however, you should grab a pointer to the current PyThreadState object. You will need this later in order to create new Python threads and to shut down the interpreter properly when you are finished using Python. Use the following bit of code to do this:

PyThreadState * mainThreadState = NULL;
// save a pointer to the main PyThreadState object
mainThreadState = PyThreadState_Get();
// release the lock
PyEval_ReleaseLock();

Creating a New Thread of Execution

Python requires a PyThreadState object for each thread that is executing Python code. The interpreter uses this object to manage a separate interpreter data space for each thread. In theory, this means that actions taken in one thread should not interfere with the state of another thread. For instance, if you throw an exception in one thread, the other snippets of Python code keep running as if nothing happened. You must help Python to manage per-thread data. To do this, manually create a PyThreadState object for each C thread that will execute Python code. In order to create a new PyThreadState object, you need a pre-existing PyInterpreterState object. The PyInterpreterState object holds information that is shared across all cooperating threads. When you initialized Python, it created a PyInterpreterState object and attached it to the main PyThreadState object. You can use this interpreter object to create a new PyThreadState for your own C thread. Here's some example code which does just that (ignore line wrapping):

// get the global lock
PyEval_AcquireLock();
// get a reference to the PyInterpreterState
PyInterpreterState * mainInterpreterState = mainThreadState->interp;
// create a thread state object for this thread
PyThreadState * myThreadState = PyThreadState_New(mainInterpreterState);
// free the lock
PyEval_ReleaseLock();
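
As a rough illustration of the relationship just described, the following plain-C sketch models one shared interpreter record and several per-thread records. The names DemoInterp, DemoThread, demo_thread_new and demo_thread_delete are hypothetical stand-ins invented for this sketch; they are not part of the Python C API, and the real PyInterpreterState and PyThreadState structures hold far more. The only point is that every per-thread record points back to the same shared state while keeping its own error indicator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for PyInterpreterState: data shared by all threads. */
typedef struct DemoInterp {
    int thread_count;          /* number of live per-thread records */
} DemoInterp;

/* Hypothetical stand-in for PyThreadState: data private to one thread. */
typedef struct DemoThread {
    DemoInterp *interp;        /* back-pointer to the shared record */
    int error_set;             /* per-thread "pending exception" flag */
} DemoThread;

/* Create a per-thread record attached to the shared record.
   Returns NULL if the allocation fails. */
static DemoThread *demo_thread_new(DemoInterp *interp)
{
    DemoThread *t = (DemoThread *)malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->interp = interp;
    t->error_set = 0;
    interp->thread_count++;
    return t;
}

/* Detach and free a per-thread record. */
static void demo_thread_delete(DemoThread *t)
{
    if (t == NULL)
        return;
    assert(t->interp->thread_count > 0);
    t->interp->thread_count--;
    free(t);
}
```

In this model, setting error_set on one DemoThread leaves every other DemoThread untouched, which mirrors the claim that an exception raised in one thread does not disturb the others.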

Executing Python Code

Now that you have created a PyThreadState object, your C thread can begin to use the Python API to execute Python scripts. You must adhere to a few simple rules when executing Python code from a C thread. First, you must hold the global interpreter lock before doing anything that alters the current thread state. Second, you must load your thread-specific PyThreadState object into the interpreter before executing any Python code. Once you have satisfied these constraints, you can execute arbitrary Python code by using functions such as PyRun_SimpleString. Remember to swap out your PyThreadState object and release the global interpreter lock when done. Note the symmetry of “lock, swap, execute, swap, unlock” in the code:

// grab the global interpreter lock
PyEval_AcquireLock();
// swap in my thread state
PyThreadState_Swap(myThreadState);
// execute some python code
PyRun_SimpleString("import sys\n");
// the newline inside the Python string literal must be escaped twice
PyRun_SimpleString("sys.stdout.write('Hello from a C thread!\\n')\n");
// clear the thread state
PyThreadState_Swap(NULL);
// release our hold on the global interpreter
PyEval_ReleaseLock();
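
The two rules above can also be pictured as a tiny state machine. The sketch below is a toy model in plain C, not Python API code: ToyInterp, ToyThread and the toy_* functions are hypothetical names made up for illustration, the "lock" is just a flag (there are no real threads here), and "executing" merely bumps a counter. It shows that execution is refused unless the lock is held and a thread state is loaded, and that the full sequence leaves the interpreter unlocked with nothing swapped in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread record; stands in for PyThreadState. */
typedef struct ToyThread {
    int executed;              /* number of snippets this thread has run */
} ToyThread;

/* Hypothetical interpreter record: one lock flag, one "current" slot. */
typedef struct ToyInterp {
    bool locked;               /* stands in for the global interpreter lock */
    ToyThread *current;        /* thread state currently swapped in, or NULL */
} ToyInterp;

/* Take the lock. Fails instead of blocking, since the model is single-threaded. */
static bool toy_acquire(ToyInterp *in)
{
    if (in->locked)
        return false;
    in->locked = true;
    return true;
}

/* Give the lock back. */
static void toy_release(ToyInterp *in)
{
    assert(in->locked);
    in->locked = false;
}

/* Swap a thread state in (or NULL to clear); returns the previous one. */
static ToyThread *toy_swap(ToyInterp *in, ToyThread *next)
{
    ToyThread *prev = in->current;
    assert(in->locked);        /* rule 1: hold the lock first */
    in->current = next;
    return prev;
}

/* "Execute" a snippet: legal only with the lock held and a state loaded. */
static bool toy_execute(ToyInterp *in)
{
    if (!in->locked || in->current == NULL)
        return false;
    in->current->executed++;
    return true;
}

/* The full sequence: lock, swap, execute, swap, unlock. */
static bool toy_run(ToyInterp *in, ToyThread *me)
{
    bool ok;
    if (!toy_acquire(in))
        return false;
    toy_swap(in, me);
    ok = toy_execute(in);
    toy_swap(in, NULL);
    toy_release(in);
    return ok;
}
```

Calling toy_execute on a fresh ToyInterp returns false, while toy_run succeeds and restores the idle state afterwards; that symmetry is what the real code has to maintain by hand.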
______________________

Comments

Still getting crashes...

Anonymous's picture

Thanks for the article; it helped me understand the GIL a little better.
Since Python 2.3 you can handle the whole GIL locking business with the PyGILState_Ensure and PyGILState_Release functions. Look at my code:

class CExecuteHandler {
public:
	CExecuteHandler(CHandler *, PyObject *);
	~CExecuteHandler();

	CHandler *handler; 
	/* a class where python functions are saved in a vector*/
	PyObject *args;
};

// Thread entry point. _beginthread expects a function returning void,
// while pthread_create expects one returning void *.
#ifdef WIN32
void _ExecuteHandler(void *_ExecHandler) {
#else
void *_ExecuteHandler(void *_ExecHandler) {
#endif
	CExecuteHandler *ExecHandler = (CExecuteHandler *)_ExecHandler;
	CHandler *Handler = ExecHandler->handler;

	// m_PyFunctions is assumed to be a vector<PyObject *>
	for( vector<PyObject *>::iterator j = Handler->m_PyFunctions.begin(); 
	     j != Handler->m_PyFunctions.end(); 
	     j++ ) {
		PyGILState_STATE gilState = PyGILState_Ensure();
		PyObject *result = PyObject_CallObject( *j, ExecHandler->args );
		if(!result) PyErr_Print();
		else Py_DECREF(result);
		PyGILState_Release(gilState);
	}

	delete ExecHandler;
#ifdef WIN32
	_endthread();
#else
	pthread_exit(NULL);
#endif
}

void ExecuteHandler(CHandler *i, PyObject *args) {
	CExecuteHandler *ExecHandler = new CExecuteHandler( i, args );

#ifdef WIN32
	_beginthread( _ExecuteHandler, 0, (void *)ExecHandler );
#else
	pthread_t thread;
	pthread_create( &thread, NULL, 
	                _ExecuteHandler, (void*)ExecHandler );
#endif
}

Okay, this was basically the code. So again, the handler class saves a Python function pointer for a certain event. For example (let's suppose you coded a chat program), if I want to call a Python function when someone sends a message to others, I call the CHandler matching "ChatMessage" with arguments built like Py_BuildValue("(ss)", playerName, message) and call ExecuteHandler(handler, args /* built with above BuildValue */). The problem is that if someone spams excessively and many, many threads call the function, the program eventually crashes.

Full code can be seen at:
http://pyghost.googlecode.com

Using PyGILState_Ensure/PyGILState_Release

Gwang-Ho Kim's picture

Constructor:
-----------
PyGILState_Ensure ONLY ensures that one thread keeps using the same PyThreadState;
if two threads call PyGILState_Ensure,
one thread might invalidate the PyThreadState of the other WITHOUT locking!
(See the Python source, Python/pystate.c.)
PyGILState_Ensure:

tcur = (PyThreadState *)PyThread_get_key_value(autoTLSkey);
if (tcur == NULL) {
	/* Create a new thread state for this thread */
	tcur = PyThreadState_New(autoInterpreterState);
	if (tcur == NULL)
		Py_FatalError("Couldn't create thread-state for new thread");
	/* This is our thread state!  We'll need to delete it in the
	    matching call to PyGILState_Release(). */
	tcur->gilstate_counter = 0;
	current = 0; /* new thread state is never current */
}
else
	current = PyThreadState_IsCurrent(tcur);
if (current == 0)
	PyEval_RestoreThread(tcur);

Locking is done in PyEval_RestoreThread (see the source in Python/ceval.c),
which is called only if current == 0, i.e., when
there is no saved PyThreadState (_PyThreadState_Current in terms of pystate.c).
So one MUST call PyEval_SaveThread, not just PyEval_ReleaseLock!!!

mainThreadState = PyEval_SaveThread();

Destructor:
-----------
Since there is no explicit PyThreadState in the main thread (see the constructor above),
one MUST restore the PyThreadState of the main thread with PyEval_RestoreThread.
Otherwise there is a segmentation fault, because Py_Finalize uses the current PyThreadState!
(See the Python source, Python/pythonrun.c.)
Py_Finalize:

tstate = PyThreadState_GET();
interp = tstate->interp;        // <- At this point.

So restore the main thread state before calling Py_Finalize:

PyEval_RestoreThread(mainThreadState);

Note that there is one pair: PyEval_SaveThread in the constructor and
PyEval_RestoreThread in the destructor.
There is another pair in PyGILState_Ensure (PyEval_RestoreThread) and
PyGILState_Release (PyEval_SaveThread).
The overall structure for multi-threaded Python/C API calls looks like this:

Main thread:

// Constructor
Py_Initialize();
PyEval_InitThreads();
PyThreadState*  mainThreadState = PyEval_SaveThread();

......
PyGILState_STATE        gilState = PyGILState_Ensure(); // PyEval_RestoreThread
// Call Python/C API...
PyGILState_Release(gilState);                           // PyEval_SaveThread
......

// Create new thread...

......
PyGILState_STATE        gilState = PyGILState_Ensure(); // PyEval_RestoreThread
// Call Python/C API...
PyGILState_Release(gilState);                           // PyEval_SaveThread
......

// Destructor
PyEval_RestoreThread(mainThreadState);
Py_Finalize();

New thread:

......
PyGILState_STATE        gilState = PyGILState_Ensure(); // PyEval_RestoreThread
// Call Python/C API...
PyGILState_Release(gilState);                           // PyEval_SaveThread
......
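
The pairing and nesting described in this comment can be sketched as a small single-threaded model. This is not CPython code: ModelSlot, ModelThread, model_ensure and model_release are hypothetical names, the lock is a plain flag, and the release side follows the behaviour hinted at in the quoted source comment (the thread state created by the outermost Ensure is deleted by its matching Release). The counter field plays the role of gilstate_counter in the excerpt above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* What Ensure reports back: was the lock already held by this thread? */
typedef enum { MODEL_LOCKED, MODEL_UNLOCKED } ModelGilState;

/* Hypothetical per-thread record; counter mirrors gilstate_counter. */
typedef struct ModelThread {
    int counter;
} ModelThread;

/* Hypothetical view of one C thread. */
typedef struct ModelSlot {
    ModelThread *tls;      /* stands in for the autoTLSkey lookup; NULL if none yet */
    bool holds_lock;       /* true while this thread holds the modelled lock */
} ModelSlot;

/* Model of Ensure: create a state on first use, take the lock if needed. */
static ModelGilState model_ensure(ModelSlot *s)
{
    bool current;
    if (s->tls == NULL) {
        s->tls = (ModelThread *)malloc(sizeof *s->tls);
        if (s->tls == NULL)
            abort();
        s->tls->counter = 0;
        current = false;           /* a new thread state is never current */
    } else {
        current = s->holds_lock;
    }
    if (!current)
        s->holds_lock = true;      /* the PyEval_RestoreThread step */
    s->tls->counter++;
    return current ? MODEL_LOCKED : MODEL_UNLOCKED;
}

/* Model of Release: undo exactly one Ensure. */
static void model_release(ModelSlot *s, ModelGilState old)
{
    assert(s->tls != NULL && s->holds_lock);
    s->tls->counter--;
    if (s->tls->counter == 0) {
        /* outermost Release: delete the state and drop the lock */
        free(s->tls);
        s->tls = NULL;
        s->holds_lock = false;
    } else if (old == MODEL_UNLOCKED) {
        s->holds_lock = false;     /* the PyEval_SaveThread step */
    }
}
```

In this model a nested Ensure reports MODEL_LOCKED and its Release leaves the lock held; only the outermost Release gives the lock up and discards the per-thread record, which is why each Release must be passed the value returned by its own Ensure.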

How does this code look if you use PyGILState_Ensure/Release?

freesteel's picture

I wonder how you implement this example using the PyGILState API that was introduced in version 2.3? Does the PyGILState_Ensure replace this, for example:


...
#ifdef USE_GILSTATE
PyGILState_STATE state = PyGILState_Ensure();
#else
// get the global lock
PyEval_AcquireLock();
// get a reference to the PyInterpreterState
PyInterpreterState * mainInterpreterState = mainThreadState->interp;
// create a thread state object for this thread
PyThreadState * myThreadState = PyThreadState_New(mainInterpreterState);
// free the lock
PyEval_ReleaseLock();
#endif

and likewise


...
#ifdef USE_GILSTATE
PyGILState_Release(state);
#else
// grab the lock
PyEval_AcquireLock();
// swap my thread state out of the interpreter
PyThreadState_Swap(NULL);
// clear out any cruft from thread state object
PyThreadState_Clear(myThreadState);
// delete my thread state object
PyThreadState_Delete(myThreadState);
// release the lock
PyEval_ReleaseLock();
#endif // USE_GILSTATE
...

I also found that if you run the original example in version 2.4 and have python compiled with Py_DEBUG defined, you will get fatal errors in pystate.c. The reason is that we can't have more than one thread state per thread:
The exception is thrown from
pystate.c, line 306:
Py_FatalError("Invalid thread state for this thread");

Has anybody else tried it?

agree, (PyGILState_*) is much simpler

vvk's picture

This much simpler locking model (PyGILState_*()) was introduced in Python 2.3.

In my app each call into the embedded Python code is guarded by an object of this class:

class PythonThreadLocker
{
	PyGILState_STATE state;
public:
	PythonThreadLocker() : state(PyGILState_Ensure())
	{}
	~PythonThreadLocker() {
		PyGILState_Release(state);
	}
};

It works safely. I must confess that at first I wrote a special singleton that stored the interpreter state for each thread (with the API described in the article), and then I found this very handy PyGILState_(Ensure/Release).

I found this usage on koders.com while querying "PyGILState_Ensure"; thanks for the pointer :)

example.. missing?

F's picture

Seems like the example code contains only code snippets from the article. Am I missing something? :) (i.e. no mentioned "http server with embedded python" thing :))

Done to perfection

Anonymous's picture

Thanks for this useful article. We're embedding into a Win32 C++ multi-threaded app.

Ditto on the above comment -- needed to add a step to shutdown: swap the main thread state back in before shutting down the interpreter.

extending instead of embedding

mathgenius's picture

With python 2.2, I am using an audio library (portaudio) that uses callbacks for
audio buffer filling. This is extending rather than embedding.

First of all:

PyInterpreterState * mis;
PyThreadState * mts;
mts = PyThreadState_Get();
mis = mts->interp;
ts = PyThreadState_New(mis); /* stored away somewhere */

Note: we don't need to PyEval_AcquireLock, as we already have the lock.

Inside the callback:

PyEval_AcquireLock();
PyThreadState_Swap(ts);
/* call python code here */
PyThreadState_Swap(NULL);
PyEval_ReleaseLock();

Finishing up:

PyThreadState_Swap(NULL);
PyThreadState_Clear(ts);
PyThreadState_Delete(ts);

Also, I found it necessary to do

PyEval_InitThreads();

before all the above.

Simon.

Re: extending instead of embedding

Anonymous's picture

Thanks :-)

Re: extending instead of embedding

mathgenius's picture

OK, I previewed this and it looked fine, but the final post ignored the pre markers...

doh!

Re: Embedding Python in Multi-Threaded C/C++ Applications

Anonymous's picture

excellent resource!

Good tutorial, forgot swap to main before Finalize

Anonymous's picture

Shutting down the interpreter should have

// shut down the interpreter
PyEval_AcquireLock();
PyThreadState_Swap(mainThreadState);
Py_Finalize();

otherwise you get this error message and segfault

Fatal Python error: PyThreadState_Get: no current thread

Thanks.

Re: Embedding Python in Multi-Threaded C/C++ Applications

Anonymous's picture

Excellent!

Re: Embedding Python in Multi-Threaded C/C++ Applications

Anonymous's picture

This article is so useful to make sense out of Python's involvement with threads that it should be added to the standard documentation shipping with the language.

It just helped me to solve a problem that I had been wrestling with for 24 hours.

Regards,
Fabien.

Re: Embedding Python in Multi-Threaded C/C++ Applications

Anonymous's picture

Thank you - it was straightforward to create an extension with a separate thread and a callback...

it saved quite some time.

Very good article. Helped me

Anonymous's picture

Very good article. Helped me solve a problem I had been investigating for two days.
