I made a C extension out of a Python script that was fairly labour intensive. The code itself is well tested and simple. The C extension is called with a few large lists; it then performs some clever arithmetic and returns a few new lists. The C extension is 100% self-contained: it doesn't call any other C functions, nor does it use any of the Python objects' methods (it does, however, use these standard Python C API functions: PyFloat_AsDouble, PyList_GetItem, PyList_Size, PyList_New, Py_BuildValue, PyList_Append). Up until now I have only used it in a non-multithreaded environment.

Today I started using it in a multithreaded GUI environment and all hell broke loose. I have a few test cases I use for debugging, and weirdly enough the smaller ones pass through OK while the larger ones cause bus errors and segmentation faults (crashing the GUI completely and bringing up the 'Problem Report For Python' window in OS X). Is the problem that my C extension isn't thread-safe? If so, how can I make it thread-safe? I tried googling the subject, but I haven't really found any good info that I can make sense of. I checked this and this page out, but I don't really understand what they are saying. Which type of code will need the GIL and which won't?
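One way to narrow this kind of problem down, sketched here under assumptions: call the function from a plain worker thread, with no GUI involved, on inputs of increasing size. In this sketch `my_calc` is a pure-Python stand-in (the name is borrowed from the crash report; the real call into the extension would go in its place) and `run_in_thread` is a hypothetical helper, not part of any library.

```python
import threading


def my_calc(values):
    """Pure-Python stand-in for the extension call (hypothetical)."""
    return [v * 2.0 for v in values]


def run_in_thread(func, values):
    """Call func(values) on a worker thread and return (result, error)."""
    outcome = {}

    def worker():
        try:
            outcome["result"] = func(values)
        except Exception as exc:
            # Only Python-level exceptions land here; a segfault in C
            # code would take down the whole process instead.
            outcome["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return outcome.get("result"), outcome.get("error")


if __name__ == "__main__":
    # Try small inputs first, then progressively larger ones.
    for size in (10, 10 ** 4, 10 ** 6):
        data = [float(i) for i in range(size)]
        result, error = run_in_thread(my_calc, data)
        print(size, "ok" if error is None else repr(error))
```

If the larger inputs also fail in a harness like this, the GUI toolkit can be ruled out and the extension can be debugged in isolation.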

For what it's worth here is the dump:

Date/Time:       2010-10-23 03:48:02.714 +0800
OS Version:      Mac OS X 10.6.4 (10F569)
Report Version:  6

Interval Since Last Report:          323080 sec
Crashes Since Last Report:           60
Per-App Interval Since Last Report:  110157 sec
Per-App Crashes Since Last Report:   59
Anonymous UUID:                      5BD8D75B-9B21-4267-98A4-BAA31E56CB5C

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x00000000b009286c
Crashed Thread:  2

Thread 0:  Dispatch queue: com.apple.main-thread
0   ...ple.CoreServices.CarbonCore  0x90b024c8 ConvertFromUnicodeToTextImplementation + 1976
1   com.apple.HIToolbox             0x951c99e5 CEncodingTranslator::TranslateFromUnicode(char*, unsigned long, unsigned long*, unsigned long*, unsigned long*, unsigned long, short, short) + 549
2   com.apple.HIToolbox             0x951c9d01 CEncodingTranslator::Translate(char*, unsigned long, unsigned long*, unsigned long*, unsigned long*, unsigned long, unsigned long, short, short, short*, unsigned long) + 101
3   com.apple.HIToolbox             0x951a9e51 TXNGetDataEncoded + 278
4   libwx_macd-2.8.0.dylib          0x0188c7ee wxMacMLTEControl::GetLastPosition() const + 52
5   libwx_macd-2.8.0.dylib          0x0188bf73 wxTextCtrl::SetInsertionPointEnd() + 21
6   libwx_macd-2.8.0.dylib          0x0188bfc9 wxTextCtrl::AppendText(wxString const&) + 25
7   _controls_.so                   0x1397e357 _wrap_TextCtrl_AppendText + 247 (wxPython.h:48)
8   org.python.python               0x000ca58b PyEval_EvalFrameEx + 21147
9   org.python.python               0x000cc4ba PyEval_EvalCodeEx + 2042
10  org.python.python               0x00041ca2 function_call + 162
11  org.python.python               0x0000f375 PyObject_Call + 85
12  org.python.python               0x000c7d5b PyEval_EvalFrameEx + 10859
13  org.python.python               0x000cc4ba PyEval_EvalCodeEx + 2042
14  org.python.python               0x00041ca2 function_call + 162
15  org.python.python               0x0000f375 PyObject_Call + 85
16  org.python.python               0x000c435e PyEval_CallObjectWithKeywords + 78
17  _core_.so                       0x011859f0 wxPyCallback::EventThunker(wxEvent&) + 234 (helpers.cpp:1759)
18  libwx_macd-2.8.0.dylib          0x0180e360 wxEvtHandler::ProcessEventIfMatches(wxEventTableEntryBase const&, wxEvtHandler*, wxEvent&) + 108
19  libwx_macd-2.8.0.dylib          0x0180e406 wxEvtHandler::SearchDynamicEventTable(wxEvent&) + 80
20  libwx_macd-2.8.0.dylib          0x0180f205 wxEvtHandler::ProcessEvent(wxEvent&) + 225
21  libwx_macd-2.8.0.dylib          0x0180ef4a wxEvtHandler::ProcessPendingEvents() + 86
22  libwx_macd-2.8.0.dylib          0x0176cd02 wxAppConsole::ProcessPendingEvents() + 102
23  libwx_macd-2.8.0.dylib          0x01806873 wxMacProcessNotifierAndPendingEvents + 33
24  libwx_macd-2.8.0.dylib          0x0183107e wxApp::MacHandleOneEvent(void*) + 90
25  libwx_macd-2.8.0.dylib          0x0183110e wxApp::MacDoOneEvent() + 120
26  libwx_macd-2.8.0.dylib          0x0184b570 wxEventLoop::Dispatch() + 32
27  libwx_macd-2.8.0.dylib          0x01906e71 wxEventLoopManual::Run() + 97
28  libwx_macd-2.8.0.dylib          0x018dd364 wxAppBase::MainLoop() + 76
29  _core_.so                       0x0117c75c wxPyApp::MainLoop() + 52 (helpers.cpp:215)
30  _core_.so                       0x011c9e66 _wrap_PyApp_MainLoop + 82 (_core_wrap.cpp:31686)
31  org.python.python               0x000ca58b PyEval_EvalFrameEx + 21147
32  org.python.python               0x000cc4ba PyEval_EvalCodeEx + 2042
33  org.python.python               0x00041ca2 function_call + 162
34  org.python.python               0x0000f375 PyObject_Call + 85
35  org.python.python               0x00021c66 instancemethod_call + 422
36  org.python.python               0x0000f375 PyObject_Call + 85
37  org.python.python               0x000c8ad6 PyEval_EvalFrameEx + 14310
38  org.python.python               0x000cbc88 PyEval_EvalFrameEx + 27032
39  org.python.python               0x000cc4ba PyEval_EvalCodeEx + 2042
40  org.python.python               0x000cc647 PyEval_EvalCode + 87
41  org.python.python               0x000f0ae8 PyRun_FileExFlags + 168
42  org.python.python               0x000f1a23 PyRun_SimpleFileExFlags + 867
43  org.python.python               0x0010a42b Py_Main + 3163
44  org.python.python               0x00001f82 0x1000 + 3970
45  org.python.python               0x00001ea9 0x1000 + 3753

Thread 1:  Dispatch queue: com.apple.libdispatch-manager
0   libSystem.B.dylib               0x96068942 kevent + 10
1   libSystem.B.dylib               0x9606905c _dispatch_mgr_invoke + 215
2   libSystem.B.dylib               0x96068519 _dispatch_queue_invoke + 163
3   libSystem.B.dylib               0x960682be _dispatch_worker_thread2 + 240
4   libSystem.B.dylib               0x96067d41 _pthread_wqthread + 390
5   libSystem.B.dylib               0x96067b86 start_wqthread + 30

Thread 2 Crashed:
0   ccookies.so                     0x0060a949 my_calc + 249 (ccookies.c:23)
1   org.python.python               0x000ca3e0 PyEval_EvalFrameEx + 20720
2   org.python.python               0x000cbc88 PyEval_EvalFrameEx + 27032
3   org.python.python               0x000cbc88 PyEval_EvalFrameEx + 27032
4   org.python.python               0x000cbc88 PyEval_EvalFrameEx + 27032
5   org.python.python               0x000cc4ba PyEval_EvalCodeEx + 2042
6   org.python.python               0x00041ca2 function_call + 162
7   org.python.python               0x0000f375 PyObject_Call + 85
8   org.python.python               0x00021c66 instancemethod_call + 422
9   org.python.python               0x0000f375 PyObject_Call + 85
10  org.python.python               0x000c435e PyEval_CallObjectWithKeywords + 78
11  org.python.python               0x0010c79c t_bootstrap + 76
12  libSystem.B.dylib               0x9606f81d _pthread_start + 345
13  libSystem.B.dylib               0x9606f6a2 thread_start + 34

Thread 2 crashed with X86 Thread State (32-bit):
  eax: 0x0007d090  ebx: 0x0060a85d  ecx: 0x000ef236  edx: 0xb010f920
  edi: 0x02315180  esi: 0xb0092890  ebp: 0xb018d378  esp: 0xb0092870
   ss: 0x0000001f  efl: 0x00010282  eip: 0x0060a949   cs: 0x00000017
   ds: 0x0000001f   es: 0x0000001f   fs: 0x0000001f   gs: 0x00000037
  cr2: 0xb009286c
A: 

CPython is not thread-safe. That is the purpose of the GIL, which must be held whenever accessing or modifying the interpreter state.

If you need threading and Python, then you will need to use an implementation other than CPython (the standard one), such as IronPython or Jython, both of which are perfectly robust in the case of threading. There are also some modified CPython variants, such as Stackless Python, that might work better.

TokenMacGuy
So are you saying C extensions simply don't work in a multithreaded environment? Or that you need to fiddle around with the GIL to make it work?
c00kiemonster
I've never really attempted to get a multithreaded extension to work with CPython, so I don't know the right answer, but yes, you will need to fiddle with the GIL if you wish to interact with the interpreter state from multiple threads.
TokenMacGuy
@c00kiemonster "which must be used whenever *accessing or modifying* the interpreter state"
pst
This answer does not pass my sniff test. The question was about a "simple" C extension being called in a multithreaded Python environment. This just says to me that the C code is not written to be reentrant. No reason to start throwing out "can't use CPython" without much more evidence. Question to poster: does your extension use global variables?
Bill Gribble
A: 

I finally managed to get rid of the problem, but in a pretty long-winded way. Here goes.

I spent a very long time trying to make sense of the documentation for C extensions and their thread safety. On one of the many Google detours that night I stumbled on this page describing how to use numpy arrays in C extensions. Since my problems seemed to be related to data size (the original C extension worked for smaller datasets), I suspected that my implementation of looping through the Python lists and using PyList_GetItem to get the data into their C array counterparts was not up to scratch. (I reasoned that the actual number crunching that follows in the C extension wasn't the issue, since it is very generic C without any special stuff at all.)

Hence I decided to undertake a complete rewrite of the C extension and my calling Python script to use numpy arrays instead of lists. It took a good two days including all the debugging. But now it works like a charm. All datasets are processed OK, and there are no signs of bus errors or segmentation faults.

TLDR: Use numpy arrays instead of Python lists when working with large datasets and Python C extensions to avoid bus errors and segmentation faults.
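On the Python side, the list-to-array step might look roughly like the sketch below. It assumes numpy is installed; `to_c_doubles` is a hypothetical helper name and the sample values are made up, since the actual calling script isn't shown here.

```python
import numpy as np


def to_c_doubles(values):
    """Return values as a 1-D, C-contiguous float64 array."""
    return np.ascontiguousarray(values, dtype=np.float64)


prices = [1.5, 2.0, 3.25]  # made-up sample data
arr = to_c_doubles(prices)

# A C-contiguous float64 array stores its elements as one flat block
# of C doubles, so the C side can read them directly instead of
# calling PyList_GetItem / PyFloat_AsDouble once per element.
print(arr.dtype, arr.flags["C_CONTIGUOUS"], arr.shape)
```

The array would then be passed to the extension in place of the original list.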

c00kiemonster