The Problem:

I am getting the following traceback and don't understand what it means or how to fix it:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python26\lib\multiprocessing\forking.py", line 342, in main
    self = load(from_parent)
  File "C:\Python26\lib\pickle.py", line 1370, in load
    return Unpickler(file).load()
  File "C:\Python26\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "C:\Python26\lib\pickle.py", line 1083, in load_newobj
    obj = cls.__new__(cls, *args)
TypeError: object.__new__(pyodbc.Cursor) is not safe, use pyodbc.Cursor.__new__()

The situation:

I've got a SQL Server database full of data to be processed. I'm trying to use the multiprocessing module to parallelize the work and take advantage of the multiple cores on my computer. My general class structure is as follows:

  • MyManagerClass
    • This is the main class, where the program starts.
    • It creates two multiprocessing.Queue objects, one work_queue and one write_queue
    • It also creates and launches the other processes, then waits for them to finish.
    • NOTE: this is not an extension of multiprocessing.managers.BaseManager()
  • MyReaderClass
    • This class reads the data from the SQL Server database.
    • It puts items in the work_queue.
  • MyWorkerClass
    • This is where the work processing happens.
    • It gets items from the work_queue and puts completed items in the write_queue.
  • MyWriterClass
    • This class is in charge of writing the processed data back to the SQL Server database.
    • It gets items from the write_queue.

The idea is that there will be one manager, one reader, one writer, and many workers.
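
In skeleton form the layout looks roughly like this (heavily simplified; the real classes hold the database and processing logic, and the worker count is illustrative):

import multiprocessing

class MyReaderClass(multiprocessing.Process):
    def __init__(self, work_queue):
        multiprocessing.Process.__init__(self)
        self.work_queue = work_queue

    def run(self):
        # read rows from SQL Server and put items in work_queue
        pass

class MyWorkerClass(multiprocessing.Process):
    def __init__(self, work_queue, write_queue):
        multiprocessing.Process.__init__(self)
        self.work_queue = work_queue
        self.write_queue = write_queue

    def run(self):
        # get items from work_queue, put completed items in write_queue
        pass

class MyWriterClass(multiprocessing.Process):
    def __init__(self, write_queue):
        multiprocessing.Process.__init__(self)
        self.write_queue = write_queue

    def run(self):
        # write processed items back to SQL Server
        pass

class MyManagerClass(object):
    def main(self):
        work_queue = multiprocessing.Queue()
        write_queue = multiprocessing.Queue()
        processes = [MyReaderClass(work_queue), MyWriterClass(write_queue)]
        processes += [MyWorkerClass(work_queue, write_queue)
                      for _ in range(multiprocessing.cpu_count())]
        for p in processes:
            p.start()
        for p in processes:    # wait for everything to finish
            p.join()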

Other details:

I get the traceback twice in stderr, so I'm thinking it happens once for the reader and once for the writer. My worker processes are created fine, but they just sit there until I send a KeyboardInterrupt, because nothing ever arrives in the work_queue.

Both the reader and writer have their own connection to the database, created on initialization.

Solution:

Thanks to Mark and Ferdinand Beyer for their answers and the questions that led to this solution. They rightly pointed out that the Cursor object is not picklable, and pickling is the method that multiprocessing uses to pass information between processes.

The issue with my code was that MyReaderClass(multiprocessing.Process) and MyWriterClass(multiprocessing.Process) both connected to the database in their __init__() methods. I created both of these objects (i.e. called their __init__() methods) in MyManagerClass, then called start().

So the parent (manager) process would create the connection and cursor objects and then try to send them to the child process via pickle. My solution was to move the instantiation of the connection and cursor objects into the run() method, which isn't called until the child process is fully created.
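
In sketch form, the change looks like this (the connection string, table, and column names are placeholders, not my actual code):

import multiprocessing
import pyodbc

class MyReaderClass(multiprocessing.Process):
    def __init__(self, work_queue, connection_string):
        multiprocessing.Process.__init__(self)
        self.work_queue = work_queue
        self.connection_string = connection_string
        # BROKEN: connecting here runs in the parent process, and the
        # connection/cursor objects then fail to pickle when the child
        # process is spawned:
        # self.conn = pyodbc.connect(self.connection_string)

    def run(self):
        # FIXED: run() executes inside the child process, so the
        # connection never has to cross a process boundary.
        conn = pyodbc.connect(self.connection_string)
        cursor = conn.cursor()
        cursor.execute("SELECT id, stuff_to_process FROM work_items")
        for row in cursor:
            # reduce the pyodbc Row to a plain tuple before queueing
            self.work_queue.put(tuple(row))
        conn.close()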

+2  A: 

The error is raised within the pickle module, so somewhere your DB cursor object gets pickled and unpickled (serialized to a byte stream and deserialized back into a Python object).

I suspect that pyodbc.Cursor does not support pickling. Why would you want to persist the cursor object anyway?

Check whether you use pickle somewhere in your work chain, or whether it is used implicitly.
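
One quick way to check is to try pickling each object you hand off between processes (is_picklable here is a made-up helper, not a library function):

import cPickle

def is_picklable(obj):
    # Attempt to pickle obj; unpicklable objects raise PicklingError
    # or, as with pyodbc cursors, a plain TypeError.
    try:
        cPickle.dumps(obj)
    except (cPickle.PicklingError, TypeError):
        return False
    return True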

Ferdinand Beyer
Looks like multiprocessing uses it implicitly to pass things through Pipe objects between processes (specifically the Queue objects I created).
tgray
+3  A: 

Multiprocessing relies on pickling to communicate objects between processes. The pyodbc connection and cursor objects cannot be pickled, as a quick test shows:

>>> import cPickle
>>> cPickle.dumps(aCursor)      # aCursor: an open pyodbc cursor
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.5/copy_reg.py", line 69, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle Cursor objects
>>> cPickle.dumps(dbHandle)     # dbHandle: a pyodbc connection
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.5/copy_reg.py", line 69, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle Connection objects

"It puts items in the work_queue", what items? Is it possible the cursor object is getting passed as well?

Mark
I have a generator that loops over items in a cursor (basically calling pyodbc's Cursor.fetchone()). I believe it yields a tuple of (id, stuff_to_process), which is what I put in the queue. I tried making a deepcopy, but that didn't work. I looked at the help, and it's actually an instance of a Row object, so I may need to convert it to a tuple first.
tgray
The Row object must contain a reference to the Cursor or something.
tgray
Converting the Row to a tuple didn't solve it.
tgray
Maybe it's because I create the reader and writer in MyManagerClass, and the connection/cursor are created in the __init__(), so technically they're created in the manager process, then piped to their own child process?
tgray
I'll try moving the instantiation of the connection to the run() method.
tgray
It worked! I'll update the question to reflect this.
tgray