153 views, 7 answers

I have two programs running, one in Python and one in C++, and I need to share a two-dimensional array (just of decimal numbers) between them. I am currently looking into serialization, but pickle is Python-specific, unfortunately. What is the best way to do this?

Thanks

Edit: It is likely that the array will only have 50 elements or so, but the transfer of data will need to occur very frequently: 60x per second or more.

+4  A: 

You could try using boost::python to make your applications interoperable.

See the boost::python documentation, which includes information about pickle support.

Kotti
+3  A: 

You could try hosting the array in a memory-mapped file, although you will need to synchronize access to the file to avoid race conditions.
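A minimal sketch of the memory-mapped approach on the Python side (the file name and layout here are arbitrary examples; the C++ process would `mmap()` the same file and interpret it as a `double` array). Note this sketch omits the synchronization mentioned above.

```python
import mmap
import os
import struct

N = 50
path = "shared_array.dat"  # example path; both processes must agree on it

# Pre-size the file to hold N doubles (8 bytes each).
with open(path, "wb") as f:
    f.write(b"\x00" * 8 * N)

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 8 * N)
    # Write the array in place; the C++ process mapping the same
    # file would see the same bytes.
    values = [float(i) for i in range(N)]
    mm[:] = struct.pack("<%dd" % N, *values)
    # Read it back (the C++ side would do the equivalent read).
    readback = struct.unpack("<%dd" % N, mm[:])
    mm.close()

os.remove(path)
```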

Alternatively you could establish a socket (or pipe) connection between both processes and pass values by exchanging messages.

Justin Ethier
+1  A: 

How large is this array? If it isn't very large, then JSON serialization is a good fit. There are libraries readily available for C++, and Python has JSON serialization in its standard library as of version 2.6. See http://www.json.org/ for more info.
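For illustration, here is what the JSON round trip looks like on the Python side; a 2D array of floats maps directly onto nested JSON arrays, which any C++ JSON library can parse on the other end.

```python
import json

# A small 2D array of decimal numbers.
array = [[1.5, 2.25], [3.0, 4.75]]

# Serialize to a JSON string, e.g. to send over a socket or pipe.
payload = json.dumps(array)

# The receiving side parses it back into nested arrays.
restored = json.loads(payload)
```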

A. Levy
Interesting idea. Where would you draw the cutoff for 'very large'? It is likely that the array will only have 50 elements or so, but the transfer of data will need to occur very frequently: 60x per second or more.
mellort
JSON serialization and parsing are pretty slow (relative to the other suggestions), and JSON doesn't give you a means of synchronizing the data between C++ and Python, which is the more difficult part of this problem. Even though JSON is slower, you should have no problem using it on 50 elements, 60x per second.
mikerobi
+3  A: 

Your case is handled very well by PyUblas, a bridge between NumPy and Boost.uBLAS built on Boost.Python. This bridge supports copy-free transfer of vectors and matrices and is very easy to use.

Philipp
+4  A: 

I suggest Google's Protocol Buffers (protobuf).

OscarRyz
How slow is the parsing relative to, say, JSON?
mellort
+1  A: 

I would propose simply using C arrays (via ctypes on the Python side) and pushing/pulling the raw data through a socket.
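A sketch of this raw-bytes approach, assuming a 50-element array; `socket.socketpair()` stands in here for a real connection to the C++ process, which would `recv()` the bytes into a `double[50]` buffer.

```python
import ctypes
import socket

N = 50
ArrayType = ctypes.c_double * N
array = ArrayType(*[float(i) for i in range(N)])

# socketpair() gives two connected sockets in one process, standing
# in for the Python<->C++ connection.
left, right = socket.socketpair()

# Send the raw bytes of the ctypes array (no serialization overhead).
left.sendall(bytes(array))

# Receive on the other end. 400 bytes arrive in one recv() on a local
# socket; real code over a network should loop until all bytes arrive.
raw = right.recv(ctypes.sizeof(array))
received = ArrayType.from_buffer_copy(raw)

left.close()
right.close()
```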

nils
+1  A: 

Serialization is one problem while IPC is another. Do you have the IPC portion figured out? (pipes, sockets, mmap, etc?)

On to serialization - if you're concerned about performance more than robustness (being able to plug more modules into this architecture) and security, then you should take a look at the struct module. It lets you pack data into C structures using format strings to define the layout (it takes care of padding, alignment, and byte ordering for you). In the C++ program, cast a pointer to the buffer to the corresponding structure type.

This works well with a tightly-coupled Python script and C++ program that is only run internally.
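To illustrate, a sketch of the struct-module approach for the asker's 50-element case. The format string `"<50d"` means 50 little-endian doubles, which corresponds to a plain `double data[50];` member on the C++ side.

```python
import struct

values = [float(i) / 2 for i in range(50)]

# Pack into 400 bytes of raw data, ready to send over a socket/pipe.
packed = struct.pack("<50d", *values)
assert len(packed) == struct.calcsize("<50d")

# The C++ program would cast the received buffer, e.g.:
#   struct Payload { double data[50]; };
#   const Payload* p = reinterpret_cast<const Payload*>(buffer);

# Round trip back on the Python side for demonstration.
unpacked = struct.unpack("<50d", packed)
```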

Jeremy Brown
Could you elaborate on 'In the C++ program, ... type.'? I'm unfamiliar with working with buffers.
mellort
@mellort - suppose you receive a buffer (`char*`/`void*`, etc.) via socket in the C++ app. The Python `struct` format string you used to pack the data before sending will basically correspond to the C/C++ structure definition that you use to cast the buffer once it is received in C++.
Jeremy Brown