views:

540

answers:

6

I'm trying to transfer a transfer a function across a network connection (using asyncore). Is there an easy way to serialize a python function (one that, in this case at least, will have no side affects) for transfer like this?

I would ideally like to have a pair of functions similar to these:

def transmit(func):
    obj = pickle.dumps(func)
    [send obj across the network]

def receive():
    [receive obj from the network]
    func = pickle.loads(s)
    func()
+3  A: 

The most simple way is probably inspect.getsource(object) (see the inspect module) which returns a String with the source code for a function or a method.

Aaron Digulla
This looks good, except that the function name is explicitly defined in the code, which is slightly problematic. I could strip the first line of the code off, but that's breakable by doing something like 'def \/n func():'. I could pickle the name of the function with the function itself, but I'd have no guarantees that the name wouldn't collide, or I'd have to put the function in a wrapper, which is still not the cleanest solution, but it might have to do.
Michael Fairley
Note that the inspect module is actually just asking the function where it was defined, and then reading in those lines from the source code file - hardly sophisticated.
too much php
You can find out the function's name using its .__name__ attribute. You could do a regex replace on ^def\s*{name}\s*( and give it whatever name you like. It's not foolproof, but it will work for most things.
too much php
A: 

It will work if you keep it simple. Read the section "What can be pickled and unpickled?" for details.

>>> import pickle
>>>
>>> def hi():
...     print 'hello'
...
>>>
>>> hi()
hello
>>> p_func = pickle.dumps(hi)
>>>
>>> loaded_func = pickle.loads(p_func)
>>>
>>> loaded_func()
hello

As Aaron Digulla pointed at, you may just want to consider sending the source file of the function.

Something like:

transmit(<source file>, <function name in source file>)
trasmit('./my_functions.py', 'hi')

As your probably aware you don't want to do this out of a completely controlled environment.

monkut
This actually doesn't work. The pickling merely makes a reference to the name of the function, so it works as long as the function that's been pickled in still around, but the actual pickle string is not useful in another instance of python.
Michael Fairley
That's right I new I was forgeting something :P
monkut
+5  A: 

Pyro can do this for you.

RichieHindle
I'd need to stick with the standard library for this particular project.
Michael Fairley
But that doesn't mean you can't *look* at the code of Pyro to see how it is done :)
Aaron Digulla
+12  A: 

You could serialise the function bytecode and then reconstruct it on the caller. The marshal module can be used to serialise code objects, which can then be reassembled into a function. ie:

import marshal
def foo(x): return x*x
code_string = marshal.dumps(foo.func_code)

Then in the remote process (after transferring code_string):

import marshal, types

code = marshal.loads(code_string)
func = types.FunctionType(code, globals(), "some_func_name")

func(10)  # gives 100

A few caveats:

  • marshal's format (any python bytecode for that matter) may not be compatable between major python versions.

  • Will only work for cpython implementation.

  • If the function references globals (including imported modules, other functions etc) that you need to pick up, you'll need to serialise these too, or recreate them on the remote side. My example just gives it the remote process's global namespace.

  • You'll probably need to do a bit more to support more complex cases, like closures or generator functions.

Brian
In Python 2.5, the "new" module is deprecated. 'new.function' should be replaced by 'types.FunctionType', after an "import types", I believe.
EOL
@EOL: Good point - I've updated the code to use the types module instead.
Brian
Thanks. This is exactly what I was looking for. Based on some cursory testing, it works as is for generators.
Michael Fairley
A: 

It all depends on whether you generate the function at runtime or not:

If you do - inspect.getsource(object) won't work for dynamically generated functions as it gets object's source from .py file, so only functions defined before execution can be retrieved as source.

And if your functions are placed in files anyway, why not give receiver access to them and only pass around module and function names.

The only solution for dynamically created functions that I can think of is to construct function as a string before transmission, transmit source, and then eval() it on the receiver side.

Edit: the marshal solution looks also pretty smart, didn't know you can serialize something other thatn built-ins

kurczak
A: 

The basic functions used for this module covers your query, plus you get the best compression over the wire; see the instructive source code:

y_serial.py module :: warehouse Python objects with SQLite

"Serialization + persistance :: in a few lines of code, compress and annotate Python objects into SQLite; then later retrieve them chronologically by keywords without any SQL. Most useful "standard" module for a database to store schema-less data."

http://yserial.sourceforge.net

code43