I'm working on a grid system which has a number of very powerful computers. These can be used to execute Python functions very quickly. My users have a number of Python functions which take a long time to calculate on workstations. Ideally, they would like to call some functions on a remote, powerful server but have them appear to run locally.

Python has an old function called "apply". It's mostly useless these days now that Python supports the extended-call syntax (e.g. **arguments); however, I need to implement something that works a bit like it:

rapply = Rapply( server_hostname ) # Set up a connection
result = rapply( fn, args, kwargs ) # Remotely call the function
assert result == fn( *args, **kwargs ) #Just as a test, verify that it has the expected value.

Rapply should be a class which can be used to execute some arbitrary code (fn could be literally anything) on a remote server. The server sends back the result, which the rapply call then returns. The "result" should have the same value as if I had called the function locally.

Now let's suppose that fn is a user-provided function: I need some way of sending it over the wire to the execution server. If I could guarantee that fn was always something simple, it could just be a string containing Python source code... but what if it were not so simple?
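For the simple case described above, a minimal sketch of shipping fn as a source string might look like the following. It is written for Python 3, and the names pack_call, run_call and SOURCE are hypothetical helpers invented for this illustration; they are not part of any library. The arguments are assumed to be JSON-serializable, and the transport between client and server is left out entirely.

```python
import json

def pack_call(source, fn_name, args=(), kwargs=None):
    """Client side: bundle source text and call arguments into one JSON string."""
    return json.dumps({
        "source": source,
        "fn_name": fn_name,
        "args": list(args),
        "kwargs": kwargs or {},
    })

def run_call(payload):
    """Server side: define the function from its source, then call it."""
    request = json.loads(payload)
    namespace = {}
    # Executing the source defines the function inside a fresh namespace.
    exec(request["source"], namespace)
    fn = namespace[request["fn_name"]]
    return fn(*request["args"], **request["kwargs"])

# Hypothetical user function, sent as plain source text.
SOURCE = """
def scale(values, factor=1):
    return [v * factor for v in values]
"""

payload = pack_call(SOURCE, "scale", args=([1, 2, 3],), kwargs={"factor": 2})
result = run_call(payload)
```

This only works while the source is self-contained: an import of a module that exists on the workstation but not on the server fails inside run_call, which is exactly the limitation the question is about.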

What if fn has local dependencies? It could be a simple function which uses a class defined in a different module. Is there a way of encapsulating fn and everything fn requires that is not in the standard library? An ideal solution would not require the users of this system to have much knowledge about Python development. They simply want to write their function and call it.

Just to clarify, I'm not interested in discussing what kind of network protocol might be used to implement the communication between the client & server. My problem is how to encapsulate a function and its dependencies as a single object which can be serialized and remotely executed.
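One partial way to turn a function into a serializable object is to marshal its compiled code and rebuild it on the other side. The sketch below assumes Python 3 and the same interpreter version on both ends (the marshal format is version-specific); dump_function, load_function and add_squares are hypothetical names used only for illustration. It deliberately carries no globals, defaults, closures or imported modules, so it does not solve the dependency half of the problem.

```python
import builtins
import marshal
import types

def dump_function(fn):
    """Serialize only the function's compiled code object."""
    return marshal.dumps(fn.__code__)

def load_function(payload, name="remote_fn"):
    """Rebuild a callable from marshalled code, with nothing but builtins in scope."""
    code = marshal.loads(payload)
    # The new function gets fresh globals; default arguments and closures
    # of the original are NOT restored in this sketch.
    return types.FunctionType(code, {"__builtins__": builtins}, name)

# Hypothetical user function with no external dependencies.
def add_squares(a, b):
    return a * a + b * b

rebuilt = load_function(dump_function(add_squares))
```

Any name the function looks up at call time (a helper class, an imported module) would still have to exist in the globals supplied on the server.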

I'm also not interested in the security implications of running arbitrary code on remote servers - let's just say that this system is intended purely for research and it is within a heavily firewalled environment.

+2  A: 

It sounds like you want to do the following.

  • Define a shared filesystem space.

  • Put ALL your python source in this shared filesystem space.

  • Define simple agents or servers that will "execfile" a block of code.

  • Your client then contacts the agent (REST protocol with POST methods works well for
    this) with the block of code. The agent saves the block of code and does an execfile on that block of code.

Since all agents share a common filesystem, they all have the same Python library structure.

We do this with a simple WSGI application we call "batch server". We have a RESTful protocol for creating and checking on remote requests.
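The "save the block, then execfile it" step of the agent could be sketched roughly as follows. This is not the batch server described above: run_block is a hypothetical helper, the convention of leaving the answer in a variable named result is an assumption, and since execfile exists only in Python 2, the sketch uses runpy.run_path as the Python 3 stand-in.

```python
import os
import runpy
import tempfile

def run_block(source, work_dir):
    """Save a submitted block of code to a file, execute it, return its globals."""
    fd, path = tempfile.mkstemp(suffix=".py", dir=work_dir)
    with os.fdopen(fd, "w") as handle:
        handle.write(source)
    # run_path executes the file and returns the resulting module globals.
    return runpy.run_path(path)

# Stand-in for a block of code received from a client.
with tempfile.TemporaryDirectory() as work_dir:
    namespace = run_block("import math\nresult = math.factorial(5)\n", work_dir)
```

Any import inside the submitted block is resolved against the agent's own library path, which is why the shared filesystem matters.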

S.Lott
The protocol (REST) is slightly outside the scope of this question. Shared python code is a good idea, that is actually the solution we have at the moment. The problem is that in order to get code into the server it has to be shared (hence not completely arbitrary).
Salim Fadhley
@Salim Fadhley: Not what I said. I said the REST protocol contains the code. The code is saved to a file. The file is processed with execfile. Arbitrary code can be sent. Only the dependencies are pre-installed.
S.Lott
Understood, your solution is "send uncompiled python code" - it's exactly what I propose in my question. I'm looking for an alternative which will solve the problem of what if that code contained an import for a user-developed class in another file.
Salim Fadhley
User-developed classes must be installed in the shared directory. Either in `site-packages`, or in a user-specific directory with a `.pth` file in `site-packages`.
S.Lott
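As an illustration of the `.pth` mechanism mentioned in the comment above, the sketch below builds a throwaway directory layout and lets the standard `site` module process it. All paths and names (shared_site, alice, alice_models, Particle) are hypothetical; shared_site merely stands in for the real `site-packages`.

```python
import importlib
import os
import site
import tempfile

root = tempfile.mkdtemp()
shared_site = os.path.join(root, "shared_site")   # stands in for site-packages
user_code = os.path.join(root, "users", "alice")  # hypothetical user directory
os.makedirs(shared_site)
os.makedirs(user_code)

# A user-developed module living in the user's own directory.
with open(os.path.join(user_code, "alice_models.py"), "w") as handle:
    handle.write("class Particle:\n    mass = 2.5\n")

# A .pth file lists extra directories, one per line, to add to sys.path.
with open(os.path.join(shared_site, "alice.pth"), "w") as handle:
    handle.write(user_code + "\n")

# Process the directory the way the interpreter processes site-packages at startup.
site.addsitedir(shared_site)
importlib.invalidate_caches()
alice_models = importlib.import_module("alice_models")
```

On a real agent the interpreter reads the `.pth` files in `site-packages` at startup, so no explicit addsitedir call would be needed there.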
A: 

You could use an SSH connection to the remote PC and run the commands on the other machine directly. You could even copy the Python code to the machine and execute it.

Ash
But how would you know what to copy? And in what form should it be copied? That is in essence the question. SSHing is outside the scope of the question (see above).
Salim Fadhley
+1  A: 

Stackless has the ability to pickle and unpickle running code. Maybe that functionality could be used to build a code-transfer solution?

Łukasz
Interesting answer - sadly I am stuck with python 2.4, however this one gets an upvote!
Salim Fadhley
+1  A: 

You could use a ready-made clustering solution like Parallel Python. You can relatively easily set up multiple remote slaves and run arbitrary code on them.

Ali A
+5  A: 

Take a look at PyRO (Python Remote Objects). It can set up services on all the computers in your cluster and invoke them directly, or indirectly through a name server and a publish-subscribe mechanism.

Jim Carroll