I have two computers in geographically dispersed locations, both connected to the internet. On each computer I am running a Python program, and I would like to send and receive data from one to the other. I'd like to use the simplest approach possible, while remaining somewhat secure.

I have considered the following solutions, but I'm not sure which is the simplest:

  • HTTP server and client, using protobuf*;
  • SOAP web service and client (pywebsvcs maybe?);
  • Some sort of IPC over an SSH tunnel -- again, protobuf maybe?

Like I said, I'd like the solution to be somewhat secure, but simplicity is the most important requirement. The data is very simple: an object of type A, which contains a list of objects of type B and some other fields.

*I have used protobuf in the past, so the only difficulty would be setting up the HTTP server, which I guess would be cherrypy.

+2  A: 

The cheapest and simplest way to transmit would probably be XML-RPC. It runs over HTTP (so you can secure it that way), it's in the standard library, and unlike protobuf, you don't have to worry about creating and compiling your data type files (since both ends are running Python, the dynamic typing shouldn't be a problem). The only caveat is that any types not represented in XML-RPC must be pickled or otherwise serialized.
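As a rough illustration of the XML-RPC approach, here is a minimal sketch using the Python 3 module names (`xmlrpc.server` and `xmlrpc.client`; in Python 2 these were `SimpleXMLRPCServer` and `xmlrpclib`). The `receive` function, the record layout, and the loopback address are assumptions made up for the example, and both ends run in one process only so the snippet is self-contained.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def receive(record):
    """Hypothetical handler: 'record' arrives as a plain dict."""
    return len(record["items"])


# Port 0 asks the OS for any free port; a real deployment would use a fixed one.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False, allow_none=True)
server.register_function(receive, "receive")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Object "A" expressed with XML-RPC-friendly types: a dict holding a list of dicts.
record = {"name": "example", "items": [{"id": 1}, {"id": 2}]}

proxy = ServerProxy(f"http://127.0.0.1:{port}", allow_none=True)
count = proxy.receive(record)  # the server replies with the number of items

server.shutdown()
server.server_close()
```

Because the payload is built from dicts, lists, strings, and numbers, nothing needs to be pickled. Custom classes would have to be converted to such structures before sending.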

LeafStorm
Yeah, that's gotta be what's most off-putting about protobuf; it doesn't seem to be lightweight. I'll check out XML-RPC.
nbolton
Why not simply pickle? `cPickle` is fast.
Antoine P.
@Antoine P. Ah, I already implemented xml-rpc, but I'll try that next time!
nbolton
Pickle is fast, but it doesn't provide an actual transport for the data - just a serialization format. You would have to implement a client and server yourself. XML-RPC provides both serialization and transport.
LeafStorm
A: 

Or you could go right down to the `socket` module and simply transmit the data in your own format.

http://www.amk.ca/python/howto/sockets/
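One possible "own format", sketched under assumptions: each message is JSON, prefixed with a 4-byte big-endian length so the receiver knows where it ends. The helper names `send_msg`, `recv_exact`, and `recv_msg` are invented for this example, and `socket.socketpair()` stands in for a real connection (in practice one side would `connect()` and the other `accept()`).

```python
import json
import socket
import struct


def send_msg(sock, obj):
    """Send one JSON message preceded by its length (4 bytes, network order)."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes; recv() may return fewer than requested."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


# Two connected sockets in one process, purely for demonstration.
left, right = socket.socketpair()
send_msg(left, {"name": "example", "items": [{"id": 1}, {"id": 2}]})
received = recv_msg(right)
left.close()
right.close()
```

Note that a raw socket like this provides no encryption or authentication by itself.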

yosser
A: 

You could consider Pyro; be sure to read the Security chapter.

Update: It seems simpler to set up than Protocol Buffers and may require less work if your requirements grow more complex in the future (they have a way of doing that... :-)

Vinay Sajip
Looks nice, but it seems maybe a little too powerful for what I want to do, don't you think?
nbolton
+5  A: 

Protocol buffers are "lightweight" in the sense that they produce a very compact wire representation, thus saving bandwidth, memory, storage, etc -- while staying very general-purpose and cross-language. We use them a lot at Google, of course, but it's not clear whether you care about these performance characteristics at all -- you seem to use "lightweight" in a very different sense from this, strictly connected with (mental) load on you, the programmer, and not at all with (computational) load on computers and networks;-).

If you don't care about spending much more bandwidth / memory / etc than you could, and neither do you care about the ability to code the participating subsystems in different languages, then protocol buffers may not be optimal for you.

Neither is pickling, if I read your "somewhat secure" requirement correctly: unpickling a suitably constructed malicious pickled-string can execute arbitrary code on the unpickling machine. In fact, HTTP is not "somewhat secure" in a slightly different sense: there's nothing in that protocol to stop intruders from "sniffing" your traffic (so you should never use HTTP to send confidential payloads, unless maybe you use strong encryption on the payload before sending it and undo that after receiving it). For security (again depending on what meaning you put on the word) you need HTTPS or (simpler to set up, doesn't require you to purchase certificates!-) SSH tunnels.
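To make the pickle risk concrete, here is a deliberately harmless sketch. The `Malicious` class is invented for the demonstration; its `__reduce__` method tells pickle to call `eval` on a string when the data is loaded. An attacker would put something far less friendly than `6 * 7` in that string.

```python
import json
import pickle


class Malicious:
    """Hypothetical attacker-controlled object."""

    def __reduce__(self):
        # On unpickling, pickle calls eval("6 * 7") instead of rebuilding
        # a Malicious instance. Any callable and arguments could go here.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Malicious())

# The receiver never asked to run code, yet loading the bytes evaluates it.
result = pickle.loads(blob)

# JSON, by contrast, only ever produces plain data (dicts, lists, strings, numbers).
safe = json.loads('{"name": "example", "items": [1, 2]}')
```

Here `result` is the value computed by the embedded expression, not an object the receiver expected, which is why pickled data should only be loaded from a fully trusted sender.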

Once you do have an SSH tunnel established between two machines (for Python, paramiko can help, but even doing it via shell scripts or otherwise by directly controlling the ssh commandline client isn't too bad;-) you can run any protocol on it (HTTP is fine, for example), as the tunnel endpoints are made available as given numbered ports on which you can open a socket. I would personally recommend JSON instead of XML for encoding the payloads -- see here for an XMLRPC-like JSON-based RPC server and client, for example -- but I guess that using the XMLRPC server and client that come with Python's standard library is even simpler, thus probably closer to what you're looking for. Why would you want cherrypy in addition? Is performance now suddenly trumping simplicity, for this aspect of the whole architecture only, while in every other case simplicity was picked over performance? That would seem a peculiarly contradictory set of architectural choices!-)
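A rough sketch of the "control the ssh commandline client" route, with several assumptions: the host `user@remote.example.com`, the ports 9000 and 8000, and the helper names `tunnel_command` and `open_tunnel` are all made up for illustration. The `-N` (no remote command) and `-L` (local port forwarding) options are standard OpenSSH client flags.

```python
import subprocess
from xmlrpc.client import ServerProxy


def tunnel_command(ssh_target, local_port, remote_port):
    """Build an ssh command forwarding local_port to remote_port on the far side."""
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-L", forward, ssh_target]


def open_tunnel(ssh_target, local_port, remote_port):
    """Start the tunnel as a child process; call .terminate() on it when done."""
    return subprocess.Popen(tunnel_command(ssh_target, local_port, remote_port))


# Hypothetical values: the remote XML-RPC server listens on port 8000.
cmd = tunnel_command("user@remote.example.com", 9000, 8000)

# With the tunnel running, the client just talks to the local end of it.
# Creating the proxy does not connect; the connection happens on the first call.
proxy = ServerProxy("http://localhost:9000")
```

The remote server can then bind to `localhost` only, so it is reachable solely through the tunnel, and SSH handles both encryption and authentication.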

Alex Martelli
Lightweight in this context means "compact representation" to me too. Remember SSH can do on-the-fly compression too.
gnibbler
@Alex Martelli Haha, yes, I did mean lightweight as in "less effort to implement", rather than "less effort for computer". FYI, I settled on Python's xml-rpc library, as it appeared to be the easiest solution.
nbolton
A: 

Alex is right, of course. But, I'll chime in that I have been very happy in the past with pickling data and pushing it over SSH to another process for unpickling. It's just so easy.

But, it's not suitable for many things. You really need to trust the incoming data, which in the case of my blog server receiving a pickled blog post (my client parses out the tags or the like), I definitely do trust the data -- it's authenticated as me already.

Google, where Alex works, is an entirely different matter. :-)

Sean Reifschneider