Hi Pythonistas,

I have an application that should communicate status information to a server. This information is effectively a large dictionary with string keys.

The server will run a web application based on Turbogears, so the server-side method called accepts an arbitrary number of keyword arguments.

In addition to the actual data, some data related to authentication (id, password..) should be transmitted. One approach would be to simply urlencode a large dictionary containing all this and send it in a request to the server.

urllib.urlencode(dataPlusId)
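A minimal sketch of the urlencode approach, written for Python 3, where the function lives in `urllib.parse`. The dictionary contents and key names are made up for illustration; nothing is sent over the network here.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical status data and credentials, for illustration only.
status = {"host": "worker-1", "state": "idle", "load": "0.25"}
credentials = {"id": "client-42", "password": "secret"}

# Merge everything into one flat dictionary and form-encode it.
data_plus_id = dict(status, **credentials)
body = urlencode(data_plus_id)

# What a server-side form parser would recover: every value is a string.
decoded = {key: values[0] for key, values in parse_qs(body).items()}
```

Note that form encoding flattens everything to strings, so nested values or numbers would need their own convention on both ends.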

But actually, the method doing the authentication and accepting the data set does not have to know much about the data. The data could be transmitted and accepted transparently and handed over to another method working with the data.

So my question is: What is the best way to transmit a large dictionary of data to a server in general? And, in this specific case, what is the best way to deal with authentication here?

+3  A: 

I think the best way is to encode your data in an appropriate transfer format (it can be binary, but you should not use pickle, as it's not safe) and transfer it as a multipart POST request.

What I do not know is whether you can make it work with repoze.who. If it does not support sign-in and the function call in one step, you'll perhaps have to verify the credentials yourself.

If you can wrap your data in XML, you could also use XML-RPC.
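As a rough illustration of the XML-RPC option, the sketch below only marshals a dictionary to an XML-RPC request and back using the Python 3 standard library (`xmlrpc.client`; the Python 2 module was `xmlrpclib`). The method name `report_status` and the dictionary contents are hypothetical, and no server is contacted.

```python
import xmlrpc.client

# Hypothetical status dictionary with string keys.
status = {"host": "worker-1", "state": "idle", "queue_length": 3}

# Marshal a call to a hypothetical remote method; params must be a tuple.
request_xml = xmlrpc.client.dumps((status,), methodname="report_status")

# What the receiving side would unmarshal from the request body.
params, method_name = xmlrpc.client.loads(request_xml)
received = params[0]
```

In a real client, `xmlrpc.client.ServerProxy` would do this marshalling and the HTTP round trip for you.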

ebo
+2  A: 

Why don't you serialize the dictionary to a file, and upload the file? This way, the server can read the object back into a dictionary.

Geo
+1  A: 

Have you tried using pickle on the data?

hayalci
You should not transfer pickled objects over the network, for security reasons.
ebo
And how is XML safer?
Geo
You can forge pickled data to crash the unpickler. XML parsers are hardened against that.
ebo
I think that's hilarious.
Geo
It's in the docs: http://docs.python.org/library/pickle.html Read the warning...
ebo
Paranoia overkill. It sounds like the request must be SSL encrypted and authenticated, so only authorised users could even attempt this attack.
SpliFF
I'd rather make sure it never happens. Security by Obscurity is not a solution. It's just better to only use pickle locally and use better protocols for network transfer.
ebo
You can verify the MD5 of the file, and know if something's been tampered with.
Geo
With an MD5 value you send over the same connection? I see it as safe to use pickle if you control both ends of the connection. In a webapp the client is normally on a computer controlled by whoever uses your app. You never know what the user is up to. That's why social engineering works. Why take the chance if there are other solutions which are equally easy to implement and much safer?
ebo
ebo is right; you must always assume that some criminal is trying to take over your site. Never assume that all your users will be "good guys". Eventually, someone will find a way to make money from you going to jail.
Aaron Digulla
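To make the pickle concern in the comments above concrete, here is a small, harmless sketch. It relies on the documented `__reduce__` hook: an object can tell pickle to call an arbitrary callable when it is loaded. The class name `Payload` is made up, and the "attack" only evaluates an arithmetic expression; a hostile sender could name any importable callable instead.

```python
import json
import pickle


class Payload:
    """Hypothetical object whose pickle runs code when it is loaded."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" instead of any object state.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# The receiver never asked for eval, yet unpickling calls it.
result = pickle.loads(blob)

# A data-only format such as JSON just yields plain Python values.
parsed = json.loads('{"state": "idle"}')
```

After this runs, `result` is the value returned by `eval`, not a `Payload` instance, which is why untrusted pickles are a code-execution risk rather than merely a crash risk.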
+2  A: 

Do a POST of your python data (use binary as suggested in other answers) and handle security using your webserver. Apache and Microsoft servers can both do authentication using a wide variety of methods (SSL client certs, Password, System accounts, etc...)

Serialising/deserialising to text or XML is probably overkill if you're just going to turn it back into a dictionary again.

SpliFF
+3  A: 

I agree with all the answers about avoiding pickle, if safety is a concern (it might not be if the sender gets authenticated before the data's unpickled -- but, when security's at issue, two levels of defense may be better than one); JSON is often of help in such cases (or, XML, if nothing else will do...!-).

Authentication should ideally be left to the webserver, as SpliFF recommends, and SSL (i.e. HTTPS) is generally good for that. If that's unfeasible, but it's feasible to let client and server share a "secret", then sending the serialized string in encrypted form may be best.
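A minimal sketch of the shared-secret idea, under loudly stated assumptions: it serializes with JSON and *signs* the payload with an HMAC rather than encrypting it. That lets the server detect tampering and confirm the sender knows the secret, but it does not hide the data; confidentiality would still need HTTPS or a real cipher. The secret, function names, and dictionary contents are all hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical secret known to both client and server.
SHARED_SECRET = b"replace-with-a-real-secret"


def sign(data, secret=SHARED_SECRET):
    """Client side: serialize to JSON bytes and compute an HMAC-SHA256 tag."""
    payload = json.dumps(data, sort_keys=True).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return payload, signature


def verify(payload, signature, secret=SHARED_SECRET):
    """Server side: recompute the tag and only then parse the payload."""
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("signature mismatch")
    return json.loads(payload.decode("utf-8"))


status = {"host": "worker-1", "state": "idle"}
payload, signature = sign(status)
restored = verify(payload, signature)
```

Unlike a plain MD5 sent alongside the data, the tag cannot be recomputed by someone who alters the payload without knowing the secret.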

Alex Martelli
+2  A: 

I'd personally use SimpleJSON at both ends and just post the "file" (it would really just be a stream) over as multipart data.
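A rough sketch of that idea using only the Python 3 standard library (`json` is the stdlib descendant of SimpleJSON). The helper `encode_multipart`, the field and file names, and the URL are all hypothetical; the request object is built but never sent.

```python
import json
import urllib.request
import uuid


def encode_multipart(field_name, filename, payload):
    """Wrap one bytes payload as a single multipart/form-data file part."""
    boundary = uuid.uuid4().hex
    disposition = 'Content-Disposition: form-data; name="%s"; filename="%s"' % (
        field_name, filename)
    lines = [
        b"--" + boundary.encode("ascii"),
        disposition.encode("utf-8"),
        b"Content-Type: application/json",
        b"",
        payload,
        b"--" + boundary.encode("ascii") + b"--",
        b"",
    ]
    body = b"\r\n".join(lines)
    headers = {"Content-Type": "multipart/form-data; boundary=" + boundary}
    return body, headers


status = {"host": "worker-1", "state": "idle"}  # hypothetical data
json_bytes = json.dumps(status).encode("utf-8")
body, headers = encode_multipart("status", "status.json", json_bytes)

# Supplying data makes this a POST; the URL is a placeholder.
request = urllib.request.Request(
    "https://example.invalid/status", data=body, headers=headers)
```

On the server, the uploaded part would be read back with `json.loads`, which yields plain data and never runs code.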

But that's me. There are other options.

Oli
A: 

Thanks for the suggestions, I will consider them in turn.

D-Bug