If I have a dictionary full of nested data, how do I store it in a database as a string, and then convert it back to a dictionary when I'm ready to parse it?

Edit: I just want to convert it to a string...and then back to a dictionary.

+3  A: 

Why not use the serialization/deserialization provided by the pickle module?

http://docs.python.org/library/pickle.html

anthares
+1 vote for pickle
arthurprs
+3  A: 

Options:

1) Pickling

2) XML

3) JSON

There are others, I am sure. The choice depends largely on how much portability matters to you.

jldupont
I just want to convert it to a string...and then back to a dictionary.
TIMEX
If you don't care about portability, then go for the pickling (http://docs.python.org/library/pickle.html) functionality. Go for the "cPickle" variant, which is faster.
jldupont
+2  A: 

Best, under your stated conditions:

import cPickle
   ...
thestring = cPickle.dumps(thedict, -1)

The -1 selects the highest pickle protocol, which gives the most efficient serialization and produces a binary string (an arbitrary string of bytes). If you need an ASCII string (because e.g. some Unicode transcoding is going to happen and you can't switch the field's type from, say, TEXT to BLOB), avoid the -1, but you'll then be less efficient.

To get the dict back later from the string, in either case,

thenewdict = cPickle.loads(thestring)
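
As a rough sketch of the same round trip on Python 3 (where the separate cPickle module no longer exists and `pickle` uses the C implementation automatically when it is available), assuming a made-up nested dictionary:

```python
import pickle

# Hypothetical nested data, used only for illustration.
thedict = {"user": {"name": "ann", "tags": ["a", "b"]}, "count": 3}

# -1 selects the highest protocol: compact binary output, suited to a BLOB column.
binary_blob = pickle.dumps(thedict, -1)

# Protocol 0 is the old legacy format: larger and slower to process.
legacy_blob = pickle.dumps(thedict, 0)

# loads() detects the protocol on its own, so both decode the same way.
restored_from_binary = pickle.loads(binary_blob)
restored_from_legacy = pickle.loads(legacy_blob)
```

On Python 3 both calls return `bytes`, not `str`, so the column or field that receives the result has to accept raw bytes.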
Alex Martelli
this will work with utf-8, right? (Just, default strings, I mean.)
TIMEX
`utf-8` is fine in the keys and values, but there is no guarantee that `thestring` will respect `utf-8` encoding (most likely it won't: it's just bytes!). If that's a problem you must omit the `-1` argument (there's a reason why "do the fastest, most effective, and most concise serialization" is _not_ the default but requires that explicit `-1`... actually there are multiple reasons, but this is one of them;-).
Alex Martelli
For safety reasons, I'm going to avoid the -1 :) and just do normal. I don't care about speed anywayz.
TIMEX
If you don't care about speed, size, and functional completeness (ability to pickle objects whose classes define `__slots__` but not `__getstate__`), the old, legacy protocol (what you get by avoiding the `-1`) is fine.
Alex Martelli
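
To make the TEXT versus BLOB point concrete, here is a minimal sketch of storing the pickled bytes in a BLOB column. The question does not say which database is in use, so the choice of `sqlite3`, the `settings` table, and its column names are all assumptions made for illustration:

```python
import pickle
import sqlite3

# Hypothetical nested data and an in-memory database, for illustration only.
thedict = {"user": {"name": "ann", "tags": ["a", "b"]}, "count": 3}
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (id INTEGER PRIMARY KEY, payload BLOB)")

# Binary pickle output goes into the BLOB column as a bound parameter.
payload = pickle.dumps(thedict, -1)
conn.execute("INSERT INTO settings (id, payload) VALUES (?, ?)", (1, payload))

# Read the bytes back and rebuild the dictionary.
row = conn.execute("SELECT payload FROM settings WHERE id = ?", (1,)).fetchone()
thenewdict = pickle.loads(row[0])
conn.close()
```

Only unpickle data that your own application wrote: loading a pickle from an untrusted source can run arbitrary code.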
+1  A: 

You have two options

  • use a standard serialization format (json, xml, yaml, ...)

    • pros: you can access the data from any language that can parse those formats (in the worst case you can write your own parser)
    • cons: could be slower to save and load the data (this depends mostly on the implementation)
  • use cPickle:

    • pros: easy to use, fast, and the native Python way to do serialization.
    • cons: only Python-based apps can access the data.
Felipe
+1  A: 

There are any number of serialization methods out there. JSON is readable, reasonably compact, supported natively, and portable. I prefer it over pickle because JSON is portable and because pickle can execute arbitrary code when loading, which can introduce security holes.
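
A minimal sketch of the JSON round trip with the standard `json` module, assuming the dictionary holds only JSON-compatible values (strings, numbers, booleans, None, lists, and dicts); the sample data is made up:

```python
import json

# Hypothetical nested data, used only for illustration.
thedict = {
    "user": {"name": "ann", "tags": ["a", "b"]},
    "limits": {"max": 10, "ratio": 0.5},
}

# dumps() returns an ordinary text string, suitable for a TEXT column.
thestring = json.dumps(thedict)

# loads() rebuilds the nested dictionary from that string.
thenewdict = json.loads(thestring)

# Caveat: JSON object keys are always strings, so non-string keys change type.
int_keys_roundtrip = json.loads(json.dumps({1: "a"}))
```

The last line shows the main thing to watch for: a dictionary with non-string keys does not come back identical, and tuples come back as lists.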

Depending on your data's layout, you may also be able to use your ORM to directly map the data into database constructs.

Mike Graham