Let's say I have something like this:

d = { "abc" : [1, 2, 3], "qwerty" : [4,5,6] }

What's the easiest way to programmatically get that into a file that I can load from Python later?

Can I somehow save it as Python source (from within a Python script, not manually!), then import it later?

Or should I use JSON or something?

A: 

If you want to save it in an easy-to-read, JSON-like format, use repr to serialize the object and eval to deserialize it.

repr(object) -> string

Return the canonical string representation of the object. For most object types, eval(repr(object)) == object.
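A minimal sketch of that round trip, assuming the data contains only plain literals (dicts, lists, strings, numbers and the like). It swaps in ast.literal_eval for eval, which accepts literals only and does not run arbitrary expressions; the temporary file path is made up for the illustration.

```python
import ast
import os
import tempfile

d = {"abc": [1, 2, 3], "qwerty": [4, 5, 6]}

# Hypothetical location, chosen only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "d.txt")

# Serialize: write the repr of the dict as plain text.
with open(path, "w") as f:
    f.write(repr(d))

# Deserialize: literal_eval parses Python literals only, so it raises
# ValueError on anything else (function calls, custom objects, ...).
with open(path) as f:
    loaded = ast.literal_eval(f.read())

print(loaded == d)
```

The file ends up holding the dict exactly as you would type it in source, so it stays easy to read and edit by hand.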

John Kugelman
Consider ast.literal_eval() (http://docs.python.org/library/ast.html#ast.literal_eval) as an alternative to eval().
Miles
The main thing I don't like about this solution is that if you have an object in the structure where the eval(repr()) identity doesn't hold, repr() will "succeed" but then eval() will barf.
Miles
@John You will be pilloried for that answer... where's S.Lott?
mhawke
pickle, YAML, JSON, etc. are all safer and work with more types than this method. IMO, eval() should be avoided whenever possible.
Jason Creighton
Heh I should've known to put on my asbestos pants before suggesting eval! It's a fair cop.
John Kugelman
@Jason: Actually, pickle is not any safer than eval: malicious input can execute code just as easily, and here at least it is obvious that it is doing so, so I think downvoting this is a little unfair. There are other reasons to avoid eval() (e.g. it only handles objects with evalable repr()s and silently loses data if they don't self-eval, as Miles pointed out), but security-wise, it's no worse than pickle.
Brian
@Brian: Good point, I had not considered that. But it is the case that, of the alternatives I list, pickle and YAML work with more data types than repr()/eval(), and YAML and JSON are safer. So I still think eval() is a bad idea here.
Jason Creighton
+14  A: 

Use the pickle module.

import pickle
d = { "abc" : [1, 2, 3], "qwerty" : [4,5,6] }
afile = open(r'C:\d.pkl', 'wb')
pickle.dump(d, afile)
afile.close()

#reload object from file
file2 = open(r'C:\d.pkl', 'rb')
new_d = pickle.load(file2)
file2.close()

#print dictionary object loaded from file
print(new_d)
ecounysis
What's the r in front of the path mean?
Blorgbeard
Also, that's giving me "TypeError: can't write bytes to text stream" - is it any different for Python 3.0?
Blorgbeard
The r'' denotes a raw string, described here: http://docs.python.org/reference/lexical_analysis.html#string-literals. Basically, it means that backslashes in the string are included as literal backslashes, not character escapes (though a raw string can't end in a backslash).
Miles
I've corrected the example—the file needs to be opened in binary mode. It still needs to be for Python 2, but it won't fail as dramatically.
Miles
Make sure you read the Python documentation (including for the appropriate version) and don't just rely on examples! :) http://docs.python.org/3.0/library/pickle.html (Sorry for the comment spam!)
Miles
I doubt that, since your original example didn't open afile in write mode. ;) But as for the binary mode, in Python 2, it might work (since the binary flag has basically no effect on Linux and OS X) but is non-portable and can run into trouble on Windows if the resulting file contains newline or DOS EOF characters.
Miles
Technically pickling will work for text mode files, so long as you're not using a binary pickle format (i.e. protocol = 0) and you use it consistently (i.e. also use text mode for reading back). Using binary is generally a better idea though, especially if you could be moving data between platforms.
Brian
+7  A: 

Take your pick: Python Standard Library - Data Persistence. Which one is most appropriate depends on your specific needs.

pickle is probably the simplest and most capable as far as "write an arbitrary object to a file and recover it" goes—it can automatically handle custom classes and circular references.

For the best pickling performance (speed and space), use cPickle at HIGHEST_PROTOCOL.
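A rough sketch of the highest-protocol variant, written for Python 3, where there is no separate cPickle module and plain pickle uses the C implementation when it is available. The temporary path is made up for the example.

```python
import os
import pickle
import tempfile

d = {"abc": [1, 2, 3], "qwerty": [4, 5, 6]}

# Hypothetical location, chosen only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "d.pkl")

# The higher protocols are binary formats, so the file must be
# opened in binary mode for both writing and reading.
with open(path, "wb") as f:
    pickle.dump(d, f, protocol=pickle.HIGHEST_PROTOCOL)

# load() detects the protocol from the data itself.
with open(path, "rb") as f:
    new_d = pickle.load(f)

print(new_d == d)
```

One trade-off to keep in mind: a file written with the highest protocol may not be readable by older Python versions that predate that protocol.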

Miles
+3  A: 

Try the shelve module, which gives you a persistent dictionary, for example:

import shelve
d = { "abc" : [1, 2, 3], "qwerty" : [4,5,6] }

shelf = shelve.open('shelf_file')
for key,val in d.items():
    shelf[key] = val

shelf.close()

....

# reopen the shelf
shelf = shelve.open('shelf_file')
print(dict(shelf)) # => {'qwerty': [4, 5, 6], 'abc': [1, 2, 3]} (key order may vary)
mhawke
+1  A: 

Just to add to the previous suggestions, if you want the file format to be easily readable and modifiable, you can also use YAML. It works extremely well for nested dicts and lists, and it also scales to more complex data structures (i.e. ones involving custom objects). Its big plus is that the format is readable.

Eli Bendersky
+1  A: 

JSON has faults, but when it meets your needs, it is also:

  • simple to use
  • included in the standard library as the json module
  • interface somewhat similar to pickle, which can handle more complex situations
  • human-editable text for debugging, sharing, and version control
  • usually valid Python code (the exceptions are true, false, and null)
  • well-established on the web (if your program touches any of that domain)
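
For the dict in the question, a minimal sketch with the json module could look like this. The temporary path is made up for the example, and indent=2 is only there to make the file pleasant to read.

```python
import json
import os
import tempfile

d = {"abc": [1, 2, 3], "qwerty": [4, 5, 6]}

# Hypothetical location, chosen only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "d.json")

# JSON is text, so the file is opened in text mode.
with open(path, "w") as f:
    json.dump(d, f, indent=2)

with open(path) as f:
    loaded = json.load(f)

print(loaded == d)
```

This round trip is exact only for JSON-friendly data: dict keys come back as strings and tuples come back as lists, which is one of the faults to weigh against the benefits listed above.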
Roger Pate
+2  A: 

You might also want to take a look at Zope's Object Database as your data gets more complex :-) It's probably overkill for what you have, but it scales well and is not too hard to use.

DoxaLogos