ansaurus

Question

Easiest way to persist a data structure to a file in python?

Answer 1

A:

If you want to save it in an easy-to-read, JSON-like format, use repr to serialize the object and eval to deserialize it.

repr(object) -> string

Return the canonical string representation of the object. For most object types, eval(repr(object)) == object.

John Kugelman 2009-06-26 04:26:38
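
A minimal sketch of the round trip described in the answer above, assuming the structure holds only Python literals (dicts, lists, strings, numbers and the like). It swaps eval for ast.literal_eval, which parses literals without executing code. The file name and temporary directory are made up for the example.

```python
import ast
import os
import tempfile

data = {"abc": [1, 2, 3], "qwerty": [4, 5, 6]}

# Hypothetical file location, chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), "data.txt")

# Serialize: write the repr() of the structure as plain text.
with open(path, "w") as f:
    f.write(repr(data))

# Deserialize: literal_eval accepts literals only, never function calls.
with open(path) as f:
    restored = ast.literal_eval(f.read())

print(restored == data)
```

The resulting file is ordinary text, so it can be inspected or edited by hand.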

Consider ast.literal_eval() (http://docs.python.org/library/ast.html#ast.literal_eval) as an alternative to eval().

Miles 2009-06-26 04:33:02

The main thing I don't like about this solution is that if you have an object in the structure where the eval(repr()) identity doesn't hold, repr() will "succeed" but then eval() will barf.

Miles 2009-06-26 04:37:09
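
A short sketch of the failure mode described in the comment above. Point is a hypothetical class with no custom __repr__, so its default repr is not valid Python source. The example parses with ast.literal_eval; plain eval rejects the same text.

```python
import ast


class Point(object):
    """Hypothetical class that relies on the default repr()."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# repr() "succeeds", but the text contains something like
# <...Point object at 0x...>, which is not a Python expression.
text = repr({"origin": Point(0, 0)})

try:
    ast.literal_eval(text)
    round_trip_ok = True
except (SyntaxError, ValueError):
    round_trip_ok = False

print(round_trip_ok)  # prints False
```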

@John You will be pilloried for that answer... where's S.Lott?

mhawke 2009-06-26 04:44:34

pickle, YAML, JSON, etc. are all safer and work with more types than this method. IMO, eval() should be avoided whenever possible.

Jason Creighton 2009-06-26 05:05:19

Heh I should've known to put on my asbestos pants before suggesting eval! It's a fair cop.

John Kugelman 2009-06-26 05:13:03

@Jason: Actually, pickle is not any safer than eval - malicious input can execute code just as easily, and here at least it is obvious that it is doing so, so I think downvoting this is a little unfair. There are other reasons to avoid eval() (e.g. it only handles objects with evalable repr()s and silently loses data if they don't self-eval, as Miles pointed out), but security-wise, it's no worse than pickle.

Brian 2009-06-26 06:57:07

@Brian: Good point, I had not considered that. But it is the case that, of the alternatives I list, pickle and YAML work with more data types than repr()/eval(), and YAML and JSON are safer. So I still think eval() is a bad idea here.

Jason Creighton 2009-06-26 14:46:45

Answer 2

+14 A:

Use the pickle module.

import pickle

d = { "abc" : [1, 2, 3], "qwerty" : [4,5,6] }

# write the object to a file (binary mode; the with block closes it)
with open(r'C:\d.pkl', 'wb') as afile:
    pickle.dump(d, afile)

# reload object from file
with open(r'C:\d.pkl', 'rb') as file2:
    new_d = pickle.load(file2)

# print dictionary object loaded from file
print(new_d)

ecounysis 2009-06-26 04:27:36

What's the r in front of the path mean?

Blorgbeard 2009-06-26 04:49:31

Also, that's giving me "TypeError: can't write bytes to text stream" - is it any different for Python 3.0?

Blorgbeard 2009-06-26 04:51:53

The r'' denotes a raw string, described here: http://docs.python.org/reference/lexical_analysis.html#string-literals. Basically, it means that backslashes in the string are included as literal backslashes, not character escapes (though a raw string can't end in a backslash).

Miles 2009-06-26 05:01:13
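
To illustrate the raw-string explanation above, here is a small sketch. The path is hypothetical and was picked because it contains \n and \t, which a normal string literal turns into a newline and a tab.

```python
# Raw string: every backslash stays a literal backslash.
raw_path = r'C:\new\table.pkl'

# Normal string: \n becomes a newline and \t becomes a tab.
plain_path = 'C:\new\table.pkl'

# The raw string equals the version with doubled backslashes.
same_as_escaped = (raw_path == 'C:\\new\\table.pkl')

print(len(raw_path))     # prints 16
print(len(plain_path))   # prints 14
print(same_as_escaped)   # prints True
```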

I've corrected the example: the file needs to be opened in binary mode. Binary mode is still the right choice on Python 2, but text mode won't fail as dramatically there.

Miles 2009-06-26 05:02:10

Make sure you read the Python documentation (including for the appropriate version) and don't just rely on examples! :) http://docs.python.org/3.0/library/pickle.html (Sorry for the comment spam!)

Miles 2009-06-26 05:05:33

I doubt that, since your original example didn't open afile in write mode. ;) But as for the binary mode, in Python 2, it might work (since the binary flag has basically no effect on Linux and OS X) but is non-portable and can run into trouble on Windows if the resulting file contains newline or DOS EOF characters.

Miles 2009-06-26 05:22:10

Technically pickling will work for text mode files, so long as you're not using a binary pickle format (i.e. protocol = 0) and you use it consistently (i.e. also use text mode for reading back). Using binary is generally a better idea though, especially if you could be moving data between platforms.

Brian 2009-06-26 06:49:30
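
A rough sketch of the difference between the text-style protocol 0 and a binary protocol, as discussed in the comment above. It only inspects the serialized bytes. Note that on Python 3, pickle.dump writes bytes whatever the protocol, so the file has to be opened in binary mode there regardless.

```python
import pickle

d = {"abc": [1, 2, 3], "qwerty": [4, 5, 6]}

# Protocol 0 is the original, text-oriented format.
text_pickle = pickle.dumps(d, protocol=0)

# Higher protocols are binary formats.
binary_pickle = pickle.dumps(d, protocol=pickle.HIGHEST_PROTOCOL)

# For this data, protocol 0 output is made of ASCII bytes only.
is_ascii = all(byte < 128 for byte in bytearray(text_pickle))

# Protocol 2 and later start with the PROTO opcode, byte 0x80.
starts_with_proto = bytearray(binary_pickle)[0] == 0x80

print(is_ascii)           # prints True
print(starts_with_proto)  # prints True
```

Both forms load back to the same dictionary with pickle.loads.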

Answer 3

+7 A:

Take your pick: Python Standard Library - Data Persistence. Which one is most appropriate can vary by what your specific needs are.

pickle is probably the simplest and most capable as far as "write an arbitrary object to a file and recover it" goes—it can automatically handle custom classes and circular references.

For the best pickling performance (speed and space), use cPickle at HIGHEST_PROTOCOL.

Miles 2009-06-26 04:28:03
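
A minimal sketch of the circular-reference handling mentioned in the answer above, written at HIGHEST_PROTOCOL. It uses the plain pickle module (cPickle exists only on Python 2) and built-in containers. The data and file name are made up for the example. Custom classes work the same way, provided the class can be imported when the file is loaded.

```python
import os
import pickle
import tempfile

# Two dicts that refer to each other: a circular reference.
team = {"name": "core", "members": []}
member = {"name": "miles", "team": team}
team["members"].append(member)

# Hypothetical file location, chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), "team.pkl")

with open(path, "wb") as f:
    pickle.dump(team, f, protocol=pickle.HIGHEST_PROTOCOL)

with open(path, "rb") as f:
    restored = pickle.load(f)

# The cycle survives: the member's team is the restored object itself.
cycle_preserved = restored["members"][0]["team"] is restored
print(cycle_preserved)  # prints True
```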

Answer 4

+3 A:

Try the shelve module, which will give you a persistent dictionary, for example:

import shelve
d = { "abc" : [1, 2, 3], "qwerty" : [4,5,6] }

shelf = shelve.open('shelf_file')
for key,val in d.items():
    shelf[key] = val

shelf.close()

....

# reopen the shelf
shelf = shelve.open('shelf_file')
print(dict(shelf)) # => {'qwerty': [4, 5, 6], 'abc': [1, 2, 3]} (key order may vary)
shelf.close()

mhawke 2009-06-26 04:33:30

Answer 5

+1 A:

Just to add to the previous suggestions, if you want the file format to be easily readable and modifiable, you can also use YAML. It works extremely well for nested dicts and lists, but scales for more complex data structures (i.e. ones involving custom objects) as well, and its big plus is that the format is readable.

Eli Bendersky 2009-06-26 04:45:57

Answer 6

+1 A:

JSON has faults, but when it meets your needs, it is also:

simple to use
included in the standard library as the json module
interface somewhat similar to pickle, which can handle more complex situations
human-editable text for debugging, sharing, and version control
often valid Python code (apart from true, false, and null)
well-established on the web (if your program touches any of that domain)

Roger Pate 2009-06-26 05:07:48
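
A minimal sketch of the json module in use, mirroring the pickle example from the earlier answer. The file name is made up, and the indent and sort_keys arguments are optional choices that make the file easier to read and diff.

```python
import json
import os
import tempfile

d = {"abc": [1, 2, 3], "qwerty": [4, 5, 6]}

# Hypothetical file location, chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), "d.json")

# JSON is text, so the file is opened in text mode.
with open(path, "w") as f:
    json.dump(d, f, indent=2, sort_keys=True)

with open(path) as f:
    new_d = json.load(f)

print(new_d == d)
```

This works because the data contains only strings, numbers and lists. JSON has no tuple or set types, and dictionary keys are always stored as strings.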

Answer 7

+2 A:

You also might want to take a look at Zope's Object Database (ZODB) as your data gets more complex :-) Probably overkill for what you have, but it scales well and is not too hard to use.

DoxaLogos 2009-06-26 05:15:03
