ansaurus

Question

How do I store then retrieve Python-native data structures into and from a file?

Answer 1

A:

Pickling is good if you have Python-specific objects to save. If they're just generic data in some basic container type then JSON is fine.

>>> import json, sys
>>> json.dumps(['Chicken', 'Sheep', 'Cattle', 'Horse'])
'["Chicken", "Sheep", "Cattle", "Horse"]'
>>> json.dump(['Chicken', 'Sheep', 'Cattle', 'Horse'], sys.stdout); print()
["Chicken", "Sheep", "Cattle", "Horse"]
>>> json.loads('["Chicken", "Sheep", "Cattle", "Horse"]')
['Chicken', 'Sheep', 'Cattle', 'Horse']
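
To write the same list to a file rather than to a string or stdout, a minimal sketch might look like the following. It assumes Python 3, and the file name `animals.json` is only a placeholder, not something from the question.

```python
import json
import os
import tempfile

animals = ['Chicken', 'Sheep', 'Cattle', 'Horse']

# Placeholder path for illustration; any writable location works.
path = os.path.join(tempfile.gettempdir(), 'animals.json')

# json.dump writes text, so the file is opened in text mode.
with open(path, 'w', encoding='utf-8') as f:
    json.dump(animals, f)

# json.load parses the file back into plain Python containers.
with open(path, 'r', encoding='utf-8') as f:
    restored = json.load(f)

print(restored == animals)
```

Unlike a pickle file, the result is readable in any text editor and by other languages.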

Ignacio Vazquez-Abrams 2010-08-28 05:40:52

Answer 2

+2 A:

The pickle module (or, on Python 2, its faster equivalent cPickle; Python 3's pickle uses the C implementation automatically) should serve your needs well.

Specifically:

# data_miner.py
import pickle

animals = ['Chicken', 'Sheep', 'Cattle', 'Horse']
population = [150, 200, 50, 30]

with open('data_miner.pik', 'wb') as f:
    pickle.dump([animals, population], f, -1)  # -1 selects the highest pickle protocol

and

# plotter.py
import pickle

with open('data_miner.pik', 'rb') as f:
    animals, population = pickle.load(f)

print(animals, population)

Here, I've made data_miner.py quite explicit regarding what needs to be saved (always an excellent idea to be very explicit unless you have extremely specific reasons to do otherwise). Some things (such as modules and open files) cannot be pickled anyway, so a simple pickling of globals() would not work.

If you absolutely must, you could make a copy of globals() and remove all objects whose types make them unsuitable for saving. Perhaps better, religiously use a leading _ in every name you don't want to save (so import pickle as _pickle, with open ... as _f, and so forth) and exclude all names with a leading underscore from the copy of globals(). With such an approach, pickle.load would return a dict, and the variables of interest would then be extracted from it by indexing.

However, I would strongly recommend the simple alternative of saving a list (or dict, if you want ;-) with the specific values that are actually of interest, rather than taking a "wholesale" approach.

Alex Martelli 2010-08-28 05:42:01

+1 for the extra info about `globals()`. Something new for me :)

Kit 2010-10-22 01:02:38

Answer 3

+1 A:

pickle was designed for this. Use pickle.dump to write an object to a file and pickle.load to read it back.

>>> import pickle
>>> data = {'animals': ['Chicken', 'Sheep', 'Cattle', 'Horse'], 'population': [150, 200, 50, 30]}
>>> f = open('spam.p', 'wb')
>>> pickle.dump(data, f)
>>> f.close()
>>> f = open('spam.p', 'rb')
>>> pickle.load(f)
{'animals': ['Chicken', 'Sheep', 'Cattle', 'Horse'], 'population': [150, 200, 50, 30]}
>>> f.close()

dan04 2010-08-28 05:42:12

I see that you converted `data` into another structure, which I would like to avoid, since my data is already structured in rather complicated nested lists. Is there a way to make `data_miner.py` do something at the end like `save all variables within my scope` and `store it to some binary file`?

Kit 2010-08-28 05:49:10

You can use `locals()` to get a `dict` containing all the local variables.
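
A rough sketch of that idea, assuming Python 3 and a hypothetical function `mine_data` standing in for the body of `data_miner.py`:

```python
import pickle


def mine_data():
    # Hypothetical stand-in for the work done in data_miner.py.
    animals = ['Chicken', 'Sheep', 'Cattle', 'Horse']
    population = [150, 200, 50, 30]
    # locals() returns a dict mapping each local name to its value,
    # so every local variable defined so far is pickled in one call.
    return pickle.dumps(locals(), pickle.HIGHEST_PROTOCOL)


snapshot = pickle.loads(mine_data())
print(sorted(snapshot))
```

This only works while every local value is picklable; a single open file or module among the locals makes the whole `dumps` call fail.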

dan04 2010-08-28 06:28:22

Answer 4

A:

As already suggested, pickle is usually used here. Keep in mind that not everything is serializable (e.g. files, sockets, database connections).
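
One way to check in advance is simply to try pickling the object and catch the failure. The sketch below assumes Python 3; `is_picklable` is a hypothetical helper, not a function provided by the pickle module.

```python
import os
import pickle


def is_picklable(obj):
    """Hypothetical helper: True if pickle can serialize obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


plain_ok = is_picklable(['Chicken', 150])   # ordinary containers are fine
module_ok = is_picklable(pickle)            # modules cannot be pickled

# An open file object cannot be pickled either.
with open(os.devnull) as f:
    file_ok = is_picklable(f)

print(plain_ok, module_ok, file_ok)
```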

With simple data structures you can also choose json or yaml. The latter is actually pretty readable and editable.

Ivo van der Wijk 2010-08-28 08:07:21

Incidentally, you can also store complex data structures in JSON and YAML.

Mike Graham 2010-08-28 14:29:36

But not arbitrary objects (leaving aside those that can't be serialized at all), or structures with references (possibly circular), and so on. Right?
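
The circular-reference case is easy to try. A small sketch, assuming Python 3 and the standard-library `json` and `pickle` modules with default settings:

```python
import json
import pickle

nested = ['Chicken', 'Sheep']
nested.append(nested)  # the list now contains itself

# pickle tracks objects it has already seen, so the cycle survives.
clone = pickle.loads(pickle.dumps(nested))
print(clone[2] is clone)

# json checks for cycles by default and refuses to encode them.
json_error = None
try:
    json.dumps(nested)
except ValueError as exc:
    json_error = exc
print('json refused:', json_error)
```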

Ivo van der Wijk 2010-08-28 14:59:30
