I am working on a project where we have a large number of objects being serialized and stored to disk using pickle/cPickle.

As the life of the project progresses (after release to customers in the field), it is likely that future features/fixes will require us to change the signature of some of our persisted objects. This could be the addition of fields, the removal of fields, or even just a change to the invariants on a piece of data.

Is there a standard way to mark an object that will be pickled as having a certain version (like serialVersionUID in Java)? Basically, if I am restoring an instance of Foo version 234 but the current code is 236, I want to receive some notification on unpickle. Should I just go ahead and roll my own solution (which could be a PITA)?

Thanks

+2  A: 

The pickle format has no such proviso. Why don't you just make the "serial version number" part of the object's attributes, to be pickled right along with the rest? Then the "notification" can be trivially had by comparing actual and desired version -- don't see why it should be a PITA.
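A minimal sketch of this idea, not part of the original answer: the version is stored next to the other attributes via `__getstate__` and compared in `__setstate__`. The names `SCHEMA_VERSION`, `_schema_version`, `loaded_version` and `is_stale` are made up for illustration, and the code assumes Python 3 rather than cPickle.

```python
import pickle


class Foo:
    # Hypothetical version number; bump it whenever the persisted layout changes.
    SCHEMA_VERSION = 236

    def __init__(self, name):
        self.name = name

    def __getstate__(self):
        # Pickle the instance attributes plus the version they were written with.
        state = self.__dict__.copy()
        state.pop("loaded_version", None)  # bookkeeping only, not persisted
        state["_schema_version"] = self.SCHEMA_VERSION
        return state

    def __setstate__(self, state):
        state = dict(state)
        # Objects pickled before versioning was introduced carry no tag: treat as 0.
        self.loaded_version = state.pop("_schema_version", 0)
        self.__dict__.update(state)

    def is_stale(self):
        # True when the object was restored from a different version of the class.
        return getattr(self, "loaded_version", self.SCHEMA_VERSION) != self.SCHEMA_VERSION


if __name__ == "__main__":
    restored = pickle.loads(pickle.dumps(Foo("example")))
    print(restored.loaded_version, restored.is_stale())
```

The "notification" here is just the `is_stale()` check; the restore code could equally log a warning, raise, or run a migration when the versions differ.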

Alex Martelli
Yeah, that's the direction I think we are going to take. I think I may have overestimated the level of effort required to add in and check this data. Since we restore all our saved state in one place, adding whatever logic we need (dealing with unversioned objects, or objects that were previously unversioned and now are versioned) shouldn't be too bad. Just thought I would ping the community to see if pickle provided this behaviour and I was reinventing the wheel.
Paul Osborne
A: 

Alex Martelli makes a good point. Also consider that in a key-value system one could simply include a tag like #serialv411 as part of the key, letting the pickled object be the value.
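As a rough illustration of the key-tagging idea (a plain dict stands in for the key-value store, and `CURRENT_VERSION`, `make_key`, `split_key`, `save` and `load` are hypothetical helpers, not part of any library):

```python
import pickle

CURRENT_VERSION = 411  # hypothetical version of the code doing the reading


def make_key(name, version=CURRENT_VERSION):
    # Append a tag such as "#serialv411" to the logical key.
    return f"{name}#serialv{version}"


def split_key(key):
    # Recover the logical name and the version number from a tagged key.
    name, sep, tag = key.rpartition("#serialv")
    if not sep:
        raise ValueError(f"key has no version tag: {key!r}")
    return name, int(tag)


def save(store, name, obj):
    # The pickled object is the value; the version lives only in the key.
    store[make_key(name)] = pickle.dumps(obj)


def load(store, key):
    # Return the object and a flag saying whether it came from another version.
    _, version = split_key(key)
    return pickle.loads(store[key]), version != CURRENT_VERSION
```

With this layout the version can be checked without unpickling anything, since it is readable from the key alone.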

Here's a NoPITA NoSQL implementation which applies to any arbitrary Python object, not just the dictionaries typically used to represent schemaless data: y_serial.py module :: warehouse Python objects with SQLite

"Serialization + persistance :: in a few lines of code, compress and annotate Python objects into SQLite; then later retrieve them chronologically by keywords without any SQL. Most useful "standard" module for a database to store schema-less data."

http://yserial.sourceforge.net

Check it out... hope this is helpful for your project.

PS -- big thanks to Alex for his cookbook ;-)

code43