Hi,

Do you have any insights into the most elegant way of persisting objects from a dynamic language in a document database?

I have a solid background in C# and have just started programming in Python. At the same time I am trying to learn the ropes of MongoDB.

Now I am wondering: what is the most elegant way to persist my data to the MongoDB database? I have considered several approaches:

  1. Make all my Python classes able to create a graph of dictionaries and lists representing their state. Moreover, make them able to initialize their state from such a graph. When I want to persist an object, I will ask it for its graph representation and persist that. When I want to get an object, I will retrieve a document graph and provide this to the __init__ method of my class.

  2. Create a separate Mapper class capable of inspecting a given object and creating a graph of dictionaries and lists, which I may then store in MongoDB. The mapper would also be responsible for creating objects whose data has been retrieved from the database.

  3. I tried out mongoengine, a document-object mapper. However, I was disappointed when it forced me to derive my classes from a particular class (Document). It reminded me of Microsoft's Entity Framework 1.0 and the lack of POCO support. I don't want to be forced to derive from a particular class. It doesn't feel right, but I am unsure whether this is really a problem in a dynamic language.
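To make option 1 concrete, here is a minimal sketch of what I have in mind. The class names and the `to_document`/`from_document` methods are hypothetical, made up for illustration. Instead of overloading `__init__`, the sketch uses a classmethod to rebuild the object, which may or may not be the idiomatic choice.

```python
class Address:
    def __init__(self, street, city):
        self.street = street
        self.city = city

    def to_document(self):
        # Plain dict of primitives, suitable for a document store.
        return {"street": self.street, "city": self.city}

    @classmethod
    def from_document(cls, doc):
        return cls(doc["street"], doc["city"])


class Person:
    def __init__(self, name, addresses=None):
        self.name = name
        self.addresses = list(addresses or [])

    def to_document(self):
        # Nested objects contribute their own sub-graphs.
        return {
            "name": self.name,
            "addresses": [a.to_document() for a in self.addresses],
        }

    @classmethod
    def from_document(cls, doc):
        # Extra keys such as MongoDB's "_id" are simply ignored here.
        return cls(
            doc["name"],
            [Address.from_document(a) for a in doc["addresses"]],
        )


person = Person("Ada", [Address("1 Main St", "London")])
doc = person.to_document()
restored = Person.from_document(doc)

# With pymongo this would presumably be something like (not run here):
#   collection.insert_one(doc)
#   restored = Person.from_document(collection.find_one({"name": "Ada"}))
```

Option 2 would move the two methods out of the classes and into a separate mapper, at the cost of the mapper needing to know (or discover) each class's attributes.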

Is my thinking being hindered by my background in C#? I am sure I haven't grokked the extent of the flexibility that a dynamic language provides, so any advice or hints at best practices would be greatly appreciated.

Thank you.

A: 

Python defines several special methods, such as `__getstate__` and `__setstate__` among others, that let your classes define exactly how best to serialize and de-serialize their instances. They're all used internally by the pickle module (which then uses this information to produce a "blob", i.e. a string of bytes, and to restore objects from such blobs). If you want the better indexing you get by storing graphs directly rather than as opaque blobs, it's basically a question of tweaking the pickle procedures to stop just before turning the graphs into blobs. I think you'll have to do it by copy-paste-edit of pickle.py (as it's not designed to be customized in this way by more elegant methods such as subclassing), but that should still save you lots of work compared with redoing it all from scratch.

I believe this approach lies somewhere between your options 1 and 2 -- classes need to define such special methods only in response to specific needs, and most of the work needed to orchestrate the various possibilities will be handled by your pickle-variant (much as it's handled by pickle itself for the "normal" case where the serialized form is a blob).
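As a rough illustration of the idea, here is a small hand-written walker rather than an actual edit of pickle.py. It honors `__getstate__` and `__setstate__` where a class defines them and falls back to the instance `__dict__` otherwise. Everything named here (`REGISTRY`, `persistable`, `to_graph`, `from_graph`, the `_type` key, the example classes) is hypothetical, and the sketch assumes that state is always a dict and that values are primitives, lists, dicts, or registered objects.

```python
REGISTRY = {}       # hypothetical: maps a stored type name back to its class
TYPE_KEY = "_type"  # hypothetical marker key written into each sub-document
PRIMITIVES = (str, int, float, bool, type(None))


def persistable(cls):
    """Class decorator that registers a class so it can be restored by name."""
    REGISTRY[cls.__name__] = cls
    return cls


def to_graph(obj):
    """Turn an object into nested dicts and lists of primitives."""
    if isinstance(obj, PRIMITIVES):
        return obj
    if isinstance(obj, (list, tuple)):
        return [to_graph(item) for item in obj]
    if isinstance(obj, dict):
        return {key: to_graph(value) for key, value in obj.items()}
    cls = type(obj)
    # Use the class's own idea of its state when it provides one.
    state = obj.__getstate__() if hasattr(cls, "__getstate__") else obj.__dict__
    graph = to_graph(state if state is not None else {})
    graph[TYPE_KEY] = cls.__name__
    return graph


def from_graph(graph):
    """Rebuild objects from a graph produced by to_graph."""
    if isinstance(graph, list):
        return [from_graph(item) for item in graph]
    if not isinstance(graph, dict):
        return graph
    values = {k: from_graph(v) for k, v in graph.items() if k != TYPE_KEY}
    if TYPE_KEY not in graph:
        return values  # an ordinary dict, not a serialized object
    cls = REGISTRY[graph[TYPE_KEY]]
    obj = cls.__new__(cls)  # like pickle, bypass __init__
    if hasattr(obj, "__setstate__"):
        obj.__setstate__(values)
    else:
        obj.__dict__.update(values)
    return obj


@persistable
class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance
        self._cache = {}  # transient, should not be stored

    def __getstate__(self):
        state = dict(self.__dict__)
        del state["_cache"]
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._cache = {}


@persistable
class Customer:  # no special methods needed: default behaviour suffices
    def __init__(self, name, accounts):
        self.name = name
        self.accounts = accounts


graph = to_graph(Customer("Ada", [Account("Ada", 10)]))
restored = from_graph(graph)
```

Here `Account` customizes its stored form only because it has a transient attribute, while `Customer` needs nothing extra. An explicit registry is used instead of importing classes by name, so that data coming back from the database cannot trigger arbitrary imports.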

Alex Martelli