I have a Python dictionary that I would like to store in Google's BigTable datastore (it is an attribute in a db.Model class).

Is there an easy way to do this, e.g. using a db.DictionaryProperty? Or do I have to use pickle to serialize my dictionary? My dictionary is relatively straightforward. Its keys are strings, but the values for some keys may themselves be dictionaries. For example:

{ 
    'myKey' : 100,
    'another' : 'aha',
    'a sub dictionary' : { 'a': 1, 'b':2 }
}

PS: I would like to serialize as binary, not text if possible.

+1  A: 

I think you cannot avoid serializing your objects.

I would define the following model to store each key-value pair:

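import pickle

from google.appengine.ext import db

class DictModel(db.Model):
    # Pickled value; the entity's key_name holds the dictionary key.
    value = db.TextProperty()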

To save to the datastore I'd use:

def set_value(key, value):
    # Use a separate name for the entity so the key argument is not shadowed.
    entity = DictModel(value=pickle.dumps(value), key_name=key)
    entity.put()
    return entity

And to retrieve data:

def get_value(key):
    return pickle.loads(DictModel.get_by_key_name(key).value)
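
Since the question asks for binary rather than text serialization, here is a minimal sketch of the pickle round trip on its own, outside the datastore. It assumes a binary protocol (pickle.HIGHEST_PROTOCOL); with output like that, a blob-type property is probably a better fit than a TextProperty, but that part is an assumption and is not exercised here.

```python
import pickle

# Sample dictionary from the question, including a nested dictionary.
data = {
    'myKey': 100,
    'another': 'aha',
    'a sub dictionary': {'a': 1, 'b': 2},
}

# A binary protocol yields a compact byte string rather than text.
payload = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)

# Loading restores an equal, but distinct, dictionary.
restored = pickle.loads(payload)
```

Nested dictionaries need no special handling, because pickle walks the whole structure.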
jbochi
+3  A: 

Here's another approach:

import pickle

from google.appengine.ext import db

class DictProperty(db.Property):
  data_type = dict

  def get_value_for_datastore(self, model_instance):
    value = super(DictProperty, self).get_value_for_datastore(model_instance)
    return db.Blob(pickle.dumps(value))

  def make_value_from_datastore(self, value):
    if value is None:
      return dict()
    return pickle.loads(value)

  def default_value(self):
    if self.default is None:
      return dict()
    else:
      return super(DictProperty, self).default_value().copy()

  def validate(self, value):
    if not isinstance(value, dict):
      raise db.BadValueError('Property %s must be a dict instance, got %r'
                             % (self.name, value))
    return super(DictProperty, self).validate(value)

  def empty(self, value):
    return value is None
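
To make the two conversion hooks easier to follow, here is a rough standalone sketch of the same logic without the App Engine SDK. The function names to_datastore and from_datastore are hypothetical stand-ins for get_value_for_datastore and make_value_from_datastore; they are not part of any API.

```python
import pickle


def to_datastore(value):
    """Hypothetical stand-in for get_value_for_datastore: dict -> bytes."""
    return pickle.dumps(value)


def from_datastore(raw):
    """Hypothetical stand-in for make_value_from_datastore: bytes -> dict."""
    if raw is None:
        # Nothing was stored yet, so fall back to an empty dict.
        return dict()
    return pickle.loads(raw)


stored = to_datastore({'a sub dictionary': {'a': 1, 'b': 2}})
loaded = from_datastore(stored)
missing = from_datastore(None)
```

The None branch is what lets an entity saved before the property existed still read back as an empty dict.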
jbochi
I did not realize it was so easy to create custom Properties. Thanks a mil, that's perfect!
willem
@willem - You're welcome ;)
jbochi
+1  A: 

I assume that whenever you need the dict, you need all of it at once? That is, you don't have to read individual values from inside the dict while it's in the datastore?

If so, you'll have to serialize, but don't have to use pickle; we use simplejson instead. Then retrieving is a simple matter of overriding toBasicType(), sort of like this:

class MyModel(db.Model):
    # define some properties, including "data", which is a TextProperty containing a biggish dict
    def toBasicType(self):
        return {'metadata': self.getMetadata(), 'data': simplejson.loads(self.data)}

Creation involves calling MyModel(...,simplejson.dumps(data),...).

If you're already pickling, that may be your best bet, but simplejson's working pretty well for us.
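
For comparison, here is a small sketch of the JSON round trip by itself. It uses the standard library json module as a stand-in, on the assumption that it is interchangeable with simplejson for plain dumps/loads calls. Note that the result is text, not binary, and that JSON object keys are always strings.

```python
import json  # stdlib module; simplejson offers the same dumps/loads calls

# Sample dictionary from the question, including a nested dictionary.
data = {
    'myKey': 100,
    'another': 'aha',
    'a sub dictionary': {'a': 1, 'b': 2},
}

# Serialize to a JSON string, suitable for a text-type property.
text = json.dumps(data)

# Parse it back into an equal dictionary.
restored = json.loads(text)

# Caveat: non-string keys are converted to strings on the way out.
int_keyed = json.loads(json.dumps({1: 'x'}))  # -> {'1': 'x'}
```

With string keys only, as in the question, the dictionary comes back unchanged.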

LH