Is there a datastore performance difference between adding dynamic properties to an Expando entity only when they are needed and the simpler (for me) approach of declaring every property I might need up front, even though most instances will leave them empty?

In my specific case, skipping the Expando class would mean carrying 5-8 ReferenceList properties as 'overhead' that stay empty on most entities.

A: 

Given that you can swap Model for Expando, I would say both are only client-side facades for datastore entities, which themselves have no fixed schema. So in the datastore there are neither Models nor Expandos, and the number of properties per entity matters only in the usual way: the bigger the entity, the longer it takes to transfer, and so on. But if I'm wrong here, please downvote and correct me :)

Tomasz Zielinski
I think you are right - the only difference between using fixed properties on db.Model and dynamic properties on db.Expando is the overhead resulting from unused fixed properties being serialized and stored.
David Underhill
A: 

There is a penalty for setting up all the possible properties you might need from the start.

If you use a regular db.Model, then every declared property is serialized whenever you put() the entity. This includes overhead for the name of the property as well as its value. The overhead is present even if the property is not required and its value is None (though a None value seems to result in a slightly smaller protobuf representation than an actual value).

On the other hand, if you use db.Expando and don't declare the properties which might not appear, then only the dynamic properties which are actually present on an entity will be serialized. Dynamic properties which are not present are not serialized at all, so they add no overhead. However, if you explicitly declare (fixed) properties on an expando model, then you will have exactly the same overhead as with a regular db.Model: there is no difference in the serialization of fixed properties between regular models and expando models.
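To make the difference concrete, here is a minimal pure-Python sketch of the idea. It is a toy model, not the real datastore protobuf encoding: the `encoded_size` function, the helper names, and the property names `notes` and `tags` are all hypothetical, and the byte counts it produces are illustrative only. It simply assumes that every written property costs at least the length of its name.

```python
# Toy model only: this is NOT the real datastore protobuf encoding.
def encoded_size(kind, properties):
    """Rough size in bytes: kind name plus, per property, its name and value."""
    size = len(kind.encode("utf-8"))
    for name, value in properties.items():
        size += len(name.encode("utf-8"))  # the property name is always written
        if value is not None:
            size += len(repr(value).encode("utf-8"))
    return size


def fixed_entity(declared, **values):
    """Model-style: every declared property is written, unset ones as None."""
    return {name: values.get(name) for name in declared}


def dynamic_entity(**values):
    """Expando-style: only properties that were actually assigned are written."""
    return dict(values)


DECLARED = ["value", "notes", "tags"]  # hypothetical property names

fixed = fixed_entity(DECLARED, value=5)
dynamic = dynamic_entity(value=5)

fixed_size = encoded_size("TestVModel", fixed)
dynamic_size = encoded_size("TestExpand", dynamic)
print("fixed=%dB dynamic=%dB" % (fixed_size, dynamic_size))
```

In this sketch the fixed-style entity is larger by exactly the names of its two unused properties, while the dynamic-style entity pays only for what was assigned. Real sizes depend on the actual encoding, so measure with the SDK if the numbers matter.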

In practice, I don't know if the overhead of using fixed properties will be enough to noticeably impact performance, but it will most certainly eat up a little more storage space and CPU time to serialize even empty fixed fields.

If I were you, I'd go with an expando model and dynamic properties.


Example app which demonstrates what I described above:

from google.appengine.ext import db, webapp
from google.appengine.ext.webapp.util import run_wsgi_app

# all model names equal length because they get serialized too
class EmptyModel(db.Model):
    pass

class TestVModel(db.Model):
    value = db.IntegerProperty(required=False)

class TestExpand(db.Expando):
    pass

class MainPage(webapp.RequestHandler):
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'

        # create an empty model, one with a prop = None, and one with a prop set
        tEmpty = EmptyModel()
        tNone = TestVModel()
        tVal = TestVModel(value=5)

        # do the same but using an expando model with a dynamic property
        eEmpty = TestExpand()
        eNone = TestExpand()
        eNone.value = None
        eVal = TestExpand()
        eVal.value = 5

        # determine the serialized size of each model (note: no keys assigned)
        fEncodedSz = lambda o : len(db.model_to_protobuf(o).Encode())
        szEmpty = fEncodedSz(tEmpty)
        szNone = fEncodedSz(tNone)
        szVal = fEncodedSz(tVal)
        szEEmpty = fEncodedSz(eEmpty)
        szENone = fEncodedSz(eNone)
        szEVal = fEncodedSz(eVal)

        # output the results
        self.response.out.write("Comparison of model sizes with fixed props with expando models\nwith dynamic props:\n\n")
        self.response.out.write("Model:   empty=>%dB  prop=None=>%dB  prop=Val=>%dB\n" %\
                                    (szEmpty, szNone, szVal))
        self.response.out.write("Expando: empty=>%dB  prop=None=>%dB  prop=Val=>%dB\n\n" %\
                                    (szEEmpty, szENone, szEVal))
        self.response.out.write("Note that the expando property which specifies *no* value for the\ndynamic property 'value' is smaller than if 'None' is assigned.")

application = webapp.WSGIApplication([('/', MainPage)])
def main(): run_wsgi_app(application)
if __name__ == '__main__': main()

Output:

Comparison of model sizes with fixed props with expando models
with dynamic props:

Model:   empty=>30B  prop=None=>43B  prop=Val=>45B
Expando: empty=>30B  prop=None=>43B  prop=Val=>45B

Note that the expando property which specifies *no* value for the
dynamic property 'value' is smaller than if 'None' is assigned.
David Underhill