Been trying to figure this out for a couple of hours now and have gotten nowhere.

from django.contrib.auth.models import User
from django.db import models

class other(models.Model):
    user = models.ForeignKey(User)


others = other.objects.all()
o = others[0]

At this point the ORM has not asked for the o.user object, but if I do ANYTHING that touches that object, it loads it from the database.

type(o.user)

will cause a load from the database.

What I want to understand is HOW they do this magic. What is the Pythonic pixie dust that makes it happen? Yes, I have looked at the source, and I'm stumped.

+1  A: 

This will not explain exactly how Django goes about it, but what you are seeing is lazy loading in action. Lazy loading is a well-known design pattern that defers the initialization of an object until the point it is needed. In your case, the query for the other rows does not run until others[0] is evaluated, and the related User is not fetched until o.user is accessed, as in type(o.user). This Wikipedia article may give you some insights into the process.

Manoj Govindan
Yeah, I understand the why; I was looking for an explanation of how they pulled this off.
Mark0978
+8  A: 

Django uses a metaclass (django.db.models.base.ModelBase) to customize the creation of model classes. For each object defined as a class attribute on the model (user is the one we care about here), Django first looks to see if it defines a contribute_to_class method. If the method is defined, Django calls it, allowing the object to customize the model class as it's being created. If the object doesn't define contribute_to_class, it is simply assigned to the class using setattr.
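The dispatch described above can be sketched in plain Python. This is a minimal illustration, not Django's actual code: SketchModelBase, SketchField and SketchModel are hypothetical names, and the real ModelBase does considerably more. It uses Python 3 metaclass syntax.

```python
class SketchModelBase(type):
    """Hypothetical stand-in for ModelBase (an assumption, not Django's code)."""

    def __new__(mcs, name, bases, attrs):
        # Build the class with only the dunder entries (e.g. __module__) first.
        plain = {k: v for k, v in attrs.items() if k.startswith("__")}
        cls = super().__new__(mcs, name, bases, plain)
        for attr_name, value in attrs.items():
            if attr_name in plain:
                continue
            if hasattr(value, "contribute_to_class"):
                # The object decides what ends up on the class.
                value.contribute_to_class(cls, attr_name)
            else:
                # Ordinary attributes are assigned unchanged.
                setattr(cls, attr_name, value)
        return cls


class SketchField:
    """Hypothetical field that installs something other than itself."""

    def contribute_to_class(self, cls, name):
        setattr(cls, name, "installed by contribute_to_class: %s" % name)


class SketchModel(metaclass=SketchModelBase):
    user = SketchField()   # handled via contribute_to_class
    table = "sketch"       # handled via plain setattr
```

After the class body runs, SketchModel.user is whatever the field chose to install rather than the SketchField instance itself, which is the hook a field needs in order to put a descriptor on the class.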

Since ForeignKey is a Django model field, it defines contribute_to_class. When the ModelBase metaclass calls ForeignKey.contribute_to_class, the value assigned to ModelClass.user is an instance of django.db.models.fields.related.ReverseSingleRelatedObjectDescriptor.

ReverseSingleRelatedObjectDescriptor is an object that implements Python's descriptor protocol in order to customize what happens when an instance of the class is accessed as an attribute of another class. In this case, the descriptor is used to lazily load and return the related model instance from the database the first time it is accessed.

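# make a user and an instance of our model, then re-fetch the instance
# (passing user= to the constructor would populate the cache right away)
>>> user = User.objects.create(username="example")
>>> pk = MyModel.objects.create(user=user).pk
>>> my_instance = MyModel.objects.get(pk=pk)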

# user is a ReverseSingleRelatedObjectDescriptor
>>> MyModel.user
<django.db.models.fields.related.ReverseSingleRelatedObjectDescriptor object>

# user hasn't been loaded, yet
>>> my_instance._user_cache
AttributeError: 'MyModel' object has no attribute '_user_cache'

# ReverseSingleRelatedObjectDescriptor.__get__ loads the user
>>> my_instance.user
<User: example>

# now the user is cached and won't be looked up again
>>> my_instance._user_cache
<User: example>

The ReverseSingleRelatedObjectDescriptor.__get__ method is called every time the user attribute is accessed on the model instance, but it's smart enough to only look up the related object once and then return a cached version on subsequent calls.
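For readers who want to see the load-once-then-cache behaviour without Django, here is a rough pure-Python sketch of a descriptor that works the same way. LazyRelatedDescriptor, fake_load_user and Other are hypothetical names, the database query is replaced by a function that records its calls, and only the read path (__get__) is shown.

```python
class LazyRelatedDescriptor:
    """Hypothetical stand-in for the descriptor; loader replaces the DB query."""

    def __init__(self, name, loader):
        self.cache_name = "_%s_cache" % name   # e.g. _user_cache
        self.id_name = "%s_id" % name          # e.g. user_id
        self.loader = loader

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not on an instance
        try:
            # Already loaded: return the cached object.
            return instance.__dict__[self.cache_name]
        except KeyError:
            # First access: run the (simulated) query and cache the result.
            related = self.loader(getattr(instance, self.id_name))
            instance.__dict__[self.cache_name] = related
            return related


queries = []  # records every simulated database hit


def fake_load_user(user_id):
    queries.append(user_id)
    return {"id": user_id, "username": "example"}


class Other:
    user = LazyRelatedDescriptor("user", fake_load_user)

    def __init__(self, user_id):
        self.user_id = user_id


o = Other(user_id=1)   # nothing loaded yet
first = o.user         # triggers the simulated query
second = o.user        # served from o._user_cache
```

Because type(o.user) has to evaluate o.user first, it goes through __get__ like any other access, which is why even an innocent-looking expression triggers the load.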

jpwatts
+1. I like the explanation.
Manoj Govindan
Thanks, this is exactly what I wanted to learn.
Mark0978
A: 

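Properties can be used to implement this behaviour. Conceptually, your class definition behaves like the following (a simplified illustration, not the code Django actually generates):

class other(models.Model):
    # user_id is the column that stores the related User's primary key

    def _get_user(self):
        # o.user is being accessed: query the database now
        return User.objects.get(id=self.user_id)

    def _set_user(self, v):
        # remember only the primary key of the assigned User
        self.user_id = v.id

    user = property(_get_user, _set_user)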

The query on User will not be performed until you access the .user of an 'other' instance.

Ivo van der Wijk