Consider the following skeleton of a models.py for a space conquest game:

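from django.db import models

class Fleet(models.Model):
    # related_name must not clash with Planet's, so Fleet uses 'fleet_set'
    game = models.ForeignKey(Game, related_name='fleet_set')
    owner = models.ForeignKey(User, related_name='fleet_set', null=True, blank=True)
    # Planet is defined further down, so it is referenced by name here
    home = models.ForeignKey('Planet', related_name='departing_fleet_set')
    dest = models.ForeignKey('Planet', related_name='arriving_fleet_set')
    ships = models.IntegerField()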

class Planet(models.Model):
    game = models.ForeignKey(Game, related_name='planet_set')
    owner = models.ForeignKey(User, related_name='planet_set', null=True, blank=True)
    name = models.CharField(max_length=250)
    ships = models.IntegerField()

I have many such data models for a project I'm working on, and I change the state of the game based on somewhat complicated interactions between various data objects. I want to avoid lots of unnecessary calls to the database, so once per turn, I do something like

  1. Query all the fleets, planets, and other objects from the database and cache them as Python objects
  2. Process the game objects, resolving the state of the game
  3. Save them back in the database

This model seems to totally break down when using ForeignKey objects. For example, when a new fleet departs a planet, I have a line that looks something like this:

fleet.home.ships -= fleet.ships

After this line runs, I have other code that alters the number of ships at each of the planets, including the planet fleet.home. Unfortunately, the changes made in the above line are not reflected in the QuerySet of planets that I obtained earlier, so that when I save all the planets at the end of the turn, the changes to fleet.home's ships get overwritten.

Is there some better way of handling this situation? Or is this just how all ORMs are?

+9  A: 

Django's ORM does not implement an identity map (it's in the ticket tracker, but it isn't clear if or when it will be implemented; at least one core Django committer has expressed opposition to it). This means that if you arrive at the same database object through two different query paths, you are working with different Python objects in memory.
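To make the difference concrete, here is a minimal sketch in plain Python, with no Django involved. PlanetRow, load_without_identity_map, and IdentityMap are hypothetical names invented for this illustration, and the dict called store stands in for the database table. It is not how any particular ORM implements the pattern, only the general idea.

```python
from dataclasses import dataclass


@dataclass
class PlanetRow:
    """Hypothetical stand-in for a loaded planet row (not a Django model)."""
    pk: int
    ships: int


def load_without_identity_map(store, pk):
    # Every load builds a fresh object, so two loads of the same row
    # produce two independent Python objects.
    return PlanetRow(pk=pk, **store[pk])


class IdentityMap:
    """Hands out one shared object per primary key."""

    def __init__(self, store):
        self._store = store
        self._loaded = {}

    def get(self, pk):
        # Build the object only on first access, then reuse it.
        if pk not in self._loaded:
            self._loaded[pk] = PlanetRow(pk=pk, **self._store[pk])
        return self._loaded[pk]


store = {1: {"ships": 10}}  # pretend database table

# Without an identity map: the change made through `a` is invisible via `b`.
a = load_without_identity_map(store, 1)
b = load_without_identity_map(store, 1)
a.ships -= 3

# With an identity map: `c` and `d` are the same object.
imap = IdentityMap(store)
c = imap.get(1)
d = imap.get(1)
c.ships -= 3
```

In the first half, saving b after a would write back the stale ship count, which is the overwriting problem described in the question.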

This means that your design (load everything into memory at once, modify a lot of things, then save it all back at the end) is unworkable using the Django ORM: first, because it will often waste lots of memory loading in duplicate copies of the same object, and second, because of "overwriting" issues like the one you're running into.

You either need to rework your design to avoid these issues or use an alternative Python ORM that implements an identity map; SQLAlchemy is one option. Reworking the design means one of two things: work with only one QuerySet at a time, saving anything modified before you make another query; or, if you load several queries, look up all relations manually and never traverse ForeignKeys through their convenient attributes.
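As a rough sketch of the "look up all relations manually" option: index the planets you already loaded by primary key, and resolve each fleet's home through the raw foreign key value instead of through fleet.home. The classes below are plain-Python stand-ins so the example runs on its own, and launch_fleets is a hypothetical helper. With real Django models, the raw value of the home ForeignKey is available as fleet.home_id, and reading it does not trigger a query.

```python
from dataclasses import dataclass


@dataclass
class Planet:
    """Stand-in for the Planet model (illustration only)."""
    id: int
    ships: int


@dataclass
class Fleet:
    """Stand-in for the Fleet model; home_id mirrors Django's raw FK attribute."""
    id: int
    home_id: int
    ships: int


def launch_fleets(planets, fleets):
    """Subtract each fleet's ships from its home planet.

    Every planet is resolved through one dict keyed by primary key, so all
    changes land on the same objects that will be saved at the end of the turn.
    """
    planets_by_id = {planet.id: planet for planet in planets}
    for fleet in fleets:
        home = planets_by_id[fleet.home_id]  # never fleet.home
        home.ships -= fleet.ships
    return planets_by_id


planets = [Planet(id=1, ships=10), Planet(id=2, ships=5)]
fleets = [Fleet(id=1, home_id=1, ships=4), Fleet(id=2, home_id=1, ships=3)]
planets_by_id = launch_fleets(planets, fleets)
```

Because both fleets depart from planet 1, both subtractions hit the same Planet object, and nothing is lost when the planets list is saved afterwards.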

Note that this doesn't mean Django's ORM is "bad." It's optimized for the case of web applications, where these kinds of issues are rare (I've done web development with Django for years and never once had this problem on a real project). If your use case is different, you may want to choose a different ORM.

Carl Meyer
Nicely summarized
Jarret Hardie
Yes, thank you very much for this informative response. I know that it doesn't mean that Django's ORM is bad; in fact, I've developed an entire project using Django that doesn't require this sort of complicated data processing, and this issue has never come up, which is why I was at such a loss.
Bobby Moretti
+1  A: 

This is perhaps what you are looking for:

http://simonwillison.net/2009/May/7/mmalones/

Ólafur Nielsen