Since Django's model save() methods are not lazy, and since keeping transactions short is generally good practice, should saves preferably be deferred to the end of transaction blocks?

For example, would code sample B hold a transaction open for less time than code sample A below?

Code sample A:

from django.db import transaction
from my_app.models import MyModel

@transaction.commit_on_success
def model_altering_method():
    for inst in MyModel.objects.all()[0:5000]:
        inst.name = 'Joel Spolsky'
        # Some model-independent, time-consuming operations...
        inst.save()

Code sample B:

from django.db import transaction
from my_app.models import MyModel

@transaction.commit_on_success
def model_altering_method():
    instances_to_save = []
    for inst in MyModel.objects.all()[0:5000]:
        inst.name = 'Joel Spolsky'
        # Some model-independent, time-consuming operations...
        instances_to_save.append(inst)

    for inst in instances_to_save:
        inst.save()
A: 

I'm not sure, but here is my theory: I would think that your commit_on_success decorator begins a new transaction when the function is entered, rather than a transaction springing into existence on your first save. If so, code sample B would keep the transaction open longer, since it has to loop through the list of models twice.

Again, that's just a theory. When the actual transaction starts could also depend on which DBMS you're using (another theory).
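
One way to see that the start of a transaction can depend on the database layer is to probe a connection directly. The following is only a rough sketch that uses Python's standard sqlite3 module with its default (legacy) transaction handling, not Django or its decorators; the table name `person` and the helper `transaction_states` are made up for the illustration, and other drivers and backends may well behave differently.

```python
import sqlite3


def transaction_states():
    """Record whether a transaction is open at three points in time."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT)")
    conn.commit()  # start from a clean state with no open transaction

    states = []

    # A read on its own: the sqlite3 module does not open a transaction here.
    conn.execute("SELECT name FROM person").fetchall()
    states.append(conn.in_transaction)

    # The first write: the module implicitly begins a transaction.
    conn.execute("INSERT INTO person VALUES ('Joel Spolsky')")
    states.append(conn.in_transaction)

    # The explicit commit closes it again.
    conn.commit()
    states.append(conn.in_transaction)

    conn.close()
    return states


print(transaction_states())
```

With this particular driver the transaction is opened by the first write rather than by the first query, which is exactly the kind of backend-specific detail that would have to be checked for the DBMS and Django version actually in use.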

Matthew J Morrison
In many cases, preparing the data for a transaction takes a long while. If, as you say, the transaction is opened as soon as the decorated block starts, then there's a motivation to separate preparation blocks from transaction blocks... If this is the case, then it should be documented in the transactions section of the Django documentation.
Jonathan
A: 

Django’s default behavior is to run with an open transaction, which it commits automatically when any built-in, data-altering model function is called. In the case of the commit_on_success or commit_manually decorators, Django does not commit upon save(); it commits when the function completes successfully or when transaction.commit() is called, respectively.

Therefore, the elegant approach would be to separate the transaction-handling code from other time-consuming code if possible:

from django.db import transaction
from my_app.models import MyModel

@transaction.commit_on_success
def do_transaction(instances_to_save):
    for inst in instances_to_save:
        inst.save()

def model_altering_method():
    instances_to_save = []
    for inst in MyModel.objects.all()[0:5000]:
        inst.name = 'Joel Spolsky'
        # Some model-independent, time-consuming operations...
        instances_to_save.append(inst)
    do_transaction(instances_to_save)

If this is impossible design-wise, e.g. you need instance.id, which for new instances is available only after the first save(), try breaking up your flow into reasonably sized work units so as not to keep the transaction open for many minutes.
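
Splitting the work into units could look roughly like the sketch below. It is plain Python and makes no Django calls; the helper names `chunked` and `save_in_batches` and the default batch size of 500 are assumptions chosen for illustration, and `save_batch` stands for any callable that saves one batch inside its own transaction.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def save_in_batches(instances, save_batch, size=500):
    """Call `save_batch` once per batch and return the number of batches."""
    count = 0
    for batch in chunked(instances, size):
        # Each call is meant to be one short transaction of its own.
        save_batch(batch)
        count += 1
    return count
```

With the code above, something like save_in_batches(instances_to_save, do_transaction) would commit once per batch instead of once for all 5000 instances. The trade-off is that a failure in a later batch no longer rolls back the earlier ones.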

Also note that long transactions are not always a bad thing. If your application is the only entity modifying the db, they could actually be OK. You should, however, check your db's specific configuration for its time limit on transactions (or on idle transactions).

Jonathan