I'm using Django with an SQLite backend, and write performance is a problem. I may graduate to a "proper" db at some stage, but for the moment I'm stuck with SQLite. I think my write performance problems are probably related to the fact that I'm creating a large number of rows, and presumably each time I save() one it's locking, unlocking and syncing the DB on disk.

How can I aggregate a large number of save() calls into a single database operation?

+1  A: 

"How can I aggregate a large number of save() calls into a single database operation?"

You don't need to. Django already manages a cache for you, and you can't improve its DB caching by fussing around with individual saves.

"write performance problems are probably related to the fact that I'm creating a large number of rows"

Correct.

SQLite is pretty slow at writing. That's the way it is. Queries are faster than in most other DBs; writes are pretty slow.

Consider a more serious architecture change. Are you loading rows during a web transaction (i.e., bulk uploading files and loading the DB from those files)?

If you're doing bulk loading inside a web transaction, stop. You need to do something smarter. Use Celery or some other "batch" facility to do your loads in the background.

We try to limit ourselves to file validation in the web transaction and do the loads when the user isn't waiting for their page of HTML (see the sketch below).
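For instance, a minimal sketch of handing the load off to a background worker, assuming Celery is already configured for the project; the bulk_load task, the validated_path argument and the Record model are hypothetical names, not part of the original answer:

    from celery import shared_task

    @shared_task
    def bulk_load(validated_path):
        # Runs in a worker process, outside the request/response cycle,
        # so the user gets their page back while the rows are created.
        import csv
        from myapp.models import Record  # hypothetical model
        with open(validated_path) as f:
            for row in csv.DictReader(f):
                Record.objects.create(**row)

    # In the view, after validating the upload:
    #     bulk_load.delay(saved_path)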

S.Lott
Although it's fair to point out that SQLite is indeed inherently slow on writes due to the way it uses the disk, it doesn't have to be *this* slow: as suggested by a commenter, I started using the @commit_manually decorator and found a really substantial improvement.
kdt
+2  A: 

Actually this is easier to do than you think: you can use transactions in Django. These batch database operations (specifically save, insert and delete) into one operation. I've found the easiest one to use is commit_on_success. Essentially you wrap your database save operations in a function and then apply the commit_on_success decorator.

from django.db.transaction import commit_on_success

@commit_on_success
def lot_of_saves(queryset):
    # Every save() below runs in a single transaction, committed when the
    # function returns normally (or rolled back if it raises).
    for item in queryset:
        modify_item(item)
        item.save()

This will give a huge speed increase. You also get the benefit of rollbacks if any of the items fail. If you have millions of save operations, you may have to commit them in blocks using commit_manually and transaction.commit(), but I've rarely needed that (a sketch of that blocked approach follows).
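A minimal sketch of committing in blocks with the same old-style transaction API; the function name and the block size of 1000 are arbitrary assumptions, not from the original answer:

    from django.db import transaction

    @transaction.commit_manually
    def lots_of_saves_in_blocks(items, block_size=1000):
        # With commit_manually, nothing is written until we call commit(),
        # and we must leave the function in a committed or rolled-back state.
        try:
            for i, item in enumerate(items, 1):
                item.save()
                if i % block_size == 0:
                    transaction.commit()  # flush this block to disk
            transaction.commit()          # commit the final partial block
        except Exception:
            transaction.rollback()
            raise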

Hope that helps,

Will

JudoWill