ansaurus

Question

Doing an atomic update of the first instance in a QuerySet

Answer 1

A:

You have two choices off the top of my head. One is to lock the rows immediately upon retrieval and release the lock only once the chosen one has been marked as in use. The problem here is that, while the lock is held, no other client process can even look at the jobs that weren't selected. If you're always just automatically selecting the last one, then the window may be brief enough to be OK for you.
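A minimal sketch of this first option, using Python's built-in sqlite3 module only so that it runs without a database server. The `jobs` table, its columns and the `claim_oldest` helper are made up for illustration. Note that SQLite's `BEGIN IMMEDIATE` takes a write lock on the whole database rather than on individual rows; on PostgreSQL the closer equivalent would be `SELECT ... FOR UPDATE` inside a transaction.

```python
import sqlite3

def claim_oldest(conn):
    """Lock, pick the oldest open job, mark it in use, then release the lock."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id FROM jobs WHERE status = 0 ORDER BY created, id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("UPDATE jobs SET status = 1 WHERE id = ?", (row[0],))
        conn.execute("COMMIT")  # releases the lock
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return None if row is None else row[0]

# isolation_level=None: we issue BEGIN/COMMIT ourselves (hypothetical schema)
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, created INTEGER, status INTEGER)"
)
conn.executemany("INSERT INTO jobs VALUES (?, ?, 0)", [(1, 30), (2, 10), (3, 20)])
first = claim_oldest(conn)
second = claim_oldest(conn)
```

Because the select and the update happen under the same lock, two callers cannot both pick the same row; the cost is that everyone else waits while the lock is held.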

The other option would be to bring back the rows that are open at the time of the query, but then check again whenever a client tries to grab a job to work on. When a client attempts to update a job to claim it, a check would first be done to see whether it's still available. If someone else has already grabbed it, a notification would be sent back to the client. This lets all of the clients see all of the jobs as snapshots, but if they are all constantly grabbing the latest one, then the clients may constantly receive notifications that a job is already in use. Maybe this is the race condition to which you're referring?
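This check-on-claim idea can be sketched as a single conditional UPDATE, again with the built-in sqlite3 module so it is runnable as-is. The `jobs` table and the `try_claim` helper are assumptions for illustration, not part of the original system.

```python
import sqlite3

def try_claim(conn, job_id):
    """Return True if this caller got the job, False if it was already taken."""
    cur = conn.execute(
        # The status = 0 condition makes "check and mark" a single statement.
        "UPDATE jobs SET status = 1 WHERE id = ? AND status = 0", (job_id,)
    )
    conn.commit()
    return cur.rowcount == 1  # zero rows changed means someone else got there first

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status INTEGER)")
conn.execute("INSERT INTO jobs VALUES (7, 0)")
conn.commit()
won = try_claim(conn, 7)
lost = try_claim(conn, 7)
```

In Django, the same pattern would presumably look like `Job.objects.filter(pk=job_id, status=0).update(status=1)` for a hypothetical `Job` model; `update()` returns the number of rows it matched, so a return value of 0 means the job was already taken.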

One way to get around that would be to return the jobs to the clients in specific groups, so that they are not always getting the same lists. You could break them down by geographic area, or even just randomly. For example, each client could have an ID of 0 to 9. Take the job ID mod 10 and send each client the jobs whose last digit matches its ID. Don't limit it to just those jobs though, as you don't want there to be jobs that no one can reach. For example, if you had clients 1, 2, and 3 and a job 104, then no one would be able to get to it. So, once there aren't enough jobs with the matching last digit, jobs with other digits would start coming back just to fill the list. You might need to play around with the exact algorithm here, but hopefully this gives you an idea.
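One possible reading of that grouping scheme, as a small pure-Python sketch. The function name, the `limit` parameter and the sample job IDs are made up for illustration, and the fill order is just one choice among many.

```python
def jobs_for_client(job_ids, client_id, n_clients=10, limit=5):
    """Prefer jobs whose id % n_clients == client_id, then pad with the rest."""
    own = [j for j in job_ids if j % n_clients == client_id]
    others = [j for j in job_ids if j % n_clients != client_id]
    # Padding with other jobs keeps every job reachable by some client.
    return (own + others)[:limit]

open_jobs = [101, 102, 103, 104, 112, 121]
batch = jobs_for_client(open_jobs, client_id=2, limit=3)
```

Here client 2 is offered its own jobs (102 and 112) first and one other job to fill the list. Grouping only reduces contention; the claim itself still needs one of the two approaches above.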

How you lock the rows in your database in order to update them and/or send back the notifications will largely depend on your RDBMS. In MS SQL Server you could wrap all of that work nicely in a stored procedure as long as user intervention isn't needed in the middle of it.

I hope this helps.

Tom H. 2010-01-04 15:30:44

The way we did it in the old system was to lock approx. 20 rows matching status='0', select the oldest, and so on, but this introduced some rather odd race conditions, and I've been tasked with eliminating them. Plus, I would really like to do this in a Django-ish manner. We are using PostgreSQL as the DB backend, and if we have to go to custom SQL, so be it, but is there a way to do what I want while still guaranteeing atomicity? We have the work split into categories, meaning only a subset of the workers are accessing any 'set' of 'ToDo' results at any one time.

Paddie 2010-01-04 15:43:21

Answer 2

A:

To merge the patch from ticket #2705 into your Django checkout, you need to download it first:

cd <django-dir>
wget -O for_update_11366_cdestigter.diff 'http://code.djangoproject.com/attachment/ticket/2705/for_update_11366_cdestigter.diff?format=raw'

then rewind svn to the Django revision the patch was made against:

svn update -r11366

then apply it:

patch -p1 < for_update_11366_cdestigter.diff

It will tell you which files were patched successfully and which were not. In the unlikely case of conflicts, you can fix them manually by looking at http://code.djangoproject.com/attachment/ticket/2705/for_update_11366_cdestigter.diff

To unapply the patch, just run

svn revert --recursive .

Antony Hatchkins 2010-01-04 15:38:02

Thank you Antony, this is a great help. I'll try this right away.

Paddie 2010-01-04 16:14:37

Answer 3

A:

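If your Django is running on one machine, in a single process, there is a much simpler way to do it. (A threading.Lock is not shared between processes, so this does not help with several worker processes or servers.) Excuse the pseudo-code, as the details of your implementation aren't clear.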

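from threading import Lock

# Module-level lock shared by all threads in this Python process
workers_lock = Lock()

def get_work(request):
    # Hold the lock while picking and marking the item, so that no other
    # thread can claim the same one; "with" releases it even on an exception.
    with workers_lock:
        # Imagine this method exists for brevity
        work_item = WorkItem.get_oldest()
        work_item.result_status = 1
        work_item.save()

    return work_item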

kibitzer 2010-01-04 16:56:12

Ha, that is thinking outside the box! Love it! :)

Paddie 2010-01-04 20:49:04
