views:

265

answers:

3

Hi i'm coding a python script that will create a number child processes that fetch and execute tasks from the database.

The tasks are inserted on the database by a php website running on the same machine.

What's the good (need this to be fast) way to select and update those tasks as "in progress" to avoid to be selected by multiple times by the python scripts

edit: database is mysql

Thanks in advance

+1  A: 

Without knowing more about your architecture, I suggest the following method.

1) Lock Process table
2) Select ... from Process table where State="New"
3) processlist = [list of process id''s from step 2]
4) Update Process table set State="In progress" where ProcessId in [processlist]
5) Unlock Process table.
dar7yl
The process is very similar to that, but each python script fetch 1 task at a time.I thought on locking the table, but it was looking for a faster method.
arthurprs
put LIMIT 1 in your Select (step 2)I Can't see a way around the lock.
dar7yl
+1  A: 

A way to speed things up is to put the process into a stored procedure, and return the selected row from that procedure. That way, only one trip to the db server.

dar7yl
+1  A: 

Use an InnoDB table Tasks, then:

select TaskId, ... from Tasks where State="New" limit 1;
update Tasks set State="In Progress" where TaskId=<from above> and State="New";

if the update succeeds, you can work on the task. Otherwise, try again.

You'll want an index on TaskId and State.

Keith Randall
Great, this way i can do insertions at the same time? Maybe i can adapt this code into a stored procedure?
arthurprs
You can do insertions of new Tasks rows concurrently, yes. And I bet you could make my code a stored procedure so you only need one round trip to the database (my code takes 2+).
Keith Randall
I wonder what the trade-offs are between lock/select/update/unlock and select/update with multiple keys.
dar7yl