
We have a table called worktable with the columns key (primary key), ptime, aname, status, and content.

A producer inserts rows into this table, and a consumer does an order-by on the key column and fetches the first row whose status is 'pending'. The consumer then handles that row as follows:

  1. updates status to 'processing'
  2. does some processing using content
  3. deletes the row
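The three-step cycle above can be sketched as follows. This is a minimal illustration using an in-memory SQLite database as a stand-in for the real database (the question does not name one); the table and column names follow the question, and the payload and processing step are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE worktable (
    key INTEGER PRIMARY KEY, ptime TEXT, aname TEXT,
    status TEXT, content TEXT)""")
conn.execute(
    "INSERT INTO worktable VALUES (1, '2010-05-01', 'job-a', 'pending', 'payload')")

# 1. fetch the first pending row (ordered by key) and mark it 'processing'
key, content = conn.execute(
    "SELECT key, content FROM worktable "
    "WHERE status = 'pending' ORDER BY key LIMIT 1").fetchone()
conn.execute("UPDATE worktable SET status = 'processing' WHERE key = ?", (key,))

# 2. do some processing using content (placeholder transformation)
result = content.upper()

# 3. delete the row once processing is done
conn.execute("DELETE FROM worktable WHERE key = ?", (key,))
conn.commit()
```

On Oracle specifically, the usual way to let several consumers claim rows without blocking each other at step 1 is `SELECT ... FOR UPDATE SKIP LOCKED`, which skips rows already locked by another session (availability depends on your Oracle version).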

We face contention issues when we run multiple consumers (probably due to the order-by, which does a full table scan).

Using advanced queues would be our next step, but before we go there we want to check the maximum throughput we can achieve with multiple consumers and a producer on the table.

What optimizations can we make to get the best numbers possible? Could a consumer do in-memory processing, fetching 1000 rows at a time, processing them, and then deleting them? Would that improve throughput? What other possibilities are there: partitioning the table, parallelization, index-organized tables?
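The batched variant asked about here can be sketched the same way: claim a whole batch with one query, process it in memory, and delete it with a single statement, so the per-row round trips disappear. Again this uses SQLite as a stand-in, and the batch size and payloads are illustrative.

```python
import sqlite3

BATCH_SIZE = 1000  # batch size from the question; tune for your workload

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE worktable (
    key INTEGER PRIMARY KEY, ptime TEXT, aname TEXT,
    status TEXT, content TEXT)""")
conn.executemany(
    "INSERT INTO worktable VALUES (?, '2010-05-01', 'job', 'pending', ?)",
    [(i, f"payload-{i}") for i in range(5)])

# Claim a whole batch with one query instead of one row per round trip.
batch = conn.execute(
    "SELECT key, content FROM worktable WHERE status = 'pending' "
    "ORDER BY key LIMIT ?", (BATCH_SIZE,)).fetchall()
keys = [k for k, _ in batch]
placeholders = ",".join("?" * len(keys))
conn.execute(
    f"UPDATE worktable SET status = 'processing' WHERE key IN ({placeholders})",
    keys)

# Process the whole batch in memory, then delete it in one statement.
results = [content.upper() for _, content in batch]
conn.execute(f"DELETE FROM worktable WHERE key IN ({placeholders})", keys)
conn.commit()
```

Whether this actually improves throughput depends on where the time goes: it helps when per-row statement overhead dominates, but a large claimed batch also holds its rows longer, so a consumer crash strands more work.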

+1  A: 

The possible optimizations depend a lot on the database used, but a pretty general approach would be to create an index that covers all fields needed to select the correct rows (it sounds like that would be the key and the status in this case). If the index is created correctly (some databases require the key columns in a particular order, others don't), then the query should be much faster.
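This suggestion can be made concrete: a composite index with the filter column first (status) and the sort column second (key) lets the database locate the pending rows and return them already ordered by key, avoiding the full scan. A minimal SQLite sketch (the index name is made up) that also checks the query plan:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE worktable (
    key INTEGER PRIMARY KEY, ptime TEXT, aname TEXT,
    status TEXT, content TEXT)""")
# Filter column first (status), sort column second (key), so one index
# satisfies both the WHERE clause and the ORDER BY.
conn.execute("CREATE INDEX idx_status_key ON worktable (status, key)")
conn.executemany(
    "INSERT INTO worktable VALUES (?, '2010-05-01', 'job', ?, 'x')",
    [(1, 'done'), (2, 'pending'), (3, 'pending')])

first_pending = conn.execute(
    "SELECT key FROM worktable WHERE status = 'pending' "
    "ORDER BY key LIMIT 1").fetchone()[0]

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT key FROM worktable "
    "WHERE status = 'pending' ORDER BY key LIMIT 1").fetchall()
# The plan should report a search using idx_status_key rather than a
# full table scan.
```

The equivalent check on Oracle would be an execution plan (e.g. via EXPLAIN PLAN) showing an index range scan on the composite index instead of a full table scan.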

Joachim Sauer