ansaurus

Question

Is there a standard pattern for scanning a job table and executing some actions?

Answer 1

+1 A:

In order to scale, you might want to consider scanning for jobs that are ready then adding them to a message queue. This way multiple consumers can read ready jobs off the queue. Marking jobs as "in progress" could be as simple as putting that value in the Completed column, or you could add a TimeStarted column and have a pre-determined timeout period before a job will be reset and be eligible for another worker thread to process. (The latter approach assumes the processing failed if the time elapses without the job completing. Failing after some number of attempts should call for manual inspection of that job.) The same daemon process that scans the database for ready jobs to add to the queue can look for jobs that have timed out.
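The timeout reset described above could be sketched roughly as follows. This is only an illustration using SQLite from Python, not the answerer's code: the `jobs` table layout, the `attempts` column, the status strings, and the `TIMEOUT_SECONDS` and `MAX_ATTEMPTS` values are all assumptions made for the example.

```python
import sqlite3

TIMEOUT_SECONDS = 300   # assumed: how long a job may run before it counts as stuck
MAX_ATTEMPTS = 3        # assumed: attempts allowed before manual inspection


def create_jobs_table(conn):
    # completed holds 'No', 'InProgress', 'Yes' or 'Failed' (assumed values).
    conn.execute(
        "CREATE TABLE jobs ("
        " id INTEGER PRIMARY KEY,"
        " completed TEXT NOT NULL DEFAULT 'No',"
        " time_started REAL,"
        " attempts INTEGER NOT NULL DEFAULT 0)"
    )


def start_job(conn, job_id, now):
    # Mark a job as in progress and record when this attempt began.
    conn.execute(
        "UPDATE jobs SET completed = 'InProgress', time_started = ?,"
        " attempts = attempts + 1 WHERE id = ?",
        (now, job_id),
    )


def reset_timed_out_jobs(conn, now):
    """Make stuck jobs eligible again; return how many were reset."""
    cutoff = now - TIMEOUT_SECONDS
    with conn:  # both updates commit together
        # Jobs that timed out on their last allowed attempt need a human.
        conn.execute(
            "UPDATE jobs SET completed = 'Failed'"
            " WHERE completed = 'InProgress' AND time_started < ?"
            " AND attempts >= ?",
            (cutoff, MAX_ATTEMPTS),
        )
        # Everything else that timed out goes back to the ready state.
        cur = conn.execute(
            "UPDATE jobs SET completed = 'No', time_started = NULL"
            " WHERE completed = 'InProgress' AND time_started < ?",
            (cutoff,),
        )
        return cur.rowcount
```

The scanning daemon would call `reset_timed_out_jobs` on each pass, before looking for ready jobs to enqueue.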

Bill the Lizard 2010-03-13 01:46:52

@Bill - How would I determine which records a given consumer reads? If I want to support multiple consumers then I probably wouldn't want each of them to grab *all* non-completed records since this would essentially limit the thing to one consumer. In other words, if a given consumer did a "SELECT ID from table WHERE Completed = No" and then immediately set those records to "in progress", there'd be no records for another consumer to grab.

Howiecamp 2010-03-13 02:55:36

@Bill - maybe I'd just have the consumer pick a "reasonable" number of records per run?
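If the consumers do read the table directly, claiming a limited batch might look like the sketch below. It assumes SQLite, a hypothetical `claim_token` column, and made-up status strings; the point is only that one `UPDATE` statement both selects and marks the rows, so two consumers cannot claim the same job.

```python
import sqlite3
import uuid


def make_demo_db(job_count):
    # Hypothetical table layout, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs ("
        " id INTEGER PRIMARY KEY,"
        " completed TEXT NOT NULL DEFAULT 'No',"
        " claim_token TEXT)"
    )
    conn.executemany(
        "INSERT INTO jobs (completed) VALUES (?)", [("No",)] * job_count
    )
    conn.commit()
    return conn


def claim_batch(conn, batch_size):
    """Claim up to batch_size ready jobs and return their ids."""
    token = uuid.uuid4().hex  # identifies the rows claimed by this call
    with conn:
        # Selecting and marking happen in a single statement.
        conn.execute(
            "UPDATE jobs SET completed = 'InProgress', claim_token = ?"
            " WHERE id IN (SELECT id FROM jobs WHERE completed = 'No'"
            " ORDER BY id LIMIT ?)",
            (token, batch_size),
        )
    rows = conn.execute(
        "SELECT id FROM jobs WHERE claim_token = ? ORDER BY id", (token,)
    ).fetchall()
    return [row[0] for row in rows]
```

Each consumer calls `claim_batch` with whatever batch size it considers reasonable, and an empty list means there is nothing left to claim.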

Howiecamp 2010-03-13 03:02:49

From the point of view of the queue, the daemon thread would be a producer. Its job would be to periodically check the DB for jobs ready to go and add them all to the queue. Then you have multiple consumers who check the queue and each grab one job at a time to process.
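As a rough in-process illustration of that producer and consumer split, here is a sketch using Python's standard `queue` and `threading` modules. A real deployment would use a message queue shared between processes; the function name `run_workers` and the list of job ids standing in for the database scan are assumptions for the example.

```python
import queue
import threading


def run_workers(ready_job_ids, worker_count=3):
    """Feed job ids through a queue to several consumers.

    Returns a list of (worker_name, job_id) pairs.
    """
    job_queue = queue.Queue()
    processed = []
    processed_lock = threading.Lock()
    stop = object()  # sentinel telling a consumer to exit

    def producer():
        # Stands in for the daemon's periodic database scan.
        for job_id in ready_job_ids:
            job_queue.put(job_id)
        for _ in range(worker_count):
            job_queue.put(stop)

    def consumer(name):
        while True:
            job_id = job_queue.get()  # each item goes to exactly one consumer
            if job_id is stop:
                break
            with processed_lock:
                processed.append((name, job_id))

    threads = [threading.Thread(target=producer)]
    threads += [
        threading.Thread(target=consumer, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return processed
```

Which worker ends up with which job varies from run to run, but every job id is handed to exactly one worker.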

Bill the Lizard 2010-03-13 03:22:16

Answer 2

+1 A:

If you're willing to consider non-database technologies, the best (though not the only) solution is message queuing (often in conjunction with a database that contains each job's details). Message queues provide a lot of functionality, but the basic workflow is simple:

1) One process puts a 'job message' (perhaps just an id) on a queue.

2) Another process keeps an eye on the queue. It polls the queue for work, and pulls jobs it finds off the queue, one at a time, in the order they were received. Items you've pulled off the queue are effectively marked as 'in progress' - they are no longer available to other processes.

3) For critical workflows, you can perform a transactional read - in the event of a system failure, the transaction rolls back and the message is still on the queue. If there's some other kind of exception (like a timeout during a database read), you might just forward the message to a special error queue.
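The failure handling in step 3 can be imitated in a few lines. The sketch below uses Python's in-memory `queue.Queue`, not a real message queue API, and `SystemFailure` is a hypothetical exception standing in for the kind of failure that would roll a transaction back.

```python
import queue


class SystemFailure(Exception):
    """Hypothetical marker for a failure that should roll the read back."""


def process_next(work_queue, error_queue, handler):
    """Handle one message and report what happened to it."""
    try:
        message = work_queue.get_nowait()
    except queue.Empty:
        return "empty"
    try:
        handler(message)
    except SystemFailure:
        # Imitates a rollback: the message becomes available again.
        # (Here it rejoins at the back; a real transactional queue
        # would leave it where it was.)
        work_queue.put(message)
        return "rolled_back"
    except Exception:
        # Any other error: park the message for later inspection.
        error_queue.put(message)
        return "error_queue"
    return "done"
```

With a real transactional queue the rollback is done by the queueing system itself, so the reader never has to put the message back by hand.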

The simplest way to scale this is to have your reader process dispatch multiple threads to handle jobs it pulls off the queue. Alternatively, you can scale out using multiple reader processes, which may be on separate servers.

.NET support includes Microsoft Message Queuing (MSMQ), used through either Windows Communication Foundation or the classes in the System.Messaging namespace. MSMQ requires some setup and configuration (you have to create the queues and configure permissions), but it's worth it.

Jeff Sternal 2010-03-13 01:48:37

+1 for recommending a specific Queue. I didn't because I'm a Java programmer and I didn't think a recommendation of JMS would be much appreciated on a question tagged `c#`. :)

Bill the Lizard 2010-03-13 02:06:53

@Jeff - With respect to #2 though, I think this is the same thing I'm asking in the question. In other words, the logic of polling the queue and picking off an item while supporting concurrency, without lots of contention and/or locking issues, is the problem I'm trying to solve. With regard to your statement, "Alternately, you can scale out using multiple reader processes, which may be on separate servers": I guess it's back to how I'd do that while avoiding those same issues. So it's more the "how" of implementing these options that I'm asking about.

Howiecamp 2010-03-13 02:58:25

Sorry my answer wasn't clear about that! Message queues support concurrency in this sense - they synchronize access to queue contents: only one process can read a given message at a time. It's one of the defining features of message queues and one of the main reasons to use them instead of writing code to pull jobs from a database table.

Jeff Sternal 2010-03-13 03:01:40

@Jeff - No problem and great point. The message queue implementation (by definition) will handle locking and concurrency, so the clients don't have to worry about them.

Howiecamp 2010-03-13 03:04:39

Answer 3

+1 A:

If you're using SQL Server 2005 or later, you may want to investigate Service Broker. It's pretty much designed for this.

Mark Brackett 2010-03-13 02:09:05
