I'm building my first web application after many years of desktop application development (I'm using Django/Python but maybe this is a completely generic question, I'm not sure). So please beware - this may be an ultra-newbie question...

One of my user processes involves heavy processing on the server (i.e. the user inputs something, and the server needs ~10 minutes to process it). In a desktop application, what I would do is throw the user input into a queue protected by a mutex, and have a dedicated low-priority background thread block on that queue.
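For reference, the desktop pattern described above can be sketched in Python roughly as follows. This is only an illustration: `slow_process` is a hypothetical stand-in for the 10-minute job, and thread priority is left out because Python's `threading` module does not expose it. `queue.Queue` does its own locking, so no explicit mutex is needed.

```python
import queue
import threading

def slow_process(item):
    # Hypothetical stand-in for the ~10 minute job.
    return item.upper()

jobs = queue.Queue()   # thread-safe; handles its own locking
results = []

def worker():
    while True:
        item = jobs.get()        # blocks until work arrives
        if item is None:         # sentinel value: shut the worker down
            jobs.task_done()
            break
        results.append(slow_process(item))
        jobs.task_done()

thread = threading.Thread(target=worker, daemon=True)
thread.start()

jobs.put("user input")   # what the UI would do when the user submits
jobs.put(None)           # ask the worker to stop
thread.join()
```

The question is essentially how to get the same shape when the producer is an HTTP request handler rather than a UI event.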

However in the web application everything seems to be oriented towards synchronization with the HTTP requests.

Assuming I will use the database as my queue, what is the best-practice architecture for running a background process?

+1  A: 

Speaking generally, I'd look at running background processes on a different server, especially if your web server has any kind of load.

Willie Wheeler
+3  A: 

There are two schools of thought on this (at least).

  1. Throw the work on a queue and have something else outside your web-stack handle it.

  2. Throw the work on a queue and have something else in your web-stack handle it.

In either case, you create work units in a queue somewhere (e.g. a database table) and let some process take care of them.

I typically work with number 1, where I have a dedicated Windows service that takes care of these things. You could also do this with SQL jobs or something similar.

The advantage to item 2 is that you can more easily keep all your code in one place--in the web tier. You'd still need something that triggers the execution (e.g. loading the web page that processes work units with a sufficiently high timeout), but that could be easily accomplished with various mechanisms.

Michael Haren
+1  A: 

Since:

1) This is a common problem,

2) You're new to your platform

-- I suggest that you look in the contributed libraries for your platform to find a solution to handle the task. In addition to queuing and processing the jobs, you'll also want to consider:

1) Status communication between the worker and the web-stack. This enables web pages that show the job's percentage complete, assure the human that the job is progressing, etc.

2) How to ensure that the worker process does not die.

3) If a job has an error, will the worker process automatically retry it periodically? Will you or an operations person be notified if a job fails?

4) As the number of jobs increases, can additional workers be added to gain parallelism? Or, even better, can workers be added on other servers?
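To make points 1 and 3 concrete, here is a rough in-memory sketch of a job record that carries status, progress and a retry count. Everything in it (`Job`, `run_job`, `MAX_ATTEMPTS`, `notify_operator`) is hypothetical and not taken from any particular library. It also retries immediately rather than periodically, and in a real setup the fields would live in the database so the web pages could read them.

```python
from dataclasses import dataclass

MAX_ATTEMPTS = 3  # hypothetical retry limit

@dataclass
class Job:
    payload: str
    status: str = "pending"   # pending -> running -> done / failed
    percent: int = 0
    attempts: int = 0
    error: str = ""

def notify_operator(job):
    # Hypothetical hook; in practice this might send an email.
    print(f"job failed after {job.attempts} attempts: {job.error}")

def run_job(job, steps):
    """Run each step in order, updating progress; start over on error."""
    while job.attempts < MAX_ATTEMPTS:
        job.attempts += 1
        job.status = "running"
        try:
            for i, step in enumerate(steps, start=1):
                step(job.payload)
                job.percent = 100 * i // len(steps)
            job.status = "done"
            return job
        except Exception as exc:
            job.error = str(exc)   # remember the last error
            job.percent = 0        # progress resets before the next attempt
    job.status = "failed"
    notify_operator(job)
    return job

# Usage: one step that fails once and then succeeds, and one that always fails.
calls = {"n": 0}

def flaky(payload):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient error")

ok = run_job(Job("data"), [flaky, lambda p: None])
bad = run_job(Job("data"), [lambda p: 1 / 0])
```

A status page would then just display `status` and `percent` for the job in question.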

If you can't find a good solution in Django/Python, you can also consider porting a solution from another platform to yours. I use delayed_job for Ruby on Rails. The worker process is managed by runit.

Regards,

Larry

Larry K
A: 

http://iraniweb.com/blog/?p=56

anon
You should have added a little note about the link to the post.
Mohamed