We have an ASP.NET application that lets administrators work with and perform operations on large sets of records. For example, we have a "Polish Data" task that an administrator can run to clean up the data for a record (e.g. reformat phone numbers, social security numbers, etc.). When run on a small number of records, the task completes relatively quickly. However, when a user runs it on a larger set of records, it may take several minutes or longer to complete. So we want to implement these kinds of tasks using some kind of asynchronous pattern. For example, we want to be able to launch the task and then use AJAX polling to provide a progress bar and status information.

I have been looking into using the BackgroundWorker class, but I have read some things online that make me pause. I would love to get some additional advice on this.

For example, I understand that the BackgroundWorker will actually use the thread pool from the current application. In my case, the application is an ASP.NET web site. I have read that this can be a problem because when the application recycles, the background workers will be terminated. Some of the jobs I mentioned above may take 3 minutes, but others may take a few hours.

Also, we may have several hundred administrators all performing similar operations during the day. Will the ASP.NET application thread pool be able to handle all of these background jobs efficiently while still performing its normal request processing?

So, I am trying to determine if using the BackgroundWorker class and approach is right for our needs. Should I be looking at an alternative approach?

Thanks and sorry for such a long post!

Kevin

A: 

You might consider a slightly different approach.

For example, have a command-and-control table into which you insert commands like "REFORMAT PHONE NUMBERS" or whatever.

Then have a Windows service monitoring that table. Whenever a record shows up, it runs the command.

This eliminates any sort of worry about a background thread. Further, you have a bit more flexibility with regard to what's in the queue and the order of operations, including priority. Finally, you would have a definitive list of what is running or needs to run.

As an option, instead of a Windows service you might just use a SQL job that runs every so often to watch your control table and perform the requested action.
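
To make the idea concrete, here is a minimal sketch of such a control table and one polling pass over it. It is written in Python with SQLite purely for illustration; the table name `commands`, its columns, and the functions `enqueue`, `claim_next`, and `run_pending` are all assumptions of mine, not anything from your application. In your setup the web app would do the insert and the Windows service (or SQL job) would do the polling.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical control table: one row per requested command.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS commands (
            id       INTEGER PRIMARY KEY AUTOINCREMENT,
            name     TEXT NOT NULL,
            priority INTEGER NOT NULL DEFAULT 0,
            status   TEXT NOT NULL DEFAULT 'pending'
        )
        """
    )


def enqueue(conn, name, priority=0):
    """Web application side: record a request and return its row id."""
    cur = conn.execute(
        "INSERT INTO commands (name, priority) VALUES (?, ?)", (name, priority)
    )
    conn.commit()
    return cur.lastrowid


def claim_next(conn):
    """Service side: take the highest-priority pending command, oldest first."""
    row = conn.execute(
        "SELECT id, name FROM commands WHERE status = 'pending' "
        "ORDER BY priority DESC, id ASC LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE commands SET status = 'running' WHERE id = ?", (row[0],))
    conn.commit()
    return row


def run_pending(conn, handlers):
    """One polling pass: run every pending command and record the outcome."""
    executed = []
    while (row := claim_next(conn)) is not None:
        command_id, name = row
        try:
            handlers[name]()  # an unknown command name is recorded as failed
            status = "done"
        except Exception:
            status = "failed"
        conn.execute(
            "UPDATE commands SET status = ? WHERE id = ?", (status, command_id)
        )
        conn.commit()
        executed.append((name, status))
    return executed
```

This sketch assumes a single worker. If several workers poll the same table, the select-then-update in `claim_next` would need to be made atomic so that two workers cannot claim the same row.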

Chris Lively
A: 

Based on what you are saying I think that BackgroundWorker is not a good choice.

Furthermore, keeping this functionality as part of your main app can be problematic, specifically because you do not want the submitted processing to be interrupted if the main app recycles. You can play with asynchronous processing, but it will still be part of the main app's AppDomain, and all of it will die if the app recycles.

I would suggest building a separate app that implements this functionality. In a similar situation, I moved the background processing into a Windows service and hosted a web service in it as a means of communication.

mfeingold
A: 

In your case, it actually sounds like the solution you are looking for is multifaceted (and not a simple one-and-done project).

Since you said that some processes can last for hours, that is absolutely not something for ASP.NET to own. This should be run inside a Windows service and managed with native Windows threading.

You will need to implement some type of work queue in your service and a way to communicate with the queue. One way is to expose a WCF service for all actions your service will govern. Another would be to have the service poll a database table and pick up work from the table.

To be able to report the status of a process, the ASP.NET application needs some reference to it, such as a process ID; for example, the WCF service could return a GUID identifier when the job is submitted. You then expose a method that takes the process ID and returns the status of that process. You can poll that service call using AJAX and display any type of modal you wish.
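
As a rough illustration of the "submit, get an ID back, ask for status later" contract, here is a small in-memory sketch in Python. The class `JobTracker` and its methods are hypothetical names of mine, not WCF or ASP.NET APIs; in the real service, `start` and `status` would be the two operations exposed to the web app, and the state would live somewhere that survives a restart.

```python
import threading
import uuid


class JobTracker:
    """In-memory registry mapping a job id to its current status."""

    def __init__(self):
        self._lock = threading.Lock()
        self._jobs = {}
        self._threads = {}

    def start(self, work):
        """Run `work` on a background thread and return a GUID-style id at once."""
        job_id = str(uuid.uuid4())
        thread = threading.Thread(target=self._run, args=(job_id, work))
        with self._lock:
            self._jobs[job_id] = {"state": "running", "error": None}
            self._threads[job_id] = thread
        thread.start()
        return job_id

    def _run(self, job_id, work):
        try:
            work()
            outcome = {"state": "done", "error": None}
        except Exception as exc:
            outcome = {"state": "failed", "error": str(exc)}
        with self._lock:
            self._jobs[job_id] = outcome

    def status(self, job_id):
        """What the AJAX poll would call: look up a job by the id it was given."""
        with self._lock:
            job = self._jobs.get(job_id)
            return dict(job) if job else {"state": "unknown", "error": None}

    def wait(self, job_id, timeout=None):
        """Block until the job's thread exits; useful in tests, not in the web tier."""
        self._threads[job_id].join(timeout)
```

The web page only ever holds the ID string, so it does not matter to the browser which process or machine is actually doing the work.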

Another thing to remember is that you need to design your processes to know where they are and where they will be when finished, so they can track the state they are in. For example, BatchJobA is run and has 1000 records to process. The service needs to know which record it is on, or what the current % of completion is, to be able to return information to the UI. For SQL queries that take a very long time to execute, this can be very hard to gauge accurately unless you do a lot of pre- and post-processing with temp tables, so that mid-run you can read the status of the temp tables to understand where the query is.
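
One simple way to get that knowledge, sketched below in Python, is to process the records in fixed-size batches and report a count after each batch. The function names, the `report` callback, and the batch size are all assumptions for illustration; in practice `report` would write the counts wherever your status method reads them from.

```python
def process_in_batches(records, handle, report, batch_size=100):
    """Apply `handle` to each record, calling `report(done, total)` after each batch."""
    total = len(records)
    done = 0
    report(done, total)  # publish the starting state so the UI has something to show
    for start in range(0, total, batch_size):
        for record in records[start:start + batch_size]:
            handle(record)
        done = min(start + batch_size, total)
        report(done, total)
    return done


def percent_complete(done, total):
    """Whole-number percentage; an empty job counts as complete."""
    return 100 if total == 0 else (done * 100) // total
```

A smaller batch size gives a smoother progress bar at the cost of more status writes, so it is worth tuning against how often the page actually polls.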

Chris Marisic