Hello,

I'm looking for a good strategy to truly decouple, for parallel processing, my web application's (ASP.NET MVC/C#) non-immediate processes. I define non-immediate as anything that doesn't need to happen right away to render a page or update information.

Those processes include sending email, updating internal statistics based on database information, periodically fetching outside information from web services, and so forth.

Some communication needs to exist between the main ASP.NET MVC application and those background tasks though; e.g. the MVC application needs to inform the emailing process to send something out.

What is the best strategy for this? MSMQ? Turning all those non-immediate processes into Windows services? I'm imagining a truly decoupled setup, but I don't want a trade-off that makes troubleshooting/unit testing much harder or introduces vast amounts of code.

Thank you!

+2  A: 

I can't speak for ASP.NET, as I work primarily in Python, but luckily I can answer this one: it's really a language-agnostic question.

I've typically done this with a queue-based backend daemon which runs independently. When you need to add something to the queue, you can IPC with a method of your choice (I'm partial to HTTP) and deliver a job. The daemon just knocks through the jobs one by one, possibly delegating them to worker threads itself. You can bust out of the RESTful side of your application and fire off jobs to the backend, e.g.:

# In frontend (sorry for Python, should be clear)
...
backend_do_request("http://loadbalancer:7124/ipc", my_job)
...

# In backend (pseudo-Python)
while True:
   job = wait_for_request()   # block until the frontend delivers a job
   myqueue.append(job)
...
def worker_thread():
   while True:
       job = myqueue.pop(0)   # FIFO: take the oldest job first
       do_job(job)

If you later need to check in with the background daemon and ask "is job 2025 done?" you can account for that in your design.

If you want to do that with a Windows Service I would imagine you can. All it needs to do is listen on a port of your choice for whatever IPC you want to do -- I'd stick with network transports, as local IPC will assume same-machine and limit your scalability. Your unit testing shouldn't be that much harder; you can just account for the frontend and the backend as two different projects.
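
To translate that into the asker's stack, here is a minimal C# sketch of the backend side, assuming an HttpListener endpoint and a hypothetical ProcessJob method (names and the port are illustrative, echoing the example above; a Windows Service's OnStart could spin this up on its own thread):

using System.Collections.Concurrent;
using System.IO;
using System.Net;
using System.Threading;

class JobDaemon
{
    static readonly BlockingCollection<string> Jobs = new BlockingCollection<string>();

    static void Main()
    {
        // worker that knocks through the jobs one by one
        new Thread(Worker) { IsBackground = true }.Start();

        var listener = new HttpListener();
        listener.Prefixes.Add("http://localhost:7124/ipc/"); // port borrowed from the Python example
        listener.Start();
        while (true)
        {
            HttpListenerContext ctx = listener.GetContext(); // block until the frontend posts a job
            using (var reader = new StreamReader(ctx.Request.InputStream))
                Jobs.Add(reader.ReadToEnd());                // enqueue the job body
            ctx.Response.StatusCode = 202;                   // accepted; not necessarily done yet
            ctx.Response.Close();
        }
    }

    static void Worker()
    {
        foreach (string job in Jobs.GetConsumingEnumerable())
            ProcessJob(job); // hypothetical: send email, update stats, ...
    }

    static void ProcessJob(string job) { }
}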

Jed Smith
+1 nice explanation.
Alex
A: 

If you can target the .NET 4 Framework, then you can decouple by using F# or the Parallel Computing features (http://msdn.microsoft.com/en-us/library/dd460693(VS.100).aspx).

F# is designed to support parallel computing so it may be a better choice than moving code into services.

Though, if you wanted, you could just use WCF and off-load everything to web services, but that may not really solve your problem; it just moves the issues elsewhere.

EDIT: Moving the non-essential work to web services may make the most sense, then. This is a standard practice where the webserver sits outside the firewall and is therefore vulnerable: all the real work is done by other servers, and the webserver is just responsible for static pages and rendering.

You can use Spring.NET for this, if you don't want to add webservices, but either way you are largely just calling a remote process to do the work.

This is scalable, as you can spread the business logic across several different servers, and since the webserver is largely just the view part of MVC, it can handle more requests than if all the MVC work were done on the webserver.

Because it is designed for this, Spring.NET should be easier to test. Web services can also be tested, as you should test each part separately and then do functional tests, but Spring.NET makes it easier to mock out layers.

James Black
I suspect the OP is talking about executing things not relevant to his current request in the background, not about taking advantage of multicore. I asked for clarification though; maybe I'm wrong.
Jed Smith
Correct, this is not about multicore but process separation by concern and immediacy.
Alex
A: 

We've done this with the workflow API, or, if it's not imperative that it execute, you could use a simple delegate.BeginInvoke to run it on a background thread.
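
For the delegate route, a minimal sketch (the email job and its argument are illustrative; note this applies to the .NET Framework, as async delegates are not supported on .NET Core):

Action<string> sendEmail = address =>
{
    // compose and send the message here
};

// Fire-and-forget on a ThreadPool thread via an async delegate.
sendEmail.BeginInvoke("user@example.com",
    ar => sendEmail.EndInvoke(ar),  // always pair BeginInvoke with EndInvoke
    null);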

csharptest.net
+1  A: 

The simplest way to handle async processing in ASP.NET is to use the ThreadPool to create a worker that you hand your work off to. Be aware that if you have lots of small jobs you are trying to hand off quickly, the default ThreadPool has some annoying lock-contention issues. In that scenario, you either need to use .NET 4.0's new work-stealing ThreadPool, or you can use MindTouch's Dream library, which has a work-stealing ThreadPool implementation (along with tons of other async helpers) and works with 3.5.
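
The hand-off itself is a one-liner; a minimal sketch, where UpdateStatistics stands in for whatever non-immediate job you have:

using System.Threading;

// Queue the work; the ThreadPool runs it on one of its worker threads
// and the request thread returns immediately.
ThreadPool.QueueUserWorkItem(state =>
{
    UpdateStatistics(); // hypothetical non-immediate job
});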

Arne Claassen
A: 

This is a pattern that I tend to think of as 'Offline Services', and I've usually implemented it as a Windows service that can run multiple tasks on their own schedules.

Each task implements a business process such as sending pending emails from a message queue or database table, writing queued log messages to an underlying provider, or performing some batch processing that needs to happen at regular intervals, such as archiving old data or importing data objects from incoming feeds.

The advantage of this approach is that you can build full management capabilities into the task-management service, such as tracing, impersonation, remote integration via WCF, and error handling and reporting, all while using your .NET language of choice to implement the tasks themselves.

There are a few scheduling APIs out there, such as Quartz.NET, that can be used as the starting point for this sort of system. In terms of multi-threading, my general approach is to run each task on its own worker thread, but to only allow one instance of a task to be running at a given time. If a task needs parallel execution then that is implemented in the task body as it will be entirely dependent on the work the task needs to do.
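
As a rough sketch of that starting point, written against the modern Quartz.NET fluent API (which is newer than this thread; SendPendingEmailsJob is a hypothetical task):

using System.Threading.Tasks;
using Quartz;
using Quartz.Impl;

// One business process, implemented as a Quartz job. The attribute
// enforces the one-instance-at-a-time policy described above.
[DisallowConcurrentExecution]
public class SendPendingEmailsJob : IJob
{
    public Task Execute(IJobExecutionContext context)
    {
        // read pending messages from the queue table and send them
        return Task.CompletedTask;
    }
}

// At service start-up: run the job every five minutes.
public static async Task StartSchedulerAsync()
{
    IScheduler scheduler = await StdSchedulerFactory.GetDefaultScheduler();
    await scheduler.ScheduleJob(
        JobBuilder.Create<SendPendingEmailsJob>().Build(),
        TriggerBuilder.Create()
            .StartNow()
            .WithSimpleSchedule(s => s.WithIntervalInMinutes(5).RepeatForever())
            .Build());
    await scheduler.Start();
}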

My view is that a web application should not be managing these sorts of tasks at all, as the web application's purpose is to handle requests from your users, not manage intermediate background jobs. It's a lot of work to build a system like this initially, but you'll be able to re-use it on virtually any project.

Sam
A: 

A Windows service managing these tasks, using a ThreadPool, and communicating with it via MSMQ is certainly my preferred approach. It's nicely scalable as well, thanks to MSMQ's public queue abilities.
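
A minimal System.Messaging sketch of that hand-off (the queue path, message format, and ProcessJob are illustrative):

using System.Messaging; // reference System.Messaging.dll
using System.Threading;

static class JobQueueIpc
{
    const string QueuePath = @".\Private$\BackgroundJobs"; // hypothetical private queue

    // Web application side: drop a job description onto the queue and return.
    public static void Enqueue(string jobDescription)
    {
        if (!MessageQueue.Exists(QueuePath))
            MessageQueue.Create(QueuePath);
        using (var queue = new MessageQueue(QueuePath))
            queue.Send(jobDescription, "job");
    }

    // Windows service side: block until a message arrives, hand it to a worker.
    public static void Listen()
    {
        using (var queue = new MessageQueue(QueuePath))
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(string) });
            while (true)
            {
                Message msg = queue.Receive(); // blocks until a message arrives
                string body = (string)msg.Body;
                ThreadPool.QueueUserWorkItem(_ => ProcessJob(body));
            }
        }
    }

    static void ProcessJob(string job) { /* send the email, etc. */ }
}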

Noon Silk
+1  A: 

The ThreadPool in .NET is a queue-based worker pool, but it is used internally by the ASP.NET host process, so if you lean on the ThreadPool too much, you may reduce the performance of the web server.

So you should create your own thread, mark it as a background thread, and let it poll every few seconds for job availability.

The best way to do this is to create a job table in the database, as follows:

Table: JobQueue
JobID     (bigint, autonumber)
JobType   (text: sendemail, calcstats, ...)
JobParams (text)
IsRunning (true/false)
IsOver    (true/false)
LastError (text)

The JobThread class could look like the following:

class JobThread{
    static object sync = new object();
    static Thread bgThread = null;
    static AutoResetEvent arWait = new AutoResetEvent(false);

    public static void ProcessQueue(Job job)
    {
         // insert job in database
         job.InsertInDB();

         // start the worker if it isn't running yet, otherwise wake it up
         lock(sync){
              if(bgThread==null){
                   bgThread = new Thread(new ThreadStart(WorkerProcess));
                   bgThread.IsBackground = true;
                   bgThread.Start();
              }
              else{
                   arWait.Set();
              }
         }
    }

    private static void WorkerProcess(){
         while(true){
              // fetch the next job where IsRunning = false and IsOver = false
              Job job = GetAvailableJob();
              if(job == null){
                   arWait.WaitOne(10*1000);// wait ten seconds.
                                           // to reduce polling load
                                           // increase wait time
                   continue;
              }
              job.IsRunning = true;
              job.UpdateDB();
              try{
                   // depending upon job type, do something...
              }
              catch(Exception ex){
                   job.LastError = ex.ToString(); // important step:
                                                  // this records the error in JobQueue
                                                  // for later investigation
                   job.UpdateDB();
              }
              job.IsRunning = false;
              job.IsOver = true;
              job.UpdateDB();
         }
    }
}

Note: this implementation is not recommended for memory-intensive tasks; ASP.NET will throw lots of memory-unavailability errors for big jobs. For example, we had a lot of image uploads and needed to create thumbnails and process them using Bitmap objects, and ASP.NET just wouldn't let us use that much memory, so we had to move the same logic into a Windows service.

By creating a Windows service you can build the same thread queue and utilize more memory easily, and for communication between ASP.NET and the Windows service you can use WCF or Mutex objects.

MSMQ: MSMQ is also great, but it increases the configuration work, and it can be difficult to trace errors. We avoid MSMQ because we spent a lot of time hunting for the cause of a problem in our code when the real problem was the MSMQ configuration, and the errors often don't say exactly where the problem is. With our custom solution we can build a full debug version with logs to trace errors. That's the biggest advantage of managed programs; in the old Win32 apps, errors were really difficult to trace.

Akash Kava
+1  A: 

NServiceBus sounds like it might be applicable here, though under the covers it'd probably use MSMQ. Essentially it sounds like you're after doing things asynchronously, which .NET has good mechanisms for dealing with.
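
For flavor, a minimal sketch against NServiceBus's classic IBus-era API (the message and handler names are hypothetical, and the API has since changed in newer versions):

using NServiceBus;

// The command the web app sends; transported over MSMQ by default.
public class SendEmailCommand : ICommand
{
    public string To { get; set; }
    public string Subject { get; set; }
}

// Lives in the background endpoint; NServiceBus invokes it per message.
public class SendEmailHandler : IHandleMessages<SendEmailCommand>
{
    public void Handle(SendEmailCommand message)
    {
        // compose and send the email here
    }
}

// In the MVC controller, with an IBus instance injected:
//     bus.Send(new SendEmailCommand { To = "user@example.com", Subject = "Hi" });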

danswain
A: 

MSMQ is an awesome way to do this. A web farm can feed requests into one or more queues. The queues can be serviced by one or more processes on one or more servers, giving you scale and redundancy. (Run MSMQ on a cluster if you want to remove the single point of failure.) We did this about 8-9 years back and it was awesome watching it all run :) And even back then MSMQ was dead simple to use (from COM); I have to imagine things have only gotten better with .NET.

DougN
A: 

Following sound software-engineering principles will keep your unit-testing complexity to a minimum. Follow the SRP (Single Responsibility Principle); this is especially the case for multi-threaded code, which sounds like where you're headed. Robert Martin addresses this in his book "Clean Code".

To answer your question: as you've seen from the array of posts, there are many ways to handle background processing. MSMQ is a great way to communicate with background processes and is also a great mechanism for addressing reliability (e.g., request 5 emails sent, expect 5 emails sent).

A really simple and effective way to run a background process in ASP.NET is a background worker. You need to understand whether the background worker (a thread) runs in the application's domain or in inetinfo. If it's in the app domain, the trade-off is that you'll lose the thread when the app pool recycles. If you need it durable, it should be carved out into its own process (e.g., a Windows service). If you look into WCF, Microsoft addresses WS-Reliability using MSMQ. Better news: you can host WCF services in a Windows service, and one-way calls to the service suffice to eliminate blocking on the web server, which effectively gives you a background process.
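
A sketch of the one-way WCF shape that gives you that fire-and-forget behavior (the contract, operation, and endpoint-configuration names are hypothetical):

using System.ServiceModel;

// Hosted in the Windows service. IsOneWay = true means the web server
// does not block waiting for the work to finish.
[ServiceContract]
public interface IBackgroundJobService
{
    [OperationContract(IsOneWay = true)]
    void SendEmail(string to, string subject, string body);
}

// Web application side, with a binding configured elsewhere
// (e.g., netMsmqBinding for durability or netTcpBinding):
//     var factory = new ChannelFactory<IBackgroundJobService>("backgroundJobs");
//     IBackgroundJobService proxy = factory.CreateChannel();
//     proxy.SendEmail("user@example.com", "Welcome", "...");  // returns immediately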

James Black mentions using Spring.NET. I agree with his recommendation for two reasons: 1) Spring.NET's support for services and the web is superior to other frameworks', and 2) Spring.NET forces you to decouple, which also simplifies testing.

Back on track:

1: Background worker - the trade-off is that it's closely tied to the app pool/app domain, so you're not separating effectively. Good for simple one-off type jobs (image resizing, etc.). In-memory queues are volatile, which can mean loss of data.

2: Windows Service - the trade-off is deployment complexity (although I'll argue this is minimal). If you will have families of low-resource-utilization background processes, opt for pluggability and host them all in one Windows Service. Use durable storage (MSMQ, DB, file) for job requests and plan for recovery in your design: if you have 100 requests in the queue and the Windows service restarts, it should be written so it immediately checks the queue for work.

3: WCF hosted in IIS - about the same complexity as (2), as I would expect the Windows Service to host WCF and for that to be the communication mechanism between ASP.NET and the service. I don't personally like the "dump and run" design (where ASP.NET writes to a queue) because it reduces clarity and you're ultimately tightly coupling to MSMQ.