I am working on a project with peak performance requirements, so we need to batch several operations (for example, persisting data to a database) for efficiency.

However, I want our code to maintain an easy-to-understand flow, like:

input = Read();
parsed = Parse(input);
if (parsed.Count > 10)
{
   status = Persist(parsed);
   ReportSuccess(status);
   return;
}
ReportFailure();

The feature I'm looking for is to have Persist() automatically happen in batches (and therefore asynchronously), while behaving to its caller as if it were synchronous: the caller should block until the batch action completes. I want the implementor to be able to implement Persist(ICollection).

I looked into flow-based programming (FBP), with which I am not very familiar. I saw one FBP library for C# here, and played a bit with Microsoft's Workflow Foundation, but my impression is that both are overkill for what I need. What would you use to implement batched flow behavior?

Note that I would like to get code that is exactly like what I wrote (simple to understand & debug), so solutions that involve yield or configuration in order to connect flows to one another are inadequate for my purpose. Also, chaining is not what I'm looking for - I don't want to first build a chain and then run it, I want code that looks as if it is a simple flow ("Do A, Do B, if C then do D").

A: 

I don't know if this is what you need, because it's SQL Server based, but have you tried taking a look at SSIS and/or DTS?

Sklivvz
Not really. Assume I have some basic PersistBulk() functionality that accepts a batch of items and persists them (regardless of the low-level implementation). I want to expose a synchronous PersistSingleItem() method. Your answer is not really relevant, thanks.
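To be concrete about the shape I mean, here is a rough sketch, not a finished design. Only PersistBulk and PersistSingleItem come from the discussion above; the class name BlockingBatcher, the maxBatch and maxWait parameters, and everything else are hypothetical. Each caller adds its item and then blocks until the batch containing that item has been handed to the bulk action.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical sketch: a synchronous-looking PersistSingleItem on top of a bulk action.
class BlockingBatcher<T> : IDisposable
{
    private readonly object gate = new object();
    private List<T> items = new List<T>();
    private TaskCompletionSource<bool> done = new TaskCompletionSource<bool>();
    private readonly Action<IReadOnlyList<T>> persistBulk;
    private readonly int maxBatch;
    private readonly System.Threading.Timer timer;

    public BlockingBatcher(Action<IReadOnlyList<T>> persistBulk, int maxBatch, TimeSpan maxWait)
    {
        this.persistBulk = persistBulk;
        this.maxBatch = maxBatch;
        // Flush periodically so a partially filled batch never waits forever.
        timer = new System.Threading.Timer(_ => Flush(), null, maxWait, maxWait);
    }

    // Blocks the caller until the batch containing 'item' has been persisted.
    // If the bulk action threw, the same exception is rethrown here.
    public void PersistSingleItem(T item)
    {
        Task wait;
        bool full;
        lock (gate)
        {
            items.Add(item);
            wait = done.Task;
            full = items.Count >= maxBatch;
        }
        if (full) Flush();
        wait.GetAwaiter().GetResult();
    }

    // Swaps out the current batch under the lock, then persists it outside the lock.
    // Note: a slow bulk action may overlap with the next flush; this sketch does not serialize them.
    private void Flush()
    {
        List<T> batch;
        TaskCompletionSource<bool> signal;
        lock (gate)
        {
            if (items.Count == 0) return;
            batch = items;
            signal = done;
            items = new List<T>();
            done = new TaskCompletionSource<bool>();
        }
        try
        {
            persistBulk(batch);
        }
        catch (Exception ex)
        {
            signal.SetException(ex); // wake the waiting callers with the failure
            return;
        }
        signal.SetResult(true); // wake the waiting callers
    }

    public void Dispose()
    {
        timer.Dispose();
        Flush(); // release anyone still waiting on a partial batch
    }
}
```

With something like this, the calling code can keep the linear "Read, Parse, Persist, Report" shape, and the batching stays hidden behind PersistSingleItem. The price is that each caller's latency is bounded by maxWait rather than by a single write.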
ripper234
+1  A: 

Common problem - instead of calling Persist directly, I usually load up commands (or something along those lines) into a Persistor class, then after the loop is finished I call Persistor.Persist to persist the batch.

Just a few pointers - if you're generating SQL, the commands you add to the persistor can represent your queries somehow (with built-in objects, custom objects or just query strings). If you're calling stored procedures, you can use the commands to append data to a piece of XML that will be passed down to the SP when you call the Persist method.

Hope it helps - I'm pretty sure there's a pattern for this, but I don't know the name :)
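A minimal sketch of what I mean, assuming the batch is handed to some bulk action you supply. The generic Persistor<T> shape, the Add method and the persistBulk delegate are made up for illustration; the real thing would hold whatever command objects fit your data layer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Persistor: collects items and hands them to a bulk action in one call.
class Persistor<T>
{
    private readonly List<T> pending = new List<T>();
    private readonly Action<IReadOnlyList<T>> persistBulk;

    public Persistor(Action<IReadOnlyList<T>> persistBulk)
    {
        this.persistBulk = persistBulk;
    }

    // Queue an item; nothing touches the database yet.
    public void Add(T item)
    {
        pending.Add(item);
    }

    // Send everything queued so far as one batch and return how many items were sent.
    // The queue is cleared only after the bulk action succeeds, so a failed
    // batch stays queued and can be retried.
    public int Persist()
    {
        if (pending.Count == 0) return 0;
        T[] batch = pending.ToArray();
        persistBulk(batch);
        pending.Clear();
        return batch.Length;
    }
}
```

In the loop you would call Add for each parsed item and then call Persist once at the end. This sketch is not thread-safe; it assumes a single thread fills and flushes the persistor.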

JohnIdol
A: 

One simple thing you can do is create a MemoryBuffer to which you push the messages; the push simply adds them to a list and returns. This MemoryBuffer has a System.Timers.Timer that fires periodically and does the "actual" updates.

One such implementation can be found in a Syslog Server (C#) at http://www.fantail.net.nz/wordpress/?p=5, in which the syslog messages get logged to SQL Server periodically in a batch.

This approach might not be good if the information being pushed to the database is important: if something goes wrong, you will lose the messages held in the MemoryBuffer.
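A rough sketch of such a buffer, assuming you pass in the bulk action and the flush interval. The member names (Push, Flush, persistBulk, intervalMs) are only illustrative and are not taken from the Syslog Server code linked above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical MemoryBuffer: Push returns immediately, a timer flushes in batches.
class MemoryBuffer<T> : IDisposable
{
    private readonly object gate = new object();       // protects 'pending'
    private readonly object flushGate = new object();  // allows one flush at a time
    private List<T> pending = new List<T>();
    private readonly Action<IReadOnlyList<T>> persistBulk;
    private readonly System.Timers.Timer timer;

    public MemoryBuffer(Action<IReadOnlyList<T>> persistBulk, double intervalMs)
    {
        this.persistBulk = persistBulk;
        timer = new System.Timers.Timer(intervalMs);
        timer.Elapsed += (sender, e) => Flush();
        timer.Start();
    }

    // Adds the item to the in-memory list and returns right away.
    public void Push(T item)
    {
        lock (gate) { pending.Add(item); }
    }

    // Takes everything pushed so far and persists it as one batch.
    // If the bulk action throws, that batch is gone: this is the data-loss
    // risk mentioned above.
    public void Flush()
    {
        lock (flushGate)
        {
            List<T> batch;
            lock (gate)
            {
                if (pending.Count == 0) return;
                batch = pending;
                pending = new List<T>();
            }
            persistBulk(batch);
        }
    }

    public void Dispose()
    {
        timer.Dispose();
        Flush(); // write out whatever is left
    }
}
```

Note that Push never blocks here, so the caller does not learn whether its item was actually written. That is the trade-off of this approach compared with a blocking Persist.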

Khurram Aziz
A: 

How about using the BackgroundWorker class to persist each item asynchronously on a separate thread? For example:

using System;
using System.Collections;
using System.Collections.Generic;
using System.ComponentModel;
using System.Threading;

class PersistenceManager
{
   public void Persist(ICollection persistable)
   {
      // initialize a list of background workers
      var backgroundWorkers = new List<BackgroundWorker>();

      // launch each persistable item in a background worker on a separate thread
      foreach (var persistableItem in persistable)
      {
         var worker = new BackgroundWorker();
         worker.DoWork += new DoWorkEventHandler(worker_DoWork);
         backgroundWorkers.Add(worker);
         worker.RunWorkerAsync(persistableItem);
      }

      // wait for all the workers to finish, polling every 100 ms
      while (backgroundWorkers.Exists(w => w.IsBusy))
      {
         // sleep a little bit to give the workers a chance to finish
         Thread.Sleep(100);
      }

      // dispose all the workers
      foreach (var w in backgroundWorkers) w.Dispose();
   }

   void worker_DoWork(object sender, DoWorkEventArgs e)
   {
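      var persistableItem = e.Argument;
      // TODO: add logic here to save the persistableItem to the database
      // Note: an exception thrown here is not rethrown to Persist(); it is only
      // reported through RunWorkerCompletedEventArgs.Error if a
      // RunWorkerCompleted handler is attached.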
   }
}
Robin
Don't you need to save each "worker" in backgroundWorkers?
beach
Oops! Yes, of course. Sorry I missed that. That line of code was in my head, but never made it into the post. I'll fix it now. Good catch, thanks.
Robin