views:

136

answers:

3

I have a .NET Windows service that works its way through a queue of items and does some research on each one. I'm trying to change it to a threaded model so that batches of items can be researched concurrently rather than researching each item sequentially.

I know this isn't the perfect solution (it currently has to wait for the slowest item to finish being researched before moving on to the next batch).

  1. With the example below have I got the right idea with this when it comes to the threading?
  2. Admittedly I'm running this on a VM, which may be hampering performance, but I was expecting a bigger speed improvement. Currently I'm seeing about a 10% improvement, and I was hoping that by researching 5 or more items side by side it would be a lot faster than that (preferably 1/5 of the time, but surely I should expect at least 50%?). Besides the limitation of waiting on the slowest item to finish researching, have I limited this by the way I've done the locking, or something else?

    static object locker = new Object();
    private List<string> currentItems = new List<string>();

    private void researcherTimer_Elapsed(object sender, System.Timers.ElapsedEventArgs e)
    {
        // TryEnter skips this tick if a previous tick still holds the lock
        if (Monitor.TryEnter(locker))
        {
            try
            {
                if (currentItems.Count == 0)
                {
                    // get the next x items from the db and add them to the currentItems list
                    SetNextItems();
                    if (currentItems.Count > 0)
                    {
                        foreach (string item in currentItems)
                        {
                            ThreadPool.QueueUserWorkItem(ResearchInThread, item);
                        }
                    }
                }
            }
            finally
            {
                Monitor.Exit(locker);
            }
        }
    }

    void ResearchInThread(object item)
    {
        string currentItem = (string)item;
        try
        {
            // Research Process Here
        }
        finally
        {
            // Remove this item from the current list
            lock (locker)
            {
                currentItems.Remove(currentItem);
            }
        }
    }
    
A: 

A possible performance enhancement would be to use System.Threading.Timer with a TimerCallback delegate instead, as System.Timers.Timer can be affected by system load, i.e. if the system is busy the timer may not always fire on time.
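A minimal sketch of that change, assuming the one-second polling interval mentioned elsewhere in this question (the `ResearchTick` method is a hypothetical stand-in for the existing elapsed handler):

    using System;
    using System.Threading;

    class ResearchService
    {
        private Timer _timer;  // keep a field reference so the timer isn't garbage collected

        public void Start()
        {
            // The TimerCallback fires on a ThreadPool thread; due time 0 means
            // it runs immediately, then every second thereafter.
            _timer = new Timer(ResearchTick, null, TimeSpan.Zero, TimeSpan.FromSeconds(1));
        }

        private void ResearchTick(object state)
        {
            // same body as researcherTimer_Elapsed would go here
        }

        public void Stop()
        {
            _timer.Dispose();
        }
    }

Note that, unlike System.Timers.Timer, callbacks can overlap if a tick runs longer than the interval, so the existing Monitor.TryEnter guard is still needed.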

James
I'll take a look at that and see if it makes a difference...
Luke
A: 

You have left out the main part of the code: how the timer function researcherTimer_Elapsed gets called.

Under which conditions does the timer elapse?

Can you please add some more code to the example to show the conditions under which the function could be called?

scptre
I would imagine, since the method is "Elapsed", that he is using a System.Timers.Timer object, which has an interval.
James
It's just a System.Timers.Timer set to poll once a second.
Luke
+1  A: 

The performance gain depends much on your hardware (do you have multi core?) and on the algorithm that you are executing in parallel.

If the algorithm is close to 100% CPU bound (e.g. only doing raw calculations on data in memory) and you are running on a single core machine then, sadly, you won't see much improvement by running it in multiple threads.

On the other hand, such an algorithm would gain almost linear performance increase if you run on multiple cores.

If your algorithm is both I/O and CPU then you should see a decent performance gain by threading it even on a single core machine. How much gain depends on so many factors that I won't even put a number here :)

Regarding your code, I would recommend you create one big queue of "jobs" (your item objects) and then spawn a fixed number of threads that pull jobs from this queue until it's empty. This way, you won't have to wait for the slowest worker before releasing the next batch, which should help the overall performance gain.

Just make sure the queue is protected by a lock mechanism when dequeueing so that you don't run into race conditions.
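A minimal sketch of that worker/queue pattern (the `Research` method and the worker count are placeholders for the asker's own code, not part of the original):

    using System;
    using System.Collections.Generic;
    using System.Threading;

    class ResearchWorkers
    {
        private readonly Queue<string> _jobs = new Queue<string>();
        private readonly object _sync = new object();

        public void Run(IEnumerable<string> items, int workerCount)
        {
            foreach (string item in items)
                _jobs.Enqueue(item);

            var threads = new List<Thread>();
            for (int i = 0; i < workerCount; i++)
            {
                var t = new Thread(Worker);
                t.Start();
                threads.Add(t);
            }
            foreach (var t in threads)
                t.Join();  // wait until every worker has drained the queue
        }

        private void Worker()
        {
            while (true)
            {
                string item;
                // Dequeue under the lock so two workers never take the same job
                lock (_sync)
                {
                    if (_jobs.Count == 0)
                        return;  // queue drained; this worker exits
                    item = _jobs.Dequeue();
                }
                Research(item);  // hypothetical per-item research call
            }
        }

        private void Research(string item) { /* Research Process Here */ }
    }

Because each worker grabs a new job as soon as it finishes the last one, a slow item only ties up one thread instead of stalling the whole batch.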

Isak Savo
Just to clarify, the sequential process varies between 10-20% CPU; there is a fair bit of db access going on too.
Luke