I'm working on an ASP.NET MVC application that uses the Google Maps Geocoding API. A single batch may contain up to 1000 queries to submit to the Geocoding API, so I'm trying to use a parallel processing approach to improve performance. The method responsible for starting a work item for each core is:

public void GeoCode(Queue<Job> qJobs, bool bolKeepTrying, bool bolSpellCheck, Action<Job, bool, bool> aWorker)
    {
        // Get the number of processors, initialize the number of remaining   
        // threads, and set the starting point for the iteration. 
        int intCoreCount = Environment.ProcessorCount;
        int intRemainingWorkItems = intCoreCount;

        using(ManualResetEvent mreController = new ManualResetEvent(false))
        {
            // Create each of the work items. 
            for(int i = 0; i < intCoreCount; i++)
            {
                ThreadPool.QueueUserWorkItem(delegate
                {
                    Job jCurrent = null;

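                    while(qJobs.Count > 0)
                    {
                        lock(qJobs)
                        {
                            // Re-check under the lock: another thread may have
                            // emptied the queue since the unlocked test above.
                            jCurrent = qJobs.Count > 0 ? qJobs.Dequeue() : null;
                        }

                        // Only call the worker when a job was actually dequeued.
                        if(jCurrent != null)
                        {
                            aWorker(jCurrent, bolKeepTrying, bolSpellCheck);
                        }
                    }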

                    if(Interlocked.Decrement(ref intRemainingWorkItems) == 0)
                    {
                        mreController.Set();
                    }
                });
            }

            // Wait for all threads to complete. 
            mreController.WaitOne();
        }
    }

This is based on a patterns document I found on Microsoft's parallel computing web site. The problem is that the Google API has a limit of 10 QPS (enterprise customer), which I'm hitting, and then I get HTTP 403 errors. Is there a way I can benefit from parallel processing but limit the rate of requests I'm making? I've tried using Thread.Sleep, but it doesn't solve the problem. Any help or suggestions would be very much appreciated.

+1  A: 

It sounds like you're missing some sort of max-in-flight parameter. Rather than just looping while there are jobs in the queue, you need to throttle your submissions based on jobs finishing.

Seems like your algorithm should be something like the following:

Submit N jobs (where N is your max in flight).

Wait for a job to complete and, if the queue is not empty, submit the next job.  
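
A minimal sketch of that max-in-flight idea in C#, assuming .NET 4.5 or later for `Task.Run` and `SemaphoreSlim`. The `ThrottledRunner` class, its `Run` method, and the `maxInFlight` parameter are hypothetical names used for illustration; they are not part of the question's code or of any Google API.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs jobs with a cap on how many execute at once.
public static class ThrottledRunner
{
    public static void Run<T>(IEnumerable<T> jobs, int maxInFlight, Action<T> worker)
    {
        var tasks = new List<Task>();

        // The semaphore starts with maxInFlight free slots.
        using (var slots = new SemaphoreSlim(maxInFlight, maxInFlight))
        {
            foreach (T job in jobs)
            {
                // Blocks until a slot is free, i.e. until one of the
                // in-flight jobs has completed.
                slots.Wait();

                T captured = job;
                tasks.Add(Task.Run(() =>
                {
                    try
                    {
                        worker(captured);
                    }
                    finally
                    {
                        // Free the slot so the next job can be submitted.
                        slots.Release();
                    }
                }));
            }

            // Wait for the last jobs still in flight before disposing.
            Task.WaitAll(tasks.ToArray());
        }
    }
}
```

With the question's method, a call might look like `ThrottledRunner.Run(qJobs, 5, job => aWorker(job, bolKeepTrying, bolSpellCheck));`, where `5` is only an illustrative value. Capping the in-flight count does not by itself guarantee a request rate: with N requests in flight and an average latency of L seconds, throughput is roughly N / L requests per second, so N would need tuning against the 10 QPS limit.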
Brett McCann
thanks for the response - this has got me thinking about the problem differently.
markpirvine