I am working on a multi-threaded application.
This application started out single-threaded and was expanded to multiple threads to gain performance.
I have a main thread that divides the work into smaller chunks and hands them to worker threads for processing. A semaphore controls this stage so that at most X worker threads run at any one time. The worker threads produce chunks of data that are stored in a queue (or ring buffer), which is read by a single saver thread. That thread is responsible for saving the chunks to disk (sometimes across the local network).
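To make the structure concrete, here is a minimal Python sketch of this first approach. It is not my actual code: the names `MAX_WORKERS`, `process`, `run`, and the chunk size are made up for illustration, the "saving" is just an append to a list, and it starts one short-lived thread per chunk where my real worker threads are long-running. CPython threads also share the interpreter lock, so this shows only the shape of the design, not the speedup.

```python
import queue
import threading

MAX_WORKERS = 3   # the "X" above; hypothetical value
SENTINEL = None   # tells the saver thread that no more results are coming


def process(chunk):
    # Placeholder for the real CPU-bound work on one chunk.
    return [x * 2 for x in chunk]


def worker(chunk, slots, results):
    try:
        results.put(process(chunk))
    finally:
        slots.release()  # free a slot so the main thread can start another worker


def saver(results, saved):
    while True:
        item = results.get()
        if item is SENTINEL:
            break
        saved.append(item)  # stands in for writing to disk or the network


def run(data, chunk_size=4):
    slots = threading.Semaphore(MAX_WORKERS)
    results = queue.Queue(maxsize=16)  # bounded, so workers block if the saver falls behind
    saved = []

    saver_thread = threading.Thread(target=saver, args=(results, saved))
    saver_thread.start()

    workers = []
    for start in range(0, len(data), chunk_size):
        slots.acquire()  # blocks while MAX_WORKERS workers are already running
        chunk = data[start:start + chunk_size]
        t = threading.Thread(target=worker, args=(chunk, slots, results))
        t.start()
        workers.append(t)

    for t in workers:
        t.join()
    results.put(SENTINEL)
    saver_thread.join()
    return saved
```

Calling `run(list(range(10)))` returns the processed chunks in whatever order the workers finished, so the saver sees them out of order unless the chunks carry their own index.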
My development machine is a quad-core with 8 GB of RAM. Running the application there with 3 worker threads and 1 saver thread results in a steady flow of data over the network, with processor utilization averaging 75%.
In the second approach, I add another set of threads between the worker threads and the saver thread (i.e., I take one task out of the worker threads and move it to a separate thread, with an additional queue for each of these new threads). With this setup the application does not seem to gain any speed on my machine; there seems to be too much contention for resources, namely RAM bus saturation and processor contention.
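Roughly, the second approach is a pipeline in which each stage has its own input queue. The Python sketch below is illustrative only: `stage`, `run_pipeline`, `compute`, and `extra_task` are hypothetical names, and it uses a single thread per stage for brevity where the real application runs several.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream


def stage(inbox, outbox, fn):
    # Generic pipeline stage: read from its own queue, apply fn, pass the result on.
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            break
        outbox.put(fn(item))


def run_pipeline(chunks, compute, extra_task):
    to_worker = queue.Queue(maxsize=8)
    to_extra = queue.Queue(maxsize=8)  # the additional queue for the new stage
    to_saver = queue.Queue(maxsize=8)
    saved = []

    def saver():
        while True:
            item = to_saver.get()
            if item is SENTINEL:
                break
            saved.append(item)  # stands in for writing to disk or the network

    threads = [
        threading.Thread(target=stage, args=(to_worker, to_extra, compute)),
        threading.Thread(target=stage, args=(to_extra, to_saver, extra_task)),
        threading.Thread(target=saver),
    ]
    for t in threads:
        t.start()

    for chunk in chunks:
        to_worker.put(chunk)  # blocks when the first stage is saturated
    to_worker.put(SENTINEL)

    for t in threads:
        t.join()
    return saved
```

Every item now crosses two queues instead of one, so each extra stage adds lock traffic and memory copies on top of the work it takes off the worker threads.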
Through much experimentation with the number of threads and their priorities, I have found the ideal settings for my machine for both the first and second approaches. The production machine, however, will have 8 cores and 64 GB of RAM. That is a very different environment, and the application will have to be configured for it.
My question is: at what point have you created too many threads? Is it always a matter of experimenting to determine the ideal settings for a given machine? Is there a method of determining or observing whether locking is taking too much away from the application?
(I'm not using a thread pool because it does not fit my needs: my threads are long-running and are managed by semaphores and other locking mechanisms.)