Hi,

I'm trying to implement a multithreaded, recursive file search in Visual C++. The logic is as follows: threads 1 and 2 start at a directory and match the files in it against the search criteria. If they find a child directory, they add it to a work queue. Once a thread finishes with the files in a directory, it grabs another directory path from the work queue. The work queue is an STL stack guarded with critical sections around the push(), pop() and top() calls.

If the stack is empty at any point, the threads wait for a very short time before retrying. When all the threads are in the waiting state, the search is marked as complete.
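The design described above can be sketched roughly as follows. This is only an illustration under assumptions: it uses std::mutex instead of a Windows CRITICAL_SECTION, the names WorkStack, ScanFn and search_worker are hypothetical, and scan stands in for whatever code matches the files in a directory and returns its child directories. Completion is detected with a busy counter kept under the same lock, rather than by counting waiting threads.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <mutex>
#include <stack>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical name: a LIFO work queue of directory paths shared by all threads.
class WorkStack {
public:
    // Queue a directory that still has to be scanned.
    void push(std::string dir) {
        std::lock_guard<std::mutex> lock(mutex_);
        dirs_.push(std::move(dir));
    }

    // Take one directory if any is queued; the caller is then counted as busy.
    bool try_pop(std::string& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (dirs_.empty()) return false;
        out = std::move(dirs_.top());
        dirs_.pop();
        ++busy_;
        return true;
    }

    // Call once the popped directory is fully scanned and its children pushed.
    void done() {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(busy_ > 0);
        --busy_;
    }

    // True when nothing is queued and no busy thread could still add work.
    bool finished() {
        std::lock_guard<std::mutex> lock(mutex_);
        return dirs_.empty() && busy_ == 0;
    }

private:
    std::mutex mutex_;
    std::stack<std::string> dirs_;
    int busy_ = 0;
};

// Hypothetical callback type: scans one directory, returns its child directories.
using ScanFn = std::function<std::vector<std::string>(const std::string&)>;

// Worker loop: pop a directory, scan it, push the children, retry briefly if empty.
void search_worker(WorkStack& work, const ScanFn& scan) {
    std::string dir;
    while (!work.finished()) {
        if (work.try_pop(dir)) {
            for (auto& child : scan(dir)) work.push(std::move(child));
            work.done();
        } else {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }
}
```

Each thread would run search_worker on the same WorkStack after the start directory has been pushed. The lock is held only for the push or pop itself, never while a directory is being read.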

This logic works without any problems, but I feel that I'm not getting the full benefit of threads, because there is no drastic performance gain compared to using a single thread. I suspect the work stack is the bottleneck, but I can't figure out how to do away with the locking. I tried another variation where each thread has its own stack and adds a work item to the global stack only when the local stack grows beyond a fixed number of work items. If the local stack is empty, the thread tries fetching from the global stack. I didn't find a noticeable difference even with this variation. Does anyone have any suggestions for improving the synchronization logic?

Regards,

+2  A: 

I really doubt that your work stack is the bottleneck. A conventional hard disk moves all of its heads together, so it can only read one stream of data at a time. As long as your threads are processing the data as fast as the disk can supply it, there's not much else you can do that's going to have any significant effect on overall speed.

For other types of tasks your queue might become a significant bottleneck, but for this task, I doubt it. Keep in mind the time scales of the operations here. A simple operation that happens inside of a CPU takes considerably less than a nanosecond. A read from main memory takes on the order of tens of nanoseconds. Something like a thread switch or synchronization takes on the order of a couple hundred nanoseconds or so. A single head movement on the disk drive takes on the order of a millisecond or so (1,000,000 nanoseconds).
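To put rough numbers on that comparison, something like the following could be used. It is a sketch, not a rigorous benchmark: the function names are made up for illustration, it measures only an uncontended lock (contention between threads will cost more), and the 1 ms seek time is just the ballpark figure quoted above.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <stack>

// Average cost, in nanoseconds, of one locked push plus one locked pop
// on a std::stack when no other thread competes for the lock.
double locked_push_pop_ns(int iterations) {
    assert(iterations > 0);
    std::mutex m;
    std::stack<int> s;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        {
            std::lock_guard<std::mutex> lock(m);
            s.push(i);
        }
        {
            std::lock_guard<std::mutex> lock(m);
            s.pop();
        }
    }
    auto end = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = end - start;
    return elapsed.count() / iterations;
}

// How many operations of the given cost fit into one disk seek.
// The default of 1,000,000 ns (1 ms) is the rough figure from the text.
double ops_per_seek(double op_ns, double seek_ns = 1e6) {
    return seek_ns / op_ns;
}
```

With the couple-hundred-nanosecond estimate above, ops_per_seek(200.0) gives 5000, so a thread could do thousands of queue operations in the time the disk needs for a single head movement.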

Jerry Coffin
Thanks Jerry. Is there a way to check whether the disk reads are already being used to their maximum on a Windows machine?
ivymike
@ivymike: You can use the performance monitor to see how much disk bandwidth you're using, though that won't tell you the theoretical maximum bandwidth. If memory serves, you can also check the disk queue depth to get an idea of whether there are I/O commands waiting to execute. You don't care a whole lot about the exact queue depth, only that it's (almost) always non-zero (here I'm talking about the OS's queue, not yours).
Jerry Coffin