I am working on a process at the moment that iterates through the files in a folder; it's used as part of a data migration. Each file takes about 8 seconds to process, and I need to run it about 30,000 times, so the 8 seconds is a bit of a problem. The actions inside the function can't be multithreaded, as it pulls apart the document and that needs to be serial.

The single-threaded code does this:

    For Each strFile In System.IO.Directory.GetFiles(txtFolderPathIN.Text)
        CallFunction(strFile)
    Next

What is the best approach to convert this to make it multithreaded? There is no feedback to the user; I just need to start the process and iterate through the files as quickly as possible. What is the easiest way to make this multithreaded?

+3  A: 

I would use a ThreadPool. Something like this:

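    ' SetMaxThreads is a method taking worker and completion-port thread counts.
    ' It returns False and changes nothing if a value is below the processor count.
    ThreadPool.SetMaxThreads(4, 4)

    For Each strFile In System.IO.Directory.GetFiles(txtFolderPathIN.Text)
        ' CallFunction must match WaitCallback: Sub CallFunction(state As Object)
        ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf CallFunction), strFile)
    Next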

You could also use your own thread pooling system by having a list of threads and using a mutex to stop the main thread from exiting before all the child threads have finished running.
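
A rough sketch of that hand-rolled variant, in case it helps. It assumes each file can be processed independently of the others, and it swaps the mutex for `Thread.Join` to keep the main thread alive until the workers finish. `FileWorkers`, `WorkerLoop`, the worker count of 4 and the body of `CallFunction` are placeholders for illustration, not code from the question.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports System.Threading

Module FileWorkers
    ' Shared work queue; every access is guarded by queueLock.
    Private ReadOnly pending As New Queue(Of String)
    Private ReadOnly queueLock As New Object()

    ' Placeholder for the real per-file work from the question.
    Sub CallFunction(ByVal strFile As String)
        Console.WriteLine("Processing " & strFile)
    End Sub

    ' Each worker keeps taking the next file until the queue is empty.
    Private Sub WorkerLoop()
        Do
            Dim nextFile As String = Nothing
            SyncLock queueLock
                If pending.Count = 0 Then Exit Do
                nextFile = pending.Dequeue()
            End SyncLock
            CallFunction(nextFile)
        Loop
    End Sub

    Sub Main(ByVal args As String())
        ' Fill the queue before any worker starts; args(0) is the folder path.
        For Each strFile As String In Directory.GetFiles(args(0))
            pending.Enqueue(strFile)
        Next

        ' Start a fixed number of workers (4 is an arbitrary choice).
        Dim workers As New List(Of Thread)
        For i As Integer = 1 To 4
            Dim t As New Thread(AddressOf WorkerLoop)
            workers.Add(t)
            t.Start()
        Next

        ' Block the main thread until every worker has finished.
        For Each t As Thread In workers
            t.Join()
        Next
    End Sub
End Module
```

Because each path is dequeued under the lock exactly once, no two workers should end up with the same file, so no separate "already processed" list is needed in this sketch.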

Pondidum
So Simple :) Was expecting something so much more complicated. Thanks :)
u07ch
This is quite clever and one I would myself consider, but it introduces the problem of each thread processing the same file multiple times, unless you keep track of the files processed, which means thread-safe lists. Great idea though!
Wez
FYI, I did a test, and it seems that ArrayList.Synchronized(your_list) returns a thread-safe wrapper around the list that should work across threads for keeping track of which files you have already processed.
Wez