Hi,
I've written an application in C# that moves jpgs from one set of directories to another set of directories concurrently (one thread per fixed subdirectory). The code looks something like this:
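using System.IO;

string destination = "";
DirectoryInfo dir = new DirectoryInfo("");
// GetDirectories returns an array, one entry per subdirectory
DirectoryInfo[] subDirs = dir.GetDirectories();
foreach (DirectoryInfo d in subDirs)
{
    // List the files of the current subdirectory
    FileInfo[] files = d.GetFiles();
    foreach (FileInfo f in files)
    {
        // MoveTo expects the full target path, including the file name
        f.MoveTo(Path.Combine(destination, f.Name));
    }
}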
However, the performance of the application is horrendous: tons of page faults/sec. The number of files in each subdirectory can get quite large, so I suspect a big part of the penalty comes from context switching: the process can't keep all the different file arrays in RAM at the same time, so nearly every switch ends up going to disk.
There are two solutions I can think of. The first is rewriting this in C or C++, and the second is to use multiple processes instead of multiple threads.
Edit: The files are named based on a timestamp, and the directory each file is moved to is derived from that name. So the destination directory corresponds to the hour the file was created; 3-27-2009/10, for instance.
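To make the naming scheme concrete, here is a rough sketch of how the destination path could be derived. The exact file-name pattern isn't shown above, so the pattern "M-d-yyyy H-mm-ss" (for example "3-27-2009 10-15-30.jpg") and the helper name HourlyDestination are assumptions for illustration only:

```csharp
using System;
using System.Globalization;
using System.IO;

static class HourlyDestination
{
    // Hypothetical file-name pattern, e.g. "3-27-2009 10-15-30.jpg".
    // Replace with whatever pattern the real files use.
    const string NamePattern = "M-d-yyyy H-mm-ss";

    // Builds "<root>/<M-d-yyyy>/<hour>/<file name>" from a timestamp-named file.
    public static string For(string root, string fileName)
    {
        string stem = Path.GetFileNameWithoutExtension(fileName);
        DateTime stamp = DateTime.ParseExact(stem, NamePattern, CultureInfo.InvariantCulture);

        string day = stamp.ToString("M-d-yyyy", CultureInfo.InvariantCulture);
        // Use the Hour property directly: a lone "H" would be treated as a
        // standard format specifier, not a custom one.
        string hour = stamp.Hour.ToString(CultureInfo.InvariantCulture);

        return Path.Combine(root, day, hour, Path.GetFileName(fileName));
    }
}
```

In the move loop, the target directory would have to exist first, e.g. Directory.CreateDirectory(Path.GetDirectoryName(target)) before f.MoveTo(target).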
We are creating a background worker per directory for threading.
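Roughly, the per-directory setup looks like the sketch below. It assumes the workers are System.ComponentModel.BackgroundWorker instances; the names PerDirectoryWorkers and moveFiles are made up for this post, and moveFiles stands in for the move loop shown earlier:

```csharp
using System;
using System.ComponentModel;
using System.IO;

static class PerDirectoryWorkers
{
    // Starts one BackgroundWorker per subdirectory of sourceRoot.
    // moveFiles is a hypothetical callback that moves one directory's files.
    public static BackgroundWorker[] Start(string sourceRoot, Action<DirectoryInfo> moveFiles)
    {
        DirectoryInfo[] subDirs = new DirectoryInfo(sourceRoot).GetDirectories();
        var workers = new BackgroundWorker[subDirs.Length];

        for (int i = 0; i < subDirs.Length; i++)
        {
            var worker = new BackgroundWorker();
            // DoWork runs on a thread-pool thread; the directory arrives via e.Argument.
            worker.DoWork += (sender, e) => moveFiles((DirectoryInfo)e.Argument);
            worker.RunWorkerAsync(subDirs[i]);
            workers[i] = worker;
        }
        return workers;
    }
}
```

All workers start at once, so every directory's file listing is alive at the same time, which is where I suspect the memory pressure comes from.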
Any suggestions?