I have a set of subfolders, 3 levels deep, with over 20k files in c:\MyData.

There is an almost identical set of subfolders on my E drive at e:\projects\massdata

I want to check C and delete from it anything that already exists on E (same folder name, same file name, same size).

What is my best way of traversing this folder structure?

+2  A: 

Recursively go through all files in each directory.

For each file in E, build a string describing its relative path, file size, etc., and store it in a hash map. Then, while going through C, check whether a file's string exists in the map, and delete the file if so.

The string could, for instance, be [FILENAME]##[FILESIZE]##[LASTEDITED].

Here is one way to search recursively in C#: http://support.microsoft.com/kb/303974
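
The idea above can be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the names `DuplicateCleaner`, `MakeKey` and `FindDuplicates` are made up for this example, the key uses only the relative path and the size (no last-edit time), and lower-casing the path is an assumption that suits case-insensitive Windows file systems.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DuplicateCleaner
{
    // Hypothetical helper: builds a key such as "sub\a.txt##1024"
    // from the path relative to root and the file size in bytes.
    static string MakeKey(string root, string path)
    {
        string relative = path.Substring(root.Length)
            .TrimStart(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
        // Assumption: paths are compared case-insensitively, as on Windows.
        return relative.ToLowerInvariant() + "##" + new FileInfo(path).Length;
    }

    // Returns the files under sourceRoot that also exist under referenceRoot
    // with the same relative path and size; deletes them only if delete is true.
    public static List<string> FindDuplicates(string sourceRoot, string referenceRoot, bool delete)
    {
        // Pass 1: remember every file found under the reference root.
        var seen = new HashSet<string>();
        foreach (string path in Directory.EnumerateFiles(referenceRoot, "*", SearchOption.AllDirectories))
            seen.Add(MakeKey(referenceRoot, path));

        // Pass 2: collect the source files whose key is already known.
        var duplicates = new List<string>();
        foreach (string path in Directory.EnumerateFiles(sourceRoot, "*", SearchOption.AllDirectories))
        {
            if (seen.Contains(MakeKey(sourceRoot, path)))
                duplicates.Add(path);
        }

        // Delete only after the enumeration has finished.
        if (delete)
        {
            foreach (string path in duplicates)
                File.Delete(path);
        }
        return duplicates;
    }
}
```

Calling `DuplicateCleaner.FindDuplicates(@"c:\MyData", @"e:\projects\massdata", false)` first returns the candidates without deleting anything, which is probably worth doing once before passing `true`.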

Marcus Johansson
+3  A: 

How about using the Join operator, joining on the relative file name and size, like this:

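// Requires: using System; using System.IO; using System.Linq;
public void cleanUp()
    {
        var cFiles = Directory.GetFiles(@"c:\MyData","*.*",SearchOption.AllDirectories);
        var eFiles = Directory.GetFiles(@"e:\projects\massdata","*.*",SearchOption.AllDirectories);
        // Key = (path relative to the given root, file size in bytes)
        Func<string, string, Tuple<string, long>> keySelector = (path, root) =>
            new Tuple<string, long>(path.Substring(root.Length), new FileInfo(path).Length);

        // cFiles is the outer sequence, so its key selector must strip the C root;
        // the result selector returns the C file, which is the one to delete.
        foreach (var file in cFiles.Join(eFiles, c => keySelector(c,@"c:\MyData"), e => keySelector(e,@"e:\projects\massdata"), (c, e) => c))
        {
            File.Delete(file);
        }
    }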

Second edit after update: the key selector should now meet your requirements. If I've misunderstood them, it should be rather easy to see what you need to change. If not, drop a comment :)

Rune FS