I have a DB that contains a list of paths to files. I want to build a routine to clean up the folders, removing any file in the directories that has no DB record (for temp ajax file uploads, in cases where the user doesn't complete the form, etc.).

I'm thinking something like this:

var dbFiles = db.allPaths();
var allFiles = Directory.EnumerateFiles(path);

foreach (var f in allFiles) {
  // delete any file on disk that has no matching DB record
  if (!dbFiles.Contains(f)) {
    File.Delete(f);
  }
}

Any "Gotchas" waiting for me? The routine will be set to run once a week at first, more often if temp files become a problem. It will be run during a time when there are nearly no users on, so performance - while important - is not paramount.

Thanks in advance!

UPDATE

Wow, lots of great answers. This bit of code is turning into something "share" worthy. ;D My code above was just a simple, quick placeholder bit... but it's transformed into solid code. Thank you!

+3  A: 

Looks good to me, though I've never deleted files in C#, just VB. You might want to wrap the delete in a try/catch block: if the file can't be deleted (read-only, currently in use, etc.), File.Delete will throw an exception. A file that no longer exists is not an error, though; File.Delete simply does nothing in that case.

EDIT: How are the paths stored? Remember, in C# you need to escape out paths "//" instead of using "\" IIRC.

EDIT 2: Scratch that last edit out lol.

Jeffrey Kern
Escaping is only needed for literals, not when reading strings from somewhere. Note that escaping concerns the C# language, not the internal representation of a string.
chiccodoro
I don't even know what `escape out paths "//"` would even mean.
Gabe
Ah. Whenever I've programmed with C# I've just escaped out the paths. Probably since I've hardcoded the paths - thank you for the clarification.
Jeffrey Kern
Evan Plaice
@Gabe, if you hardcode a path in C#, you would need to use // instead of \ for file paths. E.g., String foo = "c://foo.txt" instead of "c:\foo.txt"
Jeffrey Kern
Jeffrey: It turns out that .NET will accept `//`, but you're supposed to use `@"c:\foo.txt"` or `"c:\\foo.txt"`.
Gabe
@Gabe: I wouldn't say "supposed to" - I regularly use / (not //) in C# because it's more portable; it works on Windows *and* Unix. It also means you don't need to worry about escaping.
Jon Skeet
Jon: When is hard coding paths portable? If you are creating paths in code, you should be using `Path.Combine` or at least using `Path.DirectorySeparatorChar`. There's certainly never any reason to use `//`.
Gabe
@Gabe: Note I was using "/" rather than "//", but if you're looking for files relative to the current directory (or some installation directory) I think it's reasonable to use "/". `Path.Combine` is *generally* preferable, but I can't think of any system on which "a/b.txt" would fail. Can you?
Jon Skeet
@Gabe, it seems I'm confused when it comes to working with file paths. Thank you for the clarification.
Jeffrey Kern
+1 for mentioning exception handling
Marnix van Valen
Jon: I'll grant you that your C# app may not be likely to run on such OSes, but MacOS 9 uses `:` (/ is a perfectly valid filename char), VMS uses `.`, and you can imagine various IBM mainframe-type systems would have their own way of expressing paths.
Gabe
+7  A: 

Looks okay, but you can make it simpler:

foreach (var file in allFiles.Except(dbFiles))
{
    File.Delete(file);
}

You've got to make sure that the paths are in exactly the same format though. If one list has relative files and the other has absolute files, or if one uses "/" and the other uses "\" you'll end up deleting things you don't expect to.

Ideally you'd canonicalise the files explicitly first, but I can't see a nice way of getting a canonical file name in .NET...

EDIT: Note that Path.GetFullPath does not canonicalize. It fixes slashes and makes it absolute, but it doesn't address case: "c:/users" becomes "c:\users", but "c:/Users" becomes "c:\Users".

This could be fixed by using a string comparer in the call to Except:

var dbFiles = db.AllPaths().Select(Path.GetFullPath);
var allFiles = Directory.EnumerateFiles(path).Select(Path.GetFullPath);

foreach (var file in allFiles.Except(dbFiles, StringComparer.OrdinalIgnoreCase))
{
    File.Delete(file);
}

Now that's ignoring case - but in an "ordinal" manner. I don't know what the Windows file system really does in terms of its case sensitivity.
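
As a minimal sketch of what the comparer buys you, assuming in-memory lists stand in for the real database and directory listing (the class OrphanFinder, the method FindOrphans, and the sample paths are all hypothetical, made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrphanFinder
{
    // Returns the paths in onDisk that have no match in inDb,
    // comparing ordinally but ignoring case.
    public static List<string> FindOrphans(IEnumerable<string> onDisk, IEnumerable<string> inDb)
    {
        return onDisk.Except(inDb, StringComparer.OrdinalIgnoreCase).ToList();
    }

    public static void Main()
    {
        // Hypothetical sample data: the same files, recorded with different casing.
        var onDisk = new[] { @"c:\uploads\a.txt", @"c:\uploads\B.txt", @"c:\uploads\tmp.dat" };
        var inDb = new[] { @"C:\Uploads\A.TXT", @"c:\uploads\b.txt" };

        foreach (var orphan in FindOrphans(onDisk, inDb))
        {
            Console.WriteLine(orphan); // prints c:\uploads\tmp.dat
        }
    }
}
```

With the default comparer, all three sample files would be reported as orphans, because none of the DB strings match the on-disk strings character for character.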

Jon Skeet
You would want to use `Path.GetFullPath(p)` to canonicalize the paths. Maybe `allFiles.Select(p => Path.GetFullPath(p)).Except(dbFiles.Select(p => Path.GetFullPath(p)))` would do what you're looking for.
Gabe
@Jon That's nice... love making things simpler.
Chad
@Gabe Man, I totally forgot about LINQ in this scenario. Very nice.
Chad
Chad: I combined Jon's answer, my comment, and Jeffrey's answer all into one in my answer.
Gabe
@Gabe: Path.GetFullPath doesn't really canonicalize. See my edit. Oh, and you can use a method group conversion to make the Select method slightly simpler :)
Jon Skeet
Good point, when I was testing the GetFullPath it turned out I had come up with the right case. Getting the right case is tricky (requires hitting the disk), so using OrdinalIgnoreCase is probably the right thing to do here (or make sure `db.AllPaths` already has the right case).
Gabe
+1  A: 

I think it's alright in spirit, though it would be closer to:

List<string> dbFiles = db.allPaths();
string[] allFiles = Directory.GetFiles(path);

foreach (string f in allFiles)
    if (!dbFiles.Contains(f))
        File.Delete(f);
Reinderien
Directory.EnumerateFiles is fine - it's part of .NET 4. Other than that, as far as I can see you've only replaced "var" with explicit typing, and removed braces.
Jon Skeet
That would explain it...
Reinderien
+1  A: 

To combine all the suggestions into one:

// normalize paths (absolute, consistent separators); case is handled by the comparer below
var dbFiles = db.allPaths().Select(Path.GetFullPath);
var allFiles = Directory.EnumerateFiles(Path.GetFullPath(path));

foreach (var file in allFiles.Except(dbFiles, StringComparer.OrdinalIgnoreCase))
{
    try {
        File.Delete(file);
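    } catch (IOException) {
        // file is in use; handle exception here
    } catch (UnauthorizedAccessException) {
        // file is read-only or access is denied; handle exception here
    }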
}
Gabe
Very nice, thank you Gabe.
Chad
Now with Jon's method group improvements.
Gabe