views: 42
answers: 1

I have a SQL Server table with a list of files (path + filename), and a folder tree with multiple levels of subfolders and files in each of them. I'm looking for a way to reconcile the two without having to process the list twice. Currently, I'm doing this:

For Each f As FileInfo In FileListFromDatabase
    If Not f.Exists Then
        ' mark it as deleted in the database
    End If
Next

For Each f As FileInfo In RecursiveListOfFilesOnDisk
    If Not FileExistsInDatabase(f) Then
        ' add it
    End If
Next

Is there a better way to do this? I'd like to avoid converting the matching files (which will be most of them) to FileInfo objects twice. Since I'm a T-SQL developer first, I'm picturing something like an OUTER JOIN of the two lists that returns only the rows where they don't match. Something LINQ-ish?
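
In my head it would look something like this (just a rough sketch with placeholder names for the DB list and root folder, assuming the DB paths can be pulled into memory as plain strings):

Dim dbPaths As List(Of String)   ' placeholder: paths loaded from the database
Dim diskPaths = IO.Directory.GetFiles(rootFolder, "*", IO.SearchOption.AllDirectories)

' In the database but no longer on disk -> mark as deleted
Dim deleted = dbPaths.Except(diskPaths, StringComparer.OrdinalIgnoreCase).ToList()

' On disk but not yet in the database -> add
Dim added = diskPaths.Except(dbPaths, StringComparer.OrdinalIgnoreCase).ToList()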

+1  A: 

I can't think of a solution quite as neat as you want, but if you're worried about IO performance, then instead of finding each existing file on disk twice, just read in all the file locations once and do the comparison in memory.

Dim dbFiles As List(Of String)   ' assumed to be already loaded from the database
Dim files As New List(Of String)
files.AddRange(IO.Directory.GetFiles("path", "pattern", IO.SearchOption.AllDirectories))

Dim deletedFiles As New List(Of String)
For Each dbFile As String In dbFiles
    If files.Contains(dbFile) Then
        ' present in both the database and the file system - nothing to do
        files.Remove(dbFile)
    Else
        ' in the database but no longer on disk
        deletedFiles.Add(dbFile)
    End If
Next
' now deletedFiles should have all files that have been deleted from the file
' system, and files should have all new files in the file system

If the path names stored in the DB could have different casing than the ones in the file system, you might have to adjust for that, though.
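
For example, one way to handle that (which also speeds up the lookups for big lists) would be a case-insensitive HashSet instead of List.Contains - just a sketch, again assuming dbFiles has been loaded from the database:

Dim diskFiles As New HashSet(Of String)( _
    IO.Directory.GetFiles("path", "pattern", IO.SearchOption.AllDirectories), _
    StringComparer.OrdinalIgnoreCase)

Dim deletedFiles As New List(Of String)
For Each dbFile As String In dbFiles      ' dbFiles assumed loaded from the database
    If diskFiles.Contains(dbFile) Then
        diskFiles.Remove(dbFile)          ' matched - only new files will remain
    Else
        deletedFiles.Add(dbFile)          ' in the DB but not on disk
    End If
Next
' deletedFiles: deleted from the file system; diskFiles: new files not yet in the DB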

ho1
I'll probably go this route - I'm not concerned about loading the file list into memory, just about creating the FileInfo objects twice, but this looks like it might do the trick. Thanks!
rwmnau