First off: I am not necessarily looking for Delphi code, so spit it out any way you want.

I've been searching around (especially here) and found a few questions from people looking for ways to compare two directories (including subdirectories), though they were using byte-by-byte methods. Second off, I am not looking for a diff tool. I am "just" looking for a way to find files which do not match and, just as important, files which are in one directory but not the other, and vice versa.

To be more specific: I have one directory (the backup folder) which I constantly update using FindFirstChangeNotification. The first time, though, I need to copy all files, and I also need to check the backup directory against the original when the application starts (in case something happened while the application wasn't running, or FindFirstChangeNotification didn't catch a file change). To solve this I am thinking of creating a CRC list for the backed-up files, then running through the original directory computing the CRC for every file, and finally comparing the two CRC lists. Then I would somehow look for files which are in one directory and not the other (again, vice versa).

Here's the question: Is this the fastest way? If so, how would one (roughly) get the job done?

+3  A: 

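You don't necessarily need CRCs for each file. For most normal purposes you can just compare the "last modified" date of every file, and it's WAY faster because you never have to read the file contents. If you need additional safety, you can also compare the lengths. You get both of these metrics for free with the find functions (on Win32, FindFirstFile/FindNextFile return the last write time and the size for each entry).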
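Since any language is fine, here is a rough Python sketch of that idea, not a definitive implementation. The names `index_tree` and `compare_trees` are made up for illustration. It assumes the backup copy preserves the last-modified time (for example `shutil.copy2` in Python), and it truncates timestamps to whole seconds, which is an assumption that may still be too strict on filesystems with coarser timestamps.

```python
import os

def index_tree(root):
    """Map each file's path relative to root to (size, mtime in whole seconds)."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            # Assumption: whole-second precision is enough to detect an edit.
            index[os.path.relpath(full, root)] = (st.st_size, int(st.st_mtime))
    return index

def compare_trees(original, backup):
    """Return (only_in_original, only_in_backup, changed) as sorted lists."""
    a = index_tree(original)
    b = index_tree(backup)
    only_in_original = sorted(a.keys() - b.keys())
    only_in_backup = sorted(b.keys() - a.keys())
    # A file counts as changed when its size or timestamp differs.
    changed = sorted(p for p in a.keys() & b.keys() if a[p] != b[p])
    return only_in_original, only_in_backup, changed
```

Everything in `only_in_original` and `changed` would need copying, and everything in `only_in_backup` is a candidate for deletion. No file contents are read, only directory metadata.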

And in your change notification handler, you should probably add the files to a queue and use a timer object to copy the queued files every ~30 seconds or so, so you don't bog down the system with frequent updates/checks.
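A minimal sketch of the queue-plus-timer idea in Python, under the assumption that the notification handler runs on a different thread than the copier. `PendingCopies`, `run_periodically` and `copy_batch` are hypothetical names, and the actual copying is left to the caller.

```python
import threading

class PendingCopies:
    """Collects changed paths; duplicates collapse until the next flush."""

    def __init__(self):
        self._lock = threading.Lock()
        self._paths = set()

    def add(self, path):
        # Called from the change-notification handler.
        with self._lock:
            self._paths.add(path)

    def flush(self):
        # Atomically take everything queued so far and start a fresh set.
        with self._lock:
            batch, self._paths = sorted(self._paths), set()
        return batch

def run_periodically(pending, copy_batch, interval, stop_event):
    """Call copy_batch(batch) every `interval` seconds until stop_event is set."""
    while not stop_event.wait(interval):
        batch = pending.flush()
        if batch:
            copy_batch(batch)
```

Using a set means a file that is saved ten times within one interval is copied once, which is where most of the saving would come from.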

For additional speed, use the Win32 functions wherever possible and avoid any Delphi find/copy/getfileinfo functions. I'm not familiar with the Delphi framework, but the C# equivalents, for example, are WAY WAY WAY slower than the Win32 functions.

Blindy
No need to worry about the last point. Delphi's "framework" is native code that wraps the Windows API, and tends to be much faster than the .NET equivalent.
Mason Wheeler
Interesting, and even more interesting that it never occurred to me. How would comparing the length (file size) add safety? Isn't it already safe enough to assume that any file that has been edited will have its last modified date updated? Your second paragraph worries me a bit, because I was under the impression that notification handles only light up when the actual event has taken place.
Daniel-Dane
The last written date should be enough, yes. The length is there for paranoia (I keep thinking of an archived file being extracted with the old date, but a different size). And my second paragraph is so you don't continuously copy files while your program is running in the background. Presumably you're doing something else in the foreground and you don't want performance to degrade noticeably over extended periods of time. So instead you group the file copying together in short, possibly cached bursts of data and leave the CPU alone for the rest of the time.
Blindy
Uhm, I don't think I am following you. On the first run (when all the files have to be copied), I start a threaded recursive file copy. This shouldn't drain any resources. And when that is done, only updated files are copied. Would this really have such a giant impact?
Daniel-Dane
Only one way to find out: try it. I'm just offering an alternative in case you find that it's noticeable.
Blindy
A: 

Regardless of your "not looking for a difftool", are you opposed to using Cygwin with its "diff" command for the shell? If you are open to this, it's quite easy, particularly using diff with the -r "recursive" option.

The following generates the differences between 2 Rails installs on my machine, and greps out not only information about differences between files but also, specifically by grepping for 'Only', finds files in one directory, but not the other:

$ diff -r pgnindex pgnonrails | egrep '^Only|diff'
Only in pgnindex/app/controllers: openings_controller.rb
Only in pgnindex/app/helpers: openings_helper.rb
Only in pgnindex/app/views: openings
diff -r pgnindex/config/environment.rb pgnonrails/config/environment.rb
diff -r pgnindex/config/initializers/session_store.rb pgnonrails/config/initializers/session_store.rb
diff -r pgnindex/log/development.log pgnonrails/log/development.log
Only in pgnindex/test/functional: openings_controller_test.rb
Only in pgnindex/test/unit: helpers
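If installing Cygwin is not an option, roughly the same report can be sketched in Python with the standard `filecmp.dircmp` class. The function name `report` and the "differ:" line format are my own choices, not diff's output. Note that `dircmp` treats two files as equal when their size and modification time match and only compares contents otherwise, and that it skips a few directory names (such as `CVS` and `.git`) by default.

```python
import filecmp
import os

def report(left, right):
    """Return lines in the spirit of `diff -r left right | egrep '^Only|diff'`."""
    lines = []

    def walk(cmp):
        for name in sorted(cmp.left_only):
            lines.append(f"Only in {cmp.left}: {name}")
        for name in sorted(cmp.right_only):
            lines.append(f"Only in {cmp.right}: {name}")
        for name in sorted(cmp.diff_files):
            lines.append(
                f"differ: {os.path.join(cmp.left, name)} "
                f"{os.path.join(cmp.right, name)}"
            )
        # Recurse into directories that exist on both sides.
        for name in sorted(cmp.subdirs):
            walk(cmp.subdirs[name])

    walk(filecmp.dircmp(left, right))
    return lines
```

Calling `print("\n".join(report("pgnindex", "pgnonrails")))` would list the same kinds of entries as the shell pipeline above, assuming both directories exist.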
George Jempty