I want to find duplicates of a file by comparing hashes. For performance reasons, I want to know whether NTFS or FAT stores a hash or checksum for each file. If so, I wouldn't have to compute one for every file in order to search for mine.

If there is one, how can I access it from .NET?

If it helps, it will be JPEG files. Do they have a checksum?

+5  A: 

There is no such thing.

Andrew Medico
Windows allows random writes to a file. Could you imagine the overhead if each write required recomputing the file's checksum?
Mark Ransom
I imagine that at least EXE files have a checksum, as other types may have.
Jader Dias
Andrew is correct.
Foredecker
+3  A: 

Windows does not store a hash for each file. As Jader Dias suggests, there are checksums for EXEs and DLLs, but these are not the droids you are looking for.

Note that even if you had such a hash, it still does not guarantee uniqueness. If you found two files with the same hash (and size) you would still have to then compare contents to determine if the files were truly the same.
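To illustrate the point about hashes and uniqueness, here is a minimal sketch of the usual filter chain: compare sizes first (cheap), then hashes, then the actual bytes. It is written in Python for brevity rather than .NET, and the names `file_digest` and `find_duplicates` are made up for this example; the choice of SHA-256 is also just an assumption, since any reasonable hash would do.

```python
import filecmp
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(target, root):
    """Return sorted paths under root whose contents equal those of target.

    Hypothetical helper: filters by size, then by hash, and finally
    confirms each candidate with a byte-for-byte comparison.
    """
    target = os.path.abspath(target)
    target_size = os.path.getsize(target)
    target_hash = None  # computed only if some file has a matching size
    matches = []
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.abspath(os.path.join(dirpath, name))
            if path == target:
                continue
            try:
                # Step 1: different size means different content.
                if os.path.getsize(path) != target_size:
                    continue
                # Step 2: different hash means different content.
                if target_hash is None:
                    target_hash = file_digest(target)
                if file_digest(path) != target_hash:
                    continue
                # Step 3: equal hashes are not proof, so compare the bytes.
                if filecmp.cmp(target, path, shallow=False):
                    matches.append(path)
            except OSError:
                # Unreadable or vanished file: skip it.
                continue
    return sorted(matches)
```

For a one-off search the size check already eliminates most files, and the hash step only pays off if you keep the digests around for later searches.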

JPEG files may have some checksums or hashes, but you probably cannot count on them either.

Foredecker
+1 for "Note that even if you had such a hash, it still does not guarantee uniqueness." ... although it's true that very small changes *almost always* result in a unique hash, users have a way of producing those magical edge-case conditions.
overslacked
A: 

Windows does have search now, though, and if I recall correctly you can write your own plugins for it (in other words, index files in a custom way). Presumably you could write a plugin for JPGs and then simply make search API calls to find files (after Windows has done the indexing).

Vitali
I think Windows indexes text (as filenames), not images.
Jader Dias
From MSDN (http://msdn.microsoft.com/en-us/library/aa965362%28VS.85%29.aspx): "The content indexed is based on the file and data types supported through add-ins... filters included in Windows Search support over 200 common types of data including ... plain-text files, HTML, and many more." Sure, it only natively supports certain files, but as it says, you can index anything with a custom plugin. Certainly search can index MP3s, so JPGs would be no different.
Vitali