While I was looking over some questions about MEF, I stumbled onto a particular answer to one of them. It made me wonder, since I've never had to attempt this myself but can see it being a very valid concern in that question's scenario.

Scenario: If you have a directory of various .NET assemblies, all named differently, how would you identify ones that may be the same assembly under a different name (e.g. Copy Of MyAssembly.dll vs. MyAssembly.dll)?

I can think of the following items:

  1. Check File Size (should be the same)

  2. Check Assembly Version Number

  3. Loop through the assembly using Reflection and attempt to locate any differences.

Is there any other/easier way of addressing this issue? Are there other criteria to look at for determining if 2 differently named DLLs are in fact the same compiled Assembly?

+1  A: 

I'd start with a simple, quick check using points 1 and 2, that is, comparing file size and assembly version number. If they're all different, you're done.

If not, keep the files that have the same file size and version, and compute their MD5/SHA1/whatever-you-prefer hash. If the hashes are the same, you're definitely looking at the same assembly twice. Since assemblies generally aren't very large (at most a few megabytes), computing the hashes should be fast enough.
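
The size-then-hash filtering described above can be sketched roughly as follows. This is only an illustration in Python, not .NET code; the function name `find_duplicate_candidates` is made up for the example, SHA-256 stands in for "whatever-you-prefer", and the assembly version check is left out because it needs a metadata reader.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicate_candidates(directory):
    """Group the .dll files in `directory` by size, then by SHA-256 digest.

    Hypothetical helper for illustration; returns a list of groups, each a
    sorted list of paths whose contents produced the same digest.
    """
    by_size = defaultdict(list)
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and name.lower().endswith(".dll"):
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            # Assemblies are assumed small enough to read in one go.
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            by_hash[digest].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Only files that share a size are ever hashed, so a directory of mostly distinct assemblies costs little more than a directory listing.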

Julien Lebosquain
No, two assemblies can end up with the same hash (pigeonhole principle). To be completely certain, you have to compare them byte by byte.
erikkallen
+1  A: 
Abel
Depends on how you load them: http://blogs.msdn.com/suzcook/archive/2003/05/29/57143.aspx
erikkallen
@erik: Correct, I was coming to the same conclusion while testing. I'm editing the answer.
Abel
+1  A: 
Gonzalo
I was really looking for a .NET/programmatic answer, but I didn't specify that, so that's my fault. It's definitely good to know, though, since I wasn't aware of dupfinder. Thanks.
JamesEggers
Then I would first compare sizes, then use something like Mono.Cecil to inspect the assembly name/version/signature (if signed) without actually loading the assembly. Then, if the assembly is not signed, just compare byte by byte...
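
For the final byte-by-byte step, a minimal sketch might look like the one below. It is written in Python purely as an illustration (the same loop ports directly to .NET streams); `files_identical` and `chunk_size` are names invented for the example, and the Mono.Cecil metadata inspection is not shown.

```python
import os


def files_identical(path_a, path_b, chunk_size=64 * 1024):
    """Return True only if both files contain exactly the same bytes."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False  # different sizes can never match
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                return False  # first mismatch ends the comparison early
            if not chunk_a:
                return True  # both files ended without a mismatch
```

Unlike a hash comparison, this cannot report a false match, and it stops reading as soon as the files diverge.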
Gonzalo