Our QA team wants to focus their testing on the EXEs and DLLs that have actually changed between builds. We have a nice svn change report, but the relationship between source changes and changed binaries isn't always obvious. The builds we're comparing are always full clean builds, so we can't use file system timestamps. I'm looking for tools to compare Windows (and Windows CE) PE binaries that will ignore the embedded timestamps and other cruft. Any recommendations for tools or other ways to generate a reliable 'what binaries have really changed' report? Thanks.

Clarification: Thanks for the answers so far, but we can't generate the report with straightforward byte-for-byte compares or checksums. All the files appear different every time we build, even if the sources haven't changed, because of the timestamps that the compiler inserts. The problem is how to ignore the false positives. The disassemble & compare idea is closest to what we need, I think...

answered! Bindiff is just what I was looking for. Many thanks.

+1  A: 

You could perhaps disassemble the binary, and then do a diff on the assembly...

This sounds like your QA team is taking the wrong approach though... It shouldn't matter to them what the code looks like; just that it does what it's supposed to do.

Edit: Oh! After reading it again, I realized that I misinterpreted your question. I thought they wanted to test the methods that had changed...

In that case, why not get the MD5 hash and compare those? The tiniest change will cause a totally different hash to be generated.
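A minimal sketch of that hash comparison, in Python. The function names and the two-directory layout are assumptions made for illustration, not part of any existing tool. As the question's clarification points out, raw hashes will differ on every build while the compiler embeds timestamps, so this only gives useful results if the binaries have been normalized first or the build is otherwise reproducible.

```python
import hashlib
from pathlib import Path


def file_md5(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_binaries(old_dir, new_dir, suffixes=(".exe", ".dll")):
    """List binaries under new_dir that are missing from, or differ in, old_dir.

    Hypothetical helper: assumes both builds use the same relative layout.
    """
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    changed = []
    for new_file in sorted(new_dir.rglob("*")):
        if not new_file.is_file() or new_file.suffix.lower() not in suffixes:
            continue
        rel = new_file.relative_to(new_dir)
        old_file = old_dir / rel
        # A file counts as changed if it is new or its digest differs.
        if not old_file.is_file() or file_md5(old_file) != file_md5(new_file):
            changed.append(rel.as_posix())
    return changed
```

Calling changed_binaries("build_old", "build_new") (both directory names are placeholders) returns the relative paths that differ, which could be fed straight into a QA report.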

Ryan Fox
Well, yes, but I think the QA team only wants to spend time testing the binaries that have changed, hence the need for a diff tool.
nickf
A: 

When I was working on the "home grown" tool for installation verification at my company, we used Beyond Compare as a backend for comparison.

It has great file/folder comparison (binary as well) and scripting capabilities and can output XML reports.

moose-in-the-jungle
A: 

Project dependency graph generator and Dependency-Grapher for C++-Projects both use GraphViz to visualize dependencies. I imagine that you could use either of them as a basis for your needs with special highlighting of the branches in the dependency graph where source files or other leaves have changed.

MD5 hashes or checksums (as suggested above), a simple diff ignoring whitespace and filtering out comment changes, or changelist information from your version control system can signal which files have changed.

Nathan
+1  A: 

Not sure what kind of binaries you mean (DLLs? PE/WinCE executables only? Other?). Is it possible to embed version information in the binaries, e.g. using a source control tag that updates the version in the source code on commits? Then when the new build is created, the binary would have its version string updated as well. Instead of having to diff a binary file, you could use the version string and check that for changes.

Jay
We considered version numbers. Automatically updating the version numbers still requires mapping source changes to binaries (which is hard). Manually updating the version numbers is too error-prone. (i.e. I don't trust the devs to do it reliably. :-))
Pete Richardson
A: 

GNU binutils, specifically strings.

Nathan
+2  A: 

I ran into this problem before. My solution was to write a tool which set all the timestamps in an .EXE/.DLL to a known value. I would run it as a post-build step. Then binary diffs would work just fine.
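The original tool isn't shown here, so the following is only a rough Python sketch of the idea, not the actual implementation. It overwrites the TimeDateStamp field in the COFF file header, which the PE format places 8 bytes after the "PE\0\0" signature. The function name and the default value of 0 are assumptions. Real binaries can carry further build-specific data (for example in the debug directory, the export table, or the optional-header checksum) that this sketch does not touch, so some files may still differ after it runs.

```python
import struct


def normalize_pe_timestamp(data, value=0):
    """Return a copy of a PE image with the COFF TimeDateStamp set to `value`.

    Hypothetical helper: only the file-header timestamp is rewritten.
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ/PE file")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    if len(data) < pe_offset + 12:
        raise ValueError("truncated COFF header")
    out = bytearray(data)
    # After the signature: Machine (2 bytes), NumberOfSections (2 bytes),
    # then the 4-byte little-endian TimeDateStamp.
    struct.pack_into("<I", out, pe_offset + 8, value)
    return bytes(out)
```

As a post-build step this could read each .EXE/.DLL, pass the bytes through normalize_pe_timestamp, and write the result back (or to a scratch copy used only for comparison) before running the binary diff.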

Ferruccio
+1  A: 

Look at NDepend.

Conrad
+2  A: 

Have you had a look at Bindiff?

ayaz
Unfortunately that does not seem to work for managed files built in Visual Studio. Even if I compile a dll in VS2010, rename that file to .old.dll, compile again (without changing version info!) and then run "bindiff.exe /v file.dll file.old.dll", it says the files differ. Doing a hex compare confirms that. I wonder what the reason for that is?
Kobus Smit