views:

205

answers:

7

Is there a way to detect whether two EXEs (compiled from VS.Net 2008 for C++/MFC) have no code-level changes between them, i.e. for the purpose of knowing that no statements have changed?

This is for compliance purposes when my vendor ships me an exe, ostensibly with no changes made to the code since the last time we tested it.

Is there a tool to check that this is so?

Cheers

+3  A: 

You could use a disassembler to turn each executable back into assembly and compare the two listings with a normal text diff tool.

But even that will not be 100% accurate. The compilation process is not lossless and much information is lost or irreversibly transformed when C++ code is compiled.

In particular, different compiler settings can generate vastly different machine code from exactly the same source. Different compilers and even different versions or service-pack/hotfix levels of the same compiler can produce vastly different machine code from the same source files.

The other question is, why are they even sending you the exe back "ostensibly with no changes made"? If that's the case, why don't you just use the one you had originally?

Dean Harding
+1 for if they can't justify an update, keep using the tested/certified version
Ben Voigt
If the vendor of that EXE is Microsoft, you cannot use a disassembler according to the EULA.
Kirill V. Lyadvinsky
@Kirill: "for compliance purposes" suggests a legal obligation, which usually trumps contractual terms. A Microsoft EULA cannot force you to break the law. (IANAL so check with your local compliance expert)
MSalters
+1  A: 

You can always perform an MD5sum on the executables. This won't tell you whether they are logically equivalent or different, simply that a difference exists.

I'm not sure if this solves your issue, as you may be looking for a logical comparison tool.
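As a rough sketch of the checksum comparison in Python: the function names `file_digest` and `same_bytes` are made up for illustration, and the only library used is the standard `hashlib` module. It answers one question only, whether the two files are byte-for-byte identical.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_bytes(path_a, path_b, algorithm="md5"):
    """True only if both files have the same digest (i.e. identical bytes)."""
    return file_digest(path_a, algorithm) == file_digest(path_b, algorithm)
```

Passing `algorithm="sha256"` works the same way if MD5 is not acceptable for your compliance rules.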

Ryan
MD5 *always* generates different checksums, even when we recompile the same project. I guess the linker is adding some time-specific stuff into the exe's.
yumcious
+1  A: 

If you are in control of the source simply don't ship out exes that don't have proper versioning info associated with them.

If for some reason they build their own exes, I would suggest having a build step that they have to use that will embed the version control revision number into the versioning info.

If they don't use your build step (which you can detect) then you assume they are different.

Most revision control systems (such as SVN for example) will allow you to have a build step that will say whether the code is in a modified state or not. You could have this info embedded into a string in an embedded resource for the exe. You would then just extract that resource.

So it comes down to make sure all builds happen from your custom build script.
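A minimal sketch of the "is the working copy modified?" check, assuming SVN and its `svnversion` tool, which prints things like `4123`, `4123M` (local modifications) or `4123:4168` (mixed revisions). The function name `classify_revision` and the returned dictionary layout are hypothetical; the build script would run `svnversion`, pass its output here, and write the result into a string resource.

```python
import re

# Revision, optional ":high" for mixed working copies, then optional flags:
# M = locally modified, S = switched, P = partial (sparse) checkout.
_SVNVERSION = re.compile(r"^(\d+)(?::(\d+))?([MSP]*)$")


def classify_revision(svnversion_output):
    """Interpret the text printed by `svnversion` for a build stamp."""
    text = svnversion_output.strip()
    match = _SVNVERSION.match(text)
    if match is None:
        # e.g. "exported" or "Unversioned directory": not a working copy
        return {"revision": None, "mixed": False, "flags": "", "clean": False}
    low, high, flags = match.groups()
    mixed = high is not None
    return {
        "revision": int(high or low),
        "mixed": mixed,
        "flags": flags,
        # Only a single revision with no flags counts as a clean build.
        "clean": not mixed and flags == "",
    }
```

Anything that is not `clean` would be treated as "different" under the rule above.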

Brian R. Bondy
The scenario is as follows. The app is supposed to be language-independent, i.e. the strings are in resources (not counted as logic in our view). In the initial tests we are using English, and the vendor will deliver other language versions later, but embedded as string resources. We just need to make sure that there were no code changes when they deliver the new exe's to us. We are okay with changes to comments or renamed variables (not in the exe anyway), or to resources.
yumcious
@yumcious: Use something like gettext to mark your strings. Then create language files with it to deliver to your client. Your client will then send you back just the translated language files. That way you can develop independent of the language files and they only need to translate once for each unique string that appears.
Brian R. Bondy
@Brian, cool, thanks for the tip. I'll pass it along.
yumcious
+1  A: 

Automate your testing so that the tests can be rerun quickly.

Even though this is a small statement to make, it is a big undertaking.

benPearce
Sometimes the vendor shipped the incorrect resources (version, copyright, etc.) to us for testing. Then they *suddenly* remembered that they needed to correct these JUST AFTER we've completed the acceptance testing. We just need a mechanism to prove that this vendor did not slip any code unnoticed into production use.
yumcious
+2  A: 

For binary auditing, one of the hands-down best tools that you must have is the Interactive Disassembler, better known as IDA Pro. It is a must-have when you need to audit without access to the source code. Someone proficient in using IDA Pro will be able to tell you, with a reasonable amount of confidence, if there has been anything more than superficial changes to the source code. In this context, superficial changes would encompass things like variable renaming within the source files or changing the order of variable, function or class declarations and definitions. They will be able to tell you if the basic code blocks making up the executables have differences between them substantial enough to be flagged as suspicious, in the sense that there is a high probability that the differences are indicative of a source-level difference.

I say "reasonable" because there are several ways in which two executables generated from the exact same source tree could still have subtle, and on occasion not so subtle, differences between each other. Factors that can influence the generation of the executables include:

  • compiler optimization settings
  • differing versions of libraries the executables are linked with
  • changes to header files, external to the source tree used to build the executables, which were included by the C++ preprocessor before the compilation step
  • an executable that manipulates its own code at runtime, which may include decompressing or decrypting on the fly some part of itself into some area of memory it can jump to

And this list could go on for a while.

Is the sort of binary audit you are suggesting possible? Yes, a person with enough knowledge and skill could do this. Hackers do this all the time. And if the person doing the analysis is good enough, they will be able to tell you exactly how confident they are in their assessment as well.

Ultimately it becomes a question of feasibility. How much are you willing to spend on this audit? Hiring or contracting someone who can do this may cost more than what is budgeted for such auditing; is there enough money to do this? How complex is the software you are testing? What is the nature of your relationship with your vendor?

That last question is important because if it is in their best interest to pass this audit, and they realize this, they may be willing to assist you up to a certain degree. This could come in the form of debugging symbols, a list of compiler options which were used or some other artifacts of the build process they are willing to disclose. The preceding can all be very helpful in any analysis where source code is not being made available for the purposes of analysis due to whatever reason. And if access to the source code is available for such a purpose, things become an order of magnitude easier to analyze.

If this is something you would like to pursue on your own, two books I would recommend are The IDA Pro Book: The Unofficial Guide to the World's Most Popular Disassembler by Chris Eagle and The Shellcoder's Handbook: Discovering and Exploiting Security Holes by Chris Anley, John Heasman, Felix Lindner and Gerardo Richarte.

Lastly, the techniques and tools developed for analysis of the kind that would help you are still very much active areas of research. Your question either runs deeper than you may realize, or it may have been misunderstood by me. A thorough treatment of your question, even from just a practical standpoint and ignoring the theory that goes along with it, could, and does, fill many books.

I hope you find at least some part of this helpful. Good luck!

rixin
The relationship with the vendor is not one of complete trust, as this is a financial payment system. It needs to pass a suite of tests before deployment. We don't really want to redo all the tests if they "forgot" to add the appropriate version info resource, etc. If there was a change to the code, then we are within our rights to reject the new exe, or to penalise the vendor for delays.
yumcious
+1  A: 

From now on, add a post-build step that will generate an MD5 of the source files and add it to the VERSION resource (so that you can see it in the exe properties).
It will cost you 2 or 3 man days.
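A possible shape for that step, as a sketch only: the function name `source_tree_md5` and the default extension list are assumptions, and writing the result into the VERSION resource is left to the build script. Files are hashed in sorted relative-path order so the digest does not depend on directory listing order.

```python
import hashlib
import os


def source_tree_md5(root, extensions=(".cpp", ".h")):
    """MD5 over the names and contents of all matching files under root."""
    relative_paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extensions):
                full = os.path.join(dirpath, name)
                rel = os.path.relpath(full, root).replace(os.sep, "/")
                relative_paths.append(rel)

    digest = hashlib.md5()
    for rel in sorted(relative_paths):
        # Include the path so that renaming or moving a file changes the digest.
        digest.update(rel.encode("utf-8") + b"\0")
        with open(os.path.join(root, rel), "rb") as f:
            digest.update(f.read())
    return digest.hexdigest()
```

Leaving resource scripts out of `extensions` means a resource-only change keeps the digest the same, which may or may not be what you want. Note that any byte change counts, including comments and line endings.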

FenchKiss Dev
+1  A: 

Load up the exes in a hex compare program (BeyondCompare rocks!).

If there are any non-trivial changes (assuming compiler settings have not changed), they should be pretty easy to pick up. If it's just a matter of timestamps, etc. it may be pretty obvious.

This definitely isn't foolproof, but it would be my first step.
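To narrow down where the bytes differ, one option is to hash each PE section separately and leave the resource section out of the comparison. The sketch below reads the section table straight from the PE headers using only the standard library; `section_digests` and `changed_sections` are made-up names, and ignoring `.rsrc` by default is an assumption about which differences you consider acceptable. Whether the code section really stays byte-identical across a resource-only rebuild depends on the linker and its settings, so a reported difference is a reason to look closer, not proof of a source change.

```python
import hashlib
import struct


def section_digests(data):
    """Map each PE section name to the SHA-256 of its raw bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ/PE image")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (optional_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    table = pe_offset + 24 + optional_size  # section table follows the headers

    digests = {}
    for i in range(num_sections):
        entry = table + 40 * i  # each section header is 40 bytes
        name = data[entry:entry + 8].rstrip(b"\0").decode("ascii", "replace")
        raw_size, raw_offset = struct.unpack_from("<II", data, entry + 16)
        raw = data[raw_offset:raw_offset + raw_size]
        digests[name] = hashlib.sha256(raw).hexdigest()
    return digests


def changed_sections(data_a, data_b, ignore=(".rsrc",)):
    """Names of sections whose raw bytes differ, or that exist in only one image."""
    a, b = section_digests(data_a), section_digests(data_b)
    names = set(a) | set(b)
    return sorted(n for n in names if n not in ignore and a.get(n) != b.get(n))
```

Usage would be along the lines of `changed_sections(open("old.exe", "rb").read(), open("new.exe", "rb").read())`; an empty list means every compared section matched.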

Computer Guru