views:

10844

answers:

6
+18  Q: 

Binary diff tool?

I need a utility to diff two binary files. The files are large (6-50 GB).

Beyond Compare is my favorite diff tool, and I own it, but it cannot handle binary files larger than what fits in the process's address space.

HexDiff 3.0 seemed interesting, except the trial version doesn't do diffs. *rolls eyes*

  • The tool should be free, since I'm not paying money to figure out that it doesn't work.

  • The tool should be a Windows application.

  • The tool should not be console based.

  • The tool should be graphical (aka a Windows application).

+4  A: 

You can try xdelta. I've never looked for a GUI version but you could try this one (although it appears to be KDE only).

jthompson
+2  A: 

Since the files are so huge and you probably have more than a few differences, the diff is going to be too big for a typical Windows application to handle. So my approach would be:

  • Convert the files to text. Use a command line hex dumper or, much more useful, write a small program which understands what the binary data means, so you can compare meaningful data instead of raw bits.

  • Use a command line diff tool (like the one from cygwin). The GNU command line tools can process arbitrarily large files.

  • Check the result with less. You might argue that you'll want to see all differences, but unless you're an alien in human form, your brain can't even hold the contents of a whole screenful of text in its working memory. So if you really want to achieve something, you must reduce the amount of data you have to eyeball.
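
As a rough illustration of "reducing the amount of data you have to eyeball", here is a minimal Python sketch that reads both files in fixed-size chunks and reports only the byte ranges that differ, so memory use stays constant regardless of file size. It is not taken from any of the tools mentioned in this thread; the function name `diff_ranges` and the `chunk_size` parameter are hypothetical, and it compares bytes at the same offsets only, so it will not cope with inserted or deleted data.

```python
def diff_ranges(path_a, path_b, chunk_size=1 << 20):
    """Yield (start, end) byte offsets, end exclusive, where the files differ.

    Hypothetical helper: compares bytes at identical offsets only.
    If one file is longer, its extra tail is reported as one differing range.
    """
    offset = 0      # file offset of the current chunk
    start = None    # start of the differing run currently open, if any
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break  # both files exhausted
            if a == b:
                # Fast path: identical chunk closes any open run.
                if start is not None:
                    yield (start, offset)
                    start = None
            else:
                n = min(len(a), len(b))
                for i in range(n):
                    if a[i] != b[i]:
                        if start is None:
                            start = offset + i
                    elif start is not None:
                        yield (start, offset + i)
                        start = None
                # One file ended inside this chunk: the rest counts as different.
                if len(a) != len(b) and start is None:
                    start = offset + n
            offset += max(len(a), len(b))
    if start is not None:
        yield (start, offset)
```

A caller would loop over `diff_ranges("old.bin", "new.bin")` (placeholder file names) and print or hex dump only the reported ranges. The byte-by-byte loop runs only for chunks that actually differ, so this is reasonably quick when differences are sparse and slow when the files differ almost everywhere.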

Aaron Digulla
"Convert the files to text" - That's a *minimum* of doubling the file size, and much *much* more for common or readable hex dump formats.
Draemon
And your point is? Either you have enough disk space or you need to use a pipe.
Aaron Digulla
+9  A: 

Google uses bsdiff, http://www.daemonology.net/bsdiff/

Eder Gusatto
"bsdiff is quite memory-hungry. It requires max(17*n,9*n+m)+O(1) bytes of memory, where n is the size of the old file and m is the size of the new file." In other words: it can't handle large files.
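
To put numbers on the quoted formula, here is a small Python sketch that evaluates max(17*n, 9*n+m) for the file sizes mentioned in the question. It assumes the old and new files are the same size and ignores the O(1) term; the function name `bsdiff_memory_bytes` is made up for this illustration, and "GB" is treated as GiB.

```python
GIB = 1024 ** 3

def bsdiff_memory_bytes(old_size, new_size):
    """Estimate from the quoted formula max(17*n, 9*n+m), ignoring the O(1) term."""
    return max(17 * old_size, 9 * old_size + new_size)

# Assumption: old and new files have the same size.
for size_gib in (6, 50):
    n = m = size_gib * GIB
    needed = bsdiff_memory_bytes(n, m) / GIB
    print(f"{size_gib} GiB files: about {needed:.0f} GiB of memory")

# Prints:
# 6 GiB files: about 102 GiB of memory
# 50 GiB files: about 850 GiB of memory
```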
Ian Boyd
+7  A: 

(bsdiff is massively elite :), other than that...)

I personally like vbindiff (SUA mode) for small files. I've also beta-tested a tool called blockwatch (Windows WPF; free client, paid network feed), which can do very fast sub-section matching over a large content search space. It should be released soon.

If you are diffing (native) executables, PatchDiff2 (the tool is free, IDA is $) is an IDA plugin that will get you over 90-95% accuracy, no problem, even with variations in optimization or other build settings.

BinNavi ($) is another tool which does quite well.

If you want to qualify the similarity of binaries, STAN (works in SUA mode) can cut through the proverbial B.S. quickly to get you a safe bet.

Just for completeness' sake: related to bsdiff is Google's new algorithm for their Chrome browser, Courgette. It seems to have improved on bsdiff by a fair amount, and it will be nice to see how well it can be adapted to other formats. It seems to heavily leverage an optimized symbol table lookup and what seems to be (I have not read the code) the kind of improvement you would get from using based pointers (i.e. not using linear addresses, but simply using the offset in as compact a notation as possible).

RandomNickName42
Has blockwatch materialized yet? Sounds interesting.
jmanning2k
+1  A: 

Your last three requirements make this a hard problem: there aren't many tools that do what you want. What would a graphical Windows program offer you that a text-based console program couldn't? So I'll ignore the last three, take my karma in my hands, and suggest rdiff. It's text- and console-based, but it can diff binary files of arbitrary size. You can get rdiff for Windows via Cygwin (http://cygwin.com).

Larry Clapp
Yes, the last 3 requirements really are the same thing. I had to break it down because some people might want to nit-pick.
Ian Boyd
+2  A: 

I've been using WinMerge quite happily to show differences in binary files. It's free and open-source too.

Otherwise, your files are very large and may not fit in a diff tool -- have you considered generating a binary patch (e.g. .ppf, PlayStation Patch File) and just having a look at that?

Richard Dingwall