I have a chunk of fairly random binary data. I want to find where that chunk exists in a file, how many times it occurs, and at what byte (or sector) offsets. Any ideas on how to do that?

Thanks, Justin

+2  A: 

I would recommend X-Ways WinHex for that. I use it quite often to search for arbitrary data on hard disk drives or in large disk image files.

Jonas Gulle
+1, nice tool even if it is Win-centric ;-)
DCookie
+3  A: 

I don't believe any existing command does exactly what you want. If your chunk is small and your file fits in memory, it's easy to write your own: scan through the file contents, applying memcmp at each position (not strncmp, which stops comparing at the first zero byte, and binary data can contain zeros).

If your file is very large but still fits in your address space, you can do the same thing with mmap.
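A minimal sketch of that approach, assuming a POSIX system. The function name find_chunk is made up for illustration; it maps the file read-only, compares the chunk at every byte position, prints each matching offset, and returns the number of matches (overlapping ones included).

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: prints the byte offset of every occurrence of
 * needle[0..nlen) in the file at path. Returns the number of occurrences
 * (overlapping matches are counted), or -1 on error. */
long find_chunk(const char *path, const unsigned char *needle, size_t nlen)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    size_t hlen = (size_t)st.st_size;
    if (nlen == 0 || hlen < nlen) {
        /* Nothing can match; this also avoids a zero-length mmap. */
        close(fd);
        return 0;
    }

    unsigned char *hay = mmap(NULL, hlen, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (hay == MAP_FAILED)
        return -1;

    long count = 0;
    for (size_t i = 0; i + nlen <= hlen; i++) {
        /* memcmp compares all nlen bytes, including zero bytes. */
        if (memcmp(hay + i, needle, nlen) == 0) {
            printf("%zu\n", i);
            count++;
        }
    }

    munmap(hay, hlen);
    return count;
}
```

If you want sector offsets instead, divide each byte offset by the sector size of your medium (often 512 bytes, but that is an assumption to check for your device).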

If your chunk is not small, you'll probably be better off using the Boyer-Moore algorithm instead of a byte-by-byte comparison at every position. This is still not too much work, since there are already implementations out there that you can use.
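For illustration, here is a sketch of the Horspool simplification of Boyer-Moore (not the full algorithm, which adds a second "good suffix" table). The name horspool_search and its parameters are hypothetical. It works on any in-memory buffer, so it can be pointed at an mmap'ed file.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: Boyer-Moore-Horspool search over raw bytes.
 * Stores the first max_offsets match offsets in offsets[] (which may be
 * NULL if max_offsets is 0) and returns the total number of matches,
 * overlapping ones included. */
size_t horspool_search(const unsigned char *hay, size_t hlen,
                       const unsigned char *needle, size_t nlen,
                       size_t *offsets, size_t max_offsets)
{
    if (nlen == 0 || hlen < nlen)
        return 0;

    /* shift[c]: how far the window may move when the haystack byte
     * aligned with the last needle byte is c. */
    size_t shift[UCHAR_MAX + 1];
    for (size_t c = 0; c <= UCHAR_MAX; c++)
        shift[c] = nlen;
    for (size_t i = 0; i + 1 < nlen; i++)
        shift[needle[i]] = nlen - 1 - i;

    size_t count = 0;
    size_t pos = 0;
    while (pos + nlen <= hlen) {
        if (memcmp(hay + pos, needle, nlen) == 0) {
            if (count < max_offsets)
                offsets[count] = pos;
            count++;
        }
        /* The shift is always at least 1, so the loop terminates. */
        pos += shift[hay[pos + nlen - 1]];
    }
    return count;
}
```

The longer the chunk, the larger the typical shift, which is why this pays off mainly for chunks that are not small.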

Nathan Kitchen
+1, nifty algorithm
DCookie
This is what I ended up doing, with mmap and memcmp. It works, but I was thinking there really ought to be a command that does this already.
Justin
A: 

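You can do some of this with grep.

This prints each matching line prefixed with a byte offset. With GNU grep that is the offset of the start of the line; adding --only-matching makes it the offset of the match itself.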

grep --text --byte-offset 'ls' /bin/ls

Use the --count option to get a total instead. Note that it counts matching lines, not individual matches.

Paul Lindner
I did this as well, but the thing is I have a file that contains the chunk. I can't find a way to make grep search for the contents of one file in another file.
Justin