What is the best way to learn about twiddling with binary data?

Mainly what I'm referring to here is learning to read and write existing file formats that are in binary.

I saw one of my co-workers today trying to learn how to create a Flash file from scratch, and he was writing a whole bunch of ones and zeros, but I'm guessing that maybe there's a better way to visualize the contents of a file. (This question isn't really about Flash movies, but more about how to manipulate and interpret binary files in general.)

If I recall correctly, there's also a thing that they do around here that involves setting and unsetting (clearing) values using the & and | operators.

The cliche answer that I would think of is to work with octal or hex. If that answer is right, where would be a good guide to show you how to get started?

+1  A: 

When studying a new binary file format, the first thing I do is to do a hexdump of the file. In Unix, the hexdump -C command tends to work for me. Your mileage may vary, especially on other platforms.
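If hexdump isn't available, the same kind of view is easy to produce yourself. The following is a minimal Python sketch that imitates the general layout of hexdump -C (offset, hex bytes, printable ASCII); the function name and the sample bytes are made up for illustration, and the output is not guaranteed to match the real tool character for character.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as: offset, hex columns, printable ASCII."""
    pad = width * 3 - 1  # width of a full row of "xx xx xx ..." text
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is, everything else as a dot
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  |{text_part}|")
    return "\n".join(lines)


if __name__ == "__main__":
    # Arbitrary sample bytes; in practice you would pass open(path, "rb").read()
    print(hexdump(b"ABC\x00\x01\xff hello"))
```

Seeing the hex and the ASCII side by side is usually what makes magic numbers and embedded strings jump out.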

Chris Jester-Young
Does that work in Linux? or Cygwin?
leeand00
It works on both. :-)
Chris Jester-Young
+3  A: 

How to crack a Binary File Format:
http://www.iwriteiam.nl/Ha_HTCABFF.html

Bitwise operations (&, |, etc):
http://en.wikipedia.org/wiki/Bitwise_operation
http://www.gamedev.net/reference/articles/article1563.asp
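For the setting and clearing of values with & and | mentioned in the question, here is a small Python sketch of the usual idioms. The flag names and bit positions are hypothetical and do not come from any real file format.

```python
# Hypothetical flags, one bit each (not taken from any real format)
FLAG_COMPRESSED = 0b0001  # bit 0
FLAG_ENCRYPTED = 0b0010   # bit 1
FLAG_READ_ONLY = 0b0100   # bit 2


def set_flag(value: int, flag: int) -> int:
    """OR turns the flag's bit on and leaves the others alone."""
    return value | flag


def clear_flag(value: int, flag: int) -> int:
    """AND with the inverted flag turns the bit off."""
    return value & ~flag


def has_flag(value: int, flag: int) -> bool:
    """AND isolates the bit so it can be tested."""
    return (value & flag) != 0


def extract_bits(value: int, shift: int, width: int) -> int:
    """Pull out a multi-bit field: shift it down, then mask it."""
    return (value >> shift) & ((1 << width) - 1)


if __name__ == "__main__":
    flags = set_flag(0, FLAG_COMPRESSED)
    flags = set_flag(flags, FLAG_READ_ONLY)
    print(f"{flags:04b}", has_flag(flags, FLAG_ENCRYPTED))
    print(f"{extract_bits(0xA5, 4, 4):x}")  # high nibble of 0xA5
```

The same operators exist with the same meaning in C, C#, Java and most other languages, so the idioms carry over directly.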

gkrogers
I noticed that my boss was working on some stuff like this. He cracked the flash format apparently, because he is now writing his own flash files; which I thought was pretty cool! Also here's a link I found http://nickciske.com/tools/binary.php
leeand00
+2  A: 

You may also be interested in this excellent article - http://graphics.stanford.edu/~seander/bithacks.html

Chetan Sastry
You may want to link the article...
Matt Jordan
+3  A: 

As others have pointed out, your question is a bit odd. I'm not entirely sure what you're asking - as gkrogers pointed out, there's plenty of reference material on managing bitwise operations, looking at a binary format, etc. I'm interpreting your question to mean how do you conceptually view a binary file. I routinely deal with large amounts (gigabyte files) of data in a binary format. The following are a few approaches I've used for viewing and manipulating the data.

Build a conceptual model

Most data when stored in some format has some logical order to it. My data, for example, is organized into major frames, each of which contains a minor frame of data. Each minor frame has a fixed width, where certain fields in the minor frame have a specific purpose. Drawing this concept on a whiteboard makes it much easier to deal with once you start writing code and / or viewing the data in its raw format.

This is obviously impossible if you don't know what your model is. If you don't know that, then either get that information or work towards it. Manipulating bytes without knowing what they are isn't going to get you anywhere (is that a timestamp I've just read? A piece of video data? Maybe a checksum? Who knows?)
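Once the model is on the whiteboard, it can be written down in code almost verbatim. The Python sketch below assumes an entirely hypothetical minor frame: 128 bytes, big-endian, starting with a 2-byte sync word, a 4-byte timestamp and a 2-byte frame counter, followed by payload. None of these fields come from a real format; substitute whatever your own model says.

```python
import struct

# Hypothetical header layout: big-endian, 2-byte sync word,
# 4-byte timestamp, 2-byte frame counter (8 bytes in total)
HEADER = struct.Struct(">HIH")
FRAME_SIZE = 128  # assumed fixed width of one minor frame


def parse_minor_frame(frame: bytes) -> dict:
    """Split one fixed-width frame into named fields."""
    if len(frame) != FRAME_SIZE:
        raise ValueError(f"expected {FRAME_SIZE} bytes, got {len(frame)}")
    sync, timestamp, counter = HEADER.unpack_from(frame, 0)
    return {
        "sync": sync,
        "timestamp": timestamp,
        "counter": counter,
        "payload": frame[HEADER.size:],  # everything after the header
    }


if __name__ == "__main__":
    # Build a fake frame so the example runs without a data file
    fake = HEADER.pack(0xEB90, 1000, 7) + bytes(FRAME_SIZE - HEADER.size)
    print(parse_minor_frame(fake))
```

Naming the fields this way means the rest of the code talks about timestamps and counters rather than byte offsets.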

Get a Binary Editor that works for you

For small binary files, you can view the raw bytes directly in Visual Studio. For larger files, you may want to use an editor that can load the data directly and organize the data in such a manner that the view represents your model. In my case, I didn't really find a good tool for this - so I wrote one. An hour or so in any language and you should be able to write a simple application that pulls some data from a binary file and formats it so that you can see the raw bytes in a manner that makes sense with your model. In my case, it scans through the binary file looking for a major frame, and displays the minor frames in that major frame, organized such that the bytes shown are the length of the minor frame. Very handy when lining up the data.
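To give an idea of how little code such a viewer needs, here is a rough Python sketch of the same idea: find the first frame marker, then print one row per frame so the fields line up in columns. The sync marker bytes and the frame size are placeholders I picked for the example, not values from my actual data.

```python
def frame_rows(data: bytes, sync: bytes, frame_size: int):
    """Yield (offset, frame_bytes) for each complete frame,
    starting at the first occurrence of the sync marker."""
    start = data.find(sync)
    if start == -1:
        return  # no marker found, nothing to show
    for offset in range(start, len(data) - frame_size + 1, frame_size):
        yield offset, data[offset:offset + frame_size]


def format_rows(data: bytes, sync: bytes, frame_size: int) -> str:
    """One line per frame: file offset, then the frame's bytes in hex."""
    return "\n".join(
        f"{offset:08x}  " + " ".join(f"{b:02x}" for b in frame)
        for offset, frame in frame_rows(data, sync, frame_size)
    )


if __name__ == "__main__":
    # Two junk bytes, two 4-byte frames, then a trailing partial frame
    sample = b"\x00\x00" + b"\xeb\x90\x01\x02" + b"\xeb\x90\x03\x04" + b"\xeb"
    print(format_rows(sample, sync=b"\xeb\x90", frame_size=4))
```

This sketch assumes every frame after the first marker is the same width and ignores a trailing partial frame; a real tool would probably re-check the marker on each row to detect lost sync.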

If you don't want to write your own - I would, if only because it's a useful exercise to start with - here are a few binary editors:

XVI32

HexEdit

010Editor

As I said, I wrote my own as it helped me work with my conceptual model, checked that I understood it correctly, and was custom tailored to my problem. Using a free one isn't a bad way to go however.

Work with the Data

Once you understand your data model, working with the data is fairly trivial. Say I wanted to read a binary file in 128 byte chunks, since a logical frame (maybe a minor frame, in my example) is 128 bytes long. Here's some trivial C# code that reads some binary data (not that I would rely on this - it's merely an example):

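    // Requires: using System.IO;
    private void ReadSomeData()
    {
        // Read data in 128 byte chunks
        const int frameSize = 128;

        // 'using' closes the file even if HandleData throws
        using (BinaryReader binReader = new BinaryReader(File.Open("C:\\Test.bin", FileMode.Open)))
        {
            // ReadBytes returns fewer bytes than requested near the end of
            // the file, and an empty array once nothing is left. PeekChar()
            // is avoided because it decodes text and can throw on raw bytes.
            byte[] receivedData;
            while ((receivedData = binReader.ReadBytes(frameSize)).Length > 0)
            {
                HandleData(receivedData); // do something with it, someplace
            }
        }

        // End of File reached
    }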
Matt Jordan
Is a frame in a binary file similar to a frame within a network packet?
leeand00
You're comparing apples... to an unknown. A binary file is just that - a file with binary values. You have to have some knowledge of what the content is to do anything meaningful with it. You could always read it - but what is 0xA0 0xFF 0xE0 0x01? You have to have some context to understand it.
Matt Jordan