I have a binary file, which I have stored in a byte array. The file size can be 20 MB or more. I want to parse it, or find a particular value in it. I am currently doing this in two ways: 1. By converting the whole file to a char array. 2. By converting the whole file to a hex string (I also have hex values to search for).

What is the best way to parse the whole file? Or should I do it in binary form? I am using VS 2005.

A: 

From the aspect of memory consumption, it would be best if you could parse it directly, on the fly.

Converting it to a char array in C# effectively doubles its size in memory (presuming you convert each byte to one char), while a hex string takes at least 4 times the size, because each byte becomes two hex digits and C# chars are 16-bit Unicode characters.

On the other hand, if you need to search and parse the same data repeatedly, you may benefit from keeping it stored in whichever form suits your needs better.
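To make the arithmetic concrete, here is a rough sketch of the estimate. The class and method names are made up for illustration, and the numbers cover only the character data, not the object overhead of the array or string.

```csharp
using System;

static class MemoryEstimate
{
    // Payload of a char[] holding one char per input byte.
    // sizeof(char) is 2 in C#, so this doubles the size.
    public static long CharArrayBytes(long fileBytes)
    {
        return fileBytes * sizeof(char);
    }

    // Payload of a hex string: two hex digits per input byte,
    // and each digit is itself a 2-byte char.
    public static long HexStringBytes(long fileBytes)
    {
        return fileBytes * 2 * sizeof(char);
    }
}
```

For a 20 MB file this gives roughly 40 MB for the char array and 80 MB for the hex string, on top of the original 20 MB byte array if it is still referenced.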

Groo
A: 

What's stopping you from searching in the byte[]? IMHO, if you're simply searching for a byte of a specified value, or several contiguous bytes, this is the easiest and most efficient way to do it.
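A minimal sketch of such a search, assuming the whole file is already in a byte[]. The class and method names are hypothetical, and this is the plain brute-force scan, not an optimized algorithm.

```csharp
using System;

static class ByteSearch
{
    // Returns the index of the first occurrence of pattern in data, or -1 if absent.
    public static int IndexOf(byte[] data, byte[] pattern)
    {
        if (data == null) throw new ArgumentNullException("data");
        if (pattern == null) throw new ArgumentNullException("pattern");
        if (pattern.Length == 0) return 0;

        // No match can start after this index, because the pattern would not fit.
        int last = data.Length - pattern.Length;
        for (int i = 0; i <= last; i++)
        {
            int j = 0;
            while (j < pattern.Length && data[i + j] == pattern[j]) j++;
            if (j == pattern.Length) return i;
        }
        return -1;
    }
}
```

For a single byte value, the built-in Array.IndexOf(data, value) does the same job without any extra code.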

deerchao
I don't know the byte value of the search string.
Royson
Convert the string into a byte[] with Encoding.GetBytes(). Use the shift operators (<< or >>) to convert integers into bytes.
deerchao
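A sketch of both conversions from the comment above. The helper names are made up, and two things are assumptions: the encoding you pass must match the one used inside the file, and the integer is laid out least significant byte first (little-endian), which may or may not be what your file format uses.

```csharp
using System;
using System.Text;

static class PatternBytes
{
    // Encodes the search text into the bytes it would occupy in the file.
    public static byte[] FromString(string text, Encoding encoding)
    {
        return encoding.GetBytes(text);
    }

    // Splits a 32-bit integer into 4 bytes using shifts, lowest byte first.
    public static byte[] FromInt32LittleEndian(int value)
    {
        byte[] b = new byte[4];
        // unchecked: the casts deliberately keep only the low 8 bits.
        unchecked
        {
            b[0] = (byte)value;
            b[1] = (byte)(value >> 8);
            b[2] = (byte)(value >> 16);
            b[3] = (byte)(value >> 24);
        }
        return b;
    }
}
```

For example, PatternBytes.FromString("AB", Encoding.ASCII) yields the bytes 0x41 0x42, and PatternBytes.FromInt32LittleEndian(0x12345678) yields 0x78 0x56 0x34 0x12. Either result can then be used as the byte pattern to search for.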
A: 

If I understood your question correctly, you need to find strings, which can contain any characters, in a large binary file. Does the binary file contain text? If so, do you know the encoding? If you do, you can use the StreamReader class like so:

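    // The verbatim string (@"...") keeps "\t" from being read as a tab escape.
    using (StreamReader sr = new StreamReader(@"C:\test.dat", System.Text.Encoding.UTF8))
    {
        // Reads one line of decoded text; returns null at end of file.
        string s = sr.ReadLine();
    }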

In any case, I think it is much more efficient to use some kind of stream access to the file instead of loading it all into memory. You could load it into memory in chunks and then use a pattern-matching algorithm (such as Knuth-Morris-Pratt or Rabin-Karp), taking care not to miss matches that span a chunk boundary.
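As a rough sketch of the chunked approach, here is a Knuth-Morris-Pratt search over a Stream. The class name, method names and the chunkSize parameter are hypothetical. Because the match state is carried from one chunk to the next, a match that straddles a chunk boundary is still found without re-reading any bytes.

```csharp
using System;
using System.IO;

static class StreamSearch
{
    // Builds the KMP failure table: fail[i] is the length of the longest proper
    // prefix of pattern[0..i] that is also a suffix of it.
    static int[] BuildFailureTable(byte[] pattern)
    {
        int[] fail = new int[pattern.Length];
        int k = 0;
        for (int i = 1; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
            if (pattern[i] == pattern[k]) k++;
            fail[i] = k;
        }
        return fail;
    }

    // Returns the offset of the first match, counted from the position at which
    // reading started, or -1 if the pattern does not occur.
    public static long IndexOf(Stream stream, byte[] pattern, int chunkSize)
    {
        if (stream == null) throw new ArgumentNullException("stream");
        if (pattern == null) throw new ArgumentNullException("pattern");
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        if (pattern.Length == 0) return 0;

        int[] fail = BuildFailureTable(pattern);
        byte[] buffer = new byte[chunkSize];
        long position = 0; // offset of buffer[0] within the data read so far
        int matched = 0;   // pattern bytes matched so far; survives across chunks
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                // On a mismatch, fall back to the longest prefix that still fits.
                while (matched > 0 && buffer[i] != pattern[matched]) matched = fail[matched - 1];
                if (buffer[i] == pattern[matched]) matched++;
                if (matched == pattern.Length)
                    return position + i - pattern.Length + 1;
            }
            position += read;
        }
        return -1;
    }
}
```

It could be called with a FileStream opened on the file and a chunk size of, say, 64 KB, so that only one chunk is held in memory at a time regardless of the file size.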

RA