views:

580

answers:

3

Hi,

I have the following code:

using (BinaryReader br = new BinaryReader(
       File.Open(FILE_PATH, FileMode.Open, FileAccess.ReadWrite)))
{
    int pos = 0;
    int length = (int) br.BaseStream.Length;
    byte[] b = new byte[length];   // buffer holding every byte of the file

    while (pos < length)
    {
        b[pos] = br.ReadByte();
        pos++;
    }

    pos = 0;
    while (pos < length)
    {
        Console.WriteLine(Convert.ToString(b[pos]));
        pos++;
    }
}

FILE_PATH is a const string that contains the path to the binary file being read. The binary file is a mixture of integers and characters. The integers are 1 byte each, and each character is written to the file as 2 bytes.

For example, the file has the following data :

1HELLO HOW ARE YOU45YOU ARE LOOKING GREAT //and so on

Please note: Each integer is associated with the string of characters following it. So 1 is associated with "HELLO HOW ARE YOU" and 45 with "YOU ARE LOOKING GREAT" and so on.

Now the binary is written (I do not know why but I have to live with this) such that '1' will take only 1 byte while 'H' (and other characters) take 2 bytes each.

So here is what the file actually contains:

0100480045... and so on.

Here's the breakdown:

01 is the first byte, for the integer 1
0048 are the 2 bytes for 'H' (H is 48 in hex)
0045 are the 2 bytes for 'E' (E = 0x45)
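
To make that layout concrete, here is a minimal sketch that builds the same byte sequence in memory. The names `SampleLayout`, `Build` and `Dump` are made up for illustration, and it assumes the high byte of each character comes first, as in the dump above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class SampleLayout
{
    // Hypothetical helper: one id byte followed by the text as 2-byte
    // characters, high byte first (so 'H' becomes 00 48).
    public static byte[] Build(byte id, string text)
    {
        var bytes = new List<byte> { id };
        bytes.AddRange(Encoding.BigEndianUnicode.GetBytes(text));
        return bytes.ToArray();
    }

    // Formats the bytes as hyphen-separated hex pairs.
    public static string Dump(byte[] data) => BitConverter.ToString(data);
}
```

For example, `SampleLayout.Dump(SampleLayout.Build(1, "HE"))` gives `01-00-48-00-45`, which matches the start of the file.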

and so on. I want my console to print this file in a human-readable format. That is, I want it to print "1 HELLO HOW ARE YOU", then "45 YOU ARE LOOKING GREAT", and so on.

Is what I am doing correct? Is there an easier or more efficient way? My line Console.WriteLine(Convert.ToString(b[pos])); only prints the integer value, not the actual character I want. That is fine for the integers in the file, but how do I read out the characters?

Any help would be much appreciated. Thanks

+8  A: 

I think what you are looking for is Encoding.GetString.

Since your string data is made up of 2-byte characters, you can get your string out like this:

for (int i = 0; i < b.Length; i++)
{
  byte curByte = b[i];

  // Assuming that the first byte of a 2-byte character sequence will be 0
  if (curByte != 0)
  { 
    // This is a 1 byte number
    Console.WriteLine(Convert.ToString(curByte));
  }
  else
  { 
    // This is a 2 byte character stored high byte first (00 48 for 'H'),
    // so decode it as big-endian UTF-16. Print it out.
    Console.WriteLine(Encoding.BigEndianUnicode.GetString(b, i, 2));

    // We consumed the next character as well, no need to deal with it
    //  in the next round of the loop.
    i++;
  }
}
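
If you want each number on the same line as its text ("1 HELLO HOW ARE YOU"), the same idea can be extended to collect whole records. This is only a sketch under the assumptions already stated: every integer is a single non-zero byte, and every character is two bytes with a zero byte first. `RecordParser` and `Parse` are names I made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class RecordParser
{
    // Hypothetical helper. Assumes each id is one non-zero byte and each
    // character is two bytes whose first (high) byte is 0x00.
    public static List<string> Parse(byte[] data)
    {
        var lines = new List<string>();
        int i = 0;
        while (i < data.Length)
        {
            int id = data[i++];          // the 1-byte integer
            int start = i;

            // Consume 2-byte characters while the next pair starts with 0x00.
            while (i + 1 < data.Length && data[i] == 0)
                i += 2;

            // The file stores the high byte first, so decode as big-endian UTF-16.
            string text = Encoding.BigEndianUnicode.GetString(data, start, i - start);
            lines.Add(id + " " + text);
        }
        return lines;
    }
}
```

You could then call it with something like `foreach (string line in RecordParser.Parse(File.ReadAllBytes(FILE_PATH))) Console.WriteLine(line);` (with `using System.IO;`). It will misparse any character above 0xFF, because its first byte is no longer zero.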
paracycle
You'll need to read the first "id" byte separately, then translate the rest of the bytes using the proper encoding.
tvanfosson
Oh, I missed that bit of the question. I will edit my answer.
paracycle
How does the code determine where the first string ends? Without that information, you won't know when to search for the next number.
Alfred Myers
It doesn't, it reads the array byte by byte until it hits a 0 byte which it assumes to be the start of a 2 byte character sequence. After that it consumes the next 2 bytes and checks the next byte to see if it is also the first byte of a 2 byte character sequence, if not it assumes it is an integer and so on.
paracycle
Oh yeah... Now I see... When reading the code I skipped the (b, i, 2) part. That'll work as long as he doesn't have any characters above 0xFF, which is reasonable to infer given the example. +1 for you.
Alfred Myers
Thanks; and, yes there is a little assumption going on given the info we have been supplied but I tried to make the assumptions as obvious as I can in the answer.
paracycle
@Paracycle: Thanks. Your code seems to work fine with a few modifications here and there, but no change in the logic. Thanks a lot, man.
VP
A: 

You can use String System.Text.UnicodeEncoding.GetString() which takes a byte[] array and produces a string. I found this link very useful
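
One detail worth checking is byte order: `Encoding.Unicode` is little-endian UTF-16, while the dump in the question puts the zero byte first, which matches `Encoding.BigEndianUnicode`. Here is a small sketch of the difference (the class name `ByteOrderDemo` is made up for illustration):

```csharp
using System;
using System.Text;

static class ByteOrderDemo
{
    // The two bytes stored for 'H' in the question's file: 00 48.
    static readonly byte[] H = { 0x00, 0x48 };

    // Big-endian UTF-16 reads 00 48 as U+0048, i.e. "H".
    public static string AsBigEndian() => Encoding.BigEndianUnicode.GetString(H);

    // Little-endian UTF-16 reads the same bytes as U+4800, a different character.
    public static string AsLittleEndian() => Encoding.Unicode.GetString(H);
}
```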

Jacob Seleznev
You really ought to add a summary so your answer can stand on its own. It's not my downvote, but I can certainly understand why someone thought it wasn't helpful.
tvanfosson
A: 
using (BinaryReader br = new BinaryReader(File.Open(FILE_PATH, FileMode.Open, FileAccess.ReadWrite)))
{    
   int length = (int)br.BaseStream.Length;    

   byte[] buffer = new byte[length * 2];
   int bufferPosition = 0;
   int pos = 0;   // number of input bytes consumed so far

   while (pos < length)    
   {        
       byte b = br.ReadByte();        
       if(b < 10)
       {
          buffer[bufferPosition] = 0;
          buffer[bufferPosition + 1] = b + 0x30;
          pos++;
       }
       else
       {
          buffer[bufferPosition] = b;
          buffer[bufferPosition + 1] = br.ReadByte();
          pos += 2;
       }
       bufferPosition += 2;       
   }    

   // The buffer holds each character high byte first, so decode as big-endian UTF-16.
   Console.WriteLine(System.Text.Encoding.BigEndianUnicode.GetString(buffer, 0, bufferPosition));

}

LorenVS
I am getting the following compiler error when I try using your code, at the line buffer[bufferPosition + 1] = b + 0x30; : error CS0266: Cannot implicitly convert type 'int' to 'byte'. An explicit conversion exists (are you missing a cast?)
VP
Also, I checked the value of the length variable. It includes the count of the zeros, so I don't think there's a need to multiply it by 2 initially as you have done.
VP
Sorry, I forgot the cast. The sum is an int, so the whole expression needs casting; that line should be buffer[bufferPosition + 1] = (byte)(b + 0x30); You do, however, need to multiply the buffer length by 2, as the overall size of the array could double if the entire input is integers.
LorenVS