I've written several ints, char[]s, and the like to a data file with BinaryWriter in C#. Reading the file back in (in C#) with BinaryReader, I can recreate all of the pieces of the file perfectly.

However, attempting to read them back in with C++ yields some scary results. I used fstream, and the data did not read in correctly. In C++, I set up an fstream with ios::in|ios::binary|ios::ate and used seekg to target my location. I then read the next four bytes, which were written as the integer "16" (and which C# reads back correctly). In C++ this reads as 1244780 (not the memory address, I checked). Why would this be? Is there an equivalent to BinaryReader in C++? I noticed one mentioned on MSDN, but that's Visual C++, and the IntelliSense doesn't even look like C++ to me.

Example code for writing the file (C#):

    public static void OpenFile(string filename)
    {
        fs = new FileStream(filename, FileMode.Create);
        w = new BinaryWriter(fs);

    }

    public static void WriteHeader()
    {
        w.Write('A');
        w.Write('B');
    }

    public static byte[] RawSerialize(object structure)
    {
        Int32 size = Marshal.SizeOf(structure);
        IntPtr buffer = Marshal.AllocHGlobal(size);
        Marshal.StructureToPtr(structure, buffer, true);
        byte[] data = new byte[size];
        Marshal.Copy(buffer, data, 0, size);
        Marshal.FreeHGlobal(buffer);
        return data;
    }

    public static void WriteToFile(Structures.SomeData data)
    {
        byte[] buffer = Serializer.RawSerialize(data);
        w.Write(buffer);
    }

I'm not sure how I could show you the data file.

Example of reading the data back (C#):

        BinaryReader reader = new BinaryReader(new FileStream("C://chris.dat", FileMode.Open));
        char[] a = new char[2];
        a = reader.ReadChars(2);
        Int32 numberoffiles;
        numberoffiles = reader.ReadInt32();
        Console.Write("Reading: ");
        Console.WriteLine(a);
        Console.Write("NumberOfFiles: ");
        Console.WriteLine(numberoffiles);

This is what I want to do in C++. Initial attempt (fails at the first integer):

 fstream fin("C://datafile.dat", ios::in|ios::binary|ios::ate);
 char *memblock = 0;
 int size;
 size = 0;
 if (fin.is_open())
 {
  size = static_cast<int>(fin.tellg());
  memblock = new char[static_cast<int>(size+1)];
  memset(memblock, 0, static_cast<int>(size + 1));

  fin.seekg(0, ios::beg);
  fin.read(memblock, size);
  fin.close();
  if(!strncmp("AB", memblock, 2)){ 
   printf("test. This works."); 
  }
  fin.seekg(2); //read the stream starting from after the second byte.
  int i;
  fin >> i;

Edit: It seems that no matter what location I use "seekg" to, I receive the exact same value.

+4  A: 

Be aware that a char is 16 bits in C# rather than the 8 bits it usually is in C. This is because a char in C# is designed to hold Unicode text rather than raw data. Therefore, writing chars with BinaryWriter writes encoded Unicode text rather than raw bytes.

This may have led you to calculate the offset of the integer incorrectly. I recommend you take a look at the file in a hex editor, and if you cannot work out the issue, post the file and the code here.

EDIT1
Regarding your C++ code, do not use the >> operator to read from a binary stream. Use read() with the address of the int that you want to read to.

int i;
fin.read((char*)&i, sizeof(int));

EDIT2
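Reading from a closed stream is also going to fail: the read sets the stream's failbit and leaves i unmodified, so you see whatever garbage was in the uninitialized variable, no matter where you seek. You cannot call fin.close() and then still expect to be able to read from it.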
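
Putting both edits together, a minimal sketch might look like the following. The names Header and read_header are hypothetical, and the sketch assumes the layout from the question (two single-byte characters followed by a 4-byte integer) and a little-endian host with the same byte order as the machine that wrote the file.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper types, not part of any library.
// Assumed layout: 2 single-byte characters, then a 4-byte integer,
// read on a little-endian host.
struct Header {
    char tag[2];
    std::int32_t count;
};

bool read_header(const std::string& path, Header& out) {
    std::ifstream fin(path, std::ios::in | std::ios::binary);
    if (!fin.is_open()) return false;

    // Unformatted read of the two tag bytes.
    fin.read(out.tag, sizeof out.tag);

    // Unformatted read of the raw integer bytes; no text parsing involved.
    fin.read(reinterpret_cast<char*>(&out.count), sizeof out.count);

    // Report whether every read succeeded. The stream is closed by its
    // destructor only after all reads are done.
    return static_cast<bool>(fin);
}
```

The stream stays open until the function returns, and the return value tells the caller whether the bytes were actually read.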

Yacoby
A C/C++ char can handle Unicode in the form of a UTF-8 string.
anno
+1  A: 

This may or may not be related to the problem, but...

When you create the BinaryWriter, it defaults to writing chars in UTF-8. This means that some of them may be longer than one byte, throwing off your seeks.

You can avoid this by using the two-argument constructor to specify the encoding. An instance of System.Text.ASCIIEncoding would be the same as what C/C++ use by default.
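
To see why UTF-8 can shift offsets, here is a rough sketch of how many bytes a single code point occupies in UTF-8. The function utf8_length is a hypothetical helper written for illustration, not a library call.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: number of bytes UTF-8 uses for one Unicode code point.
// ASCII characters such as 'A' take exactly one byte; others take more.
std::size_t utf8_length(std::uint32_t code_point) {
    if (code_point < 0x80)    return 1;  // U+0000..U+007F (ASCII)
    if (code_point < 0x800)   return 2;  // U+0080..U+07FF
    if (code_point < 0x10000) return 3;  // U+0800..U+FFFF
    return 4;                            // U+10000..U+10FFFF
}
```

By this rule 'A' and 'B' each occupy one byte, so the header in the question should be two bytes long, while a character such as U+00E9 would take two bytes and push everything after it one byte further.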

R. Bemrose
A: 

If it's any help, I went through how the BinaryWriter writes data here.

It's been a while but I'll quote it and hope it's accurate:

  • Int16 is written as 2 bytes and padded.
  • Int32 is written as Little Endian and zero padded
  • Floats are more complicated: it takes the float value and reinterprets its in-memory bit pattern as a 4-byte integer, then writes those raw bytes
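
For illustration, a 4-byte little-endian integer can be decoded on the C++ side by combining the bytes by hand, which works regardless of the host's byte order. This is only a sketch, and decode_le32 is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical helper: assemble a 32-bit value from 4 bytes stored
// least-significant byte first (little-endian), independent of host order.
std::uint32_t decode_le32(const unsigned char* bytes) {
    return  static_cast<std::uint32_t>(bytes[0])
         | (static_cast<std::uint32_t>(bytes[1]) << 8)
         | (static_cast<std::uint32_t>(bytes[2]) << 16)
         | (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

The integer 16 from the question would appear in the file as the bytes 10 00 00 00, which this function turns back into 16.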
Chris S
The Int32 being little endian and zero padded: could this be causing some issues? Can you elaborate at all? (Sorry, I haven't checked the link yet; it might be elaborated on in there.)
Chris
Looks like it was C++ char related and nothing to do with integers, except for the offset
Chris S
+1  A: 

There are many things going wrong in your C++ snippet. You shouldn't mix binary reading with formatted reading:

  // The file is closed after this line. It is WRONG to read from a closed file.
  fin.close();

  if(!strncmp("AB", memblock, 2)){ 
   printf("test. This works."); 
  }

  fin.seekg(2); // You are moving the "get pointer" of a closed file
  int i;

  // Even if the file is opened, you should not mix formatted reading
  // with binary reading. ">>" is just an operator for reading formatted data.
  // In other words, it is for reading "text" and converting it to a 
  // variable of a specific data type.
  fin >> i;
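
Since the snippet in the question already copies the whole file into memblock, one option is to decode the integer straight from that buffer instead of going back to the stream. A minimal sketch, assuming the integer starts at byte offset 2 and the host is little-endian like the machine that wrote the file; int32_at is a hypothetical helper:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy 4 raw bytes out of an in-memory file image.
// memcpy avoids the alignment problems of casting (buffer + offset) to int*.
// Returns false if the requested range does not fit in the buffer.
bool int32_at(const char* buffer, std::size_t size, std::size_t offset,
              std::int32_t& out) {
    if (offset > size || size - offset < sizeof out) return false;
    std::memcpy(&out, buffer + offset, sizeof out);
    return true;
}
```

With the variables from the question, a call such as int32_at(memblock, size, 2, value) would fill value from the bytes that follow the two-character header.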
AraK
Much thanks. I haven't worked with this kind of stuff in a long time and need these pointed out :)
Chris