What is an efficient, proper way of reading in a data file with mixed content? For example, I have a data file that contains a mixture of data loaded from other files, 32-bit integers, characters and strings. Currently, I am using an fstream object, but it seems to stop once it hits an int32 or the end of a string. If I add random data onto the end of the string in the data file, it seems to follow through with the rest of the file. This leads me to believe that the null terminator added to strings is messing it up. Here's an example of loading the file:

#include <cstdio>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    // Open at the end (ios::ate) so that tellg() reports the file size.
    fstream fin("C://mark.dat", ios::in|ios::binary|ios::ate);
    char *mymemory = 0;
    int size = 0;
    if (fin.is_open())
    {
        size = static_cast<int>(fin.tellg());
        mymemory = new char[size + 1];
        memset(mymemory, 0, size + 1);

        fin.seekg(0, ios::beg);
        fin.read(mymemory, size);
        fin.close();
        printf(mymemory);
        std::string hithere;
        hithere = cin.get();
        delete[] mymemory;
    }
    return 0;
}

Why might this code stop after reading in an integer or a string? How might one get around this? Is this the wrong approach when dealing with these types of files? Should I be using fstream at all?

+3  A: 

Have you ever considered that the file reading is working perfectly and it is printf(mymemory) that is stopping at the first null?

Have a look with the debugger and see if I am right.

Also, if you want to print someone else's buffer, use puts(mymemory) or printf("%s", mymemory). Don't pass someone else's input as the format string: any % sequence in the data would be treated as a conversion specifier, which could crash your program.
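To make the point about the first null concrete, here is a minimal sketch. The helper names visible_length and write_all are made up for illustration and are not part of the original code. It assumes the buffer has at least one zero byte in or after the data, which holds for the zero-filled size + 1 allocation in the question.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: the number of bytes a "%s"-style print would emit,
// i.e. everything before the first zero byte in the buffer.
std::size_t visible_length(const char* buffer) {
    return std::strlen(buffer);
}

// Hypothetical helper: writes exactly `size` bytes to `out`, embedded zero
// bytes included. Returns the number of bytes actually written.
std::size_t write_all(const char* buffer, std::size_t size, std::FILE* out) {
    return std::fwrite(buffer, 1, size, out);
}
```

If visible_length(mymemory) comes back smaller than size, the data was read in full and only the printing stopped early. write_all(mymemory, size, stdout) would then emit every byte, although raw binary on a console is rarely readable.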

Try

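for (int i = 0; i < size; ++i)
{
  // 0 - pad with 0s
  // 2 - minimum field width of two characters
  // X - a hex value with capital A-F (0A, 1B, etc.)
  // Cast through unsigned char so bytes >= 0x80 are not sign-extended.
  printf("%02X ", (unsigned int)(unsigned char)mymemory[i]);
  if ((i + 1) % 32 == 0)
    printf("\n"); // New line every 32 bytes
}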

as a way to dump your data file back out as hex.

Tom Leys
Cool stuff. I did cout << mymemory[i] and I'm seeing the data...ish. Thanks for confirming that. If I knew the first 4 characters in the file were "mark" and I wanted to confirm that before reading the rest of the file, is there an easy way in C++ to parse this awkward data?
Mark
Actually, that would be another question. Accepting this now.
Mark
@Mark: sure. `if (strncmp("mark", mymemory, 4) == 0)`. Obviously, if your 'magic number' is something other than `"mark"`, you'll need to change that appropriately, possibly to a character array with numeric entries not corresponding to easily-typed characters.
Novelocrat
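As a rough sketch of the magic-number check discussed in the comments: strncmp stops comparing at a zero byte, so memcmp is likely the safer choice when the tag may contain non-text bytes. The function names has_magic and read_int32 are hypothetical, and the layout (a 4-byte "mark" tag followed by a 32-bit integer in the host's byte order) is an assumption, since the thread does not describe the file format.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the buffer starts with the 4-byte tag "mark".
bool has_magic(const char* buffer, std::size_t size) {
    static const char kMagic[4] = {'m', 'a', 'r', 'k'};
    return size >= sizeof kMagic &&
           std::memcmp(buffer, kMagic, sizeof kMagic) == 0;
}

// Hypothetical helper: copies a 32-bit integer out of the buffer at `offset`,
// interpreting the bytes in the host's byte order (an assumption about the file).
// Returns false if fewer than four bytes remain at that offset.
bool read_int32(const char* buffer, std::size_t size, std::size_t offset,
                std::int32_t& value) {
    if (offset > size || size - offset < sizeof value) {
        return false;
    }
    // memcpy avoids the alignment problems of casting buffer + offset
    // directly to an int32_t pointer.
    std::memcpy(&value, buffer + offset, sizeof value);
    return true;
}
```

With the buffer from the question, one might call has_magic(mymemory, size) first and only then read_int32(mymemory, size, 4, value). If the file was written on a machine with a different byte order, the bytes would need swapping after the copy.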