I am writing a C library that reads a file into memory. It skips the first 54 bytes of the file (header) and then reads the remainder as data. I use fseek to determine the length of the file, and then use fread to read in the file.

The loop runs once and then ends because the EOF is reached (no errors). At the end, bytesRead = 10624, ftell(stream) = 28726, and the buffer contains 28726 values. I expect fread to read 30,000 bytes and the file position to be 30054 when EOF is reached.

C is not my native language so I suspect I've got a dumb beginner mistake somewhere.

Code is as follows:

const size_t headerLen = 54;

FILE * stream;
errno_t ferrno = fopen_s( &stream, filename.c_str(), "r" );
if(ferrno!=0) {
  return -1;
}

fseek( stream, 0L, SEEK_END );
size_t bytesTotal = (size_t)(ftell( stream )) - headerLen; //number of data bytes to read
size_t bytesRead = 0;
BYTE* localBuffer = new BYTE[bytesTotal];
fseek(stream,headerLen,SEEK_SET);
while(!feof(stream) && !ferror(stream)) {
    size_t result = fread(localBuffer+bytesRead,sizeof(BYTE),bytesTotal-bytesRead,stream);
    bytesRead+=result;
}


Depending on the reference you use, it's quite apparent that adding a "b" to the mode flag is the answer. Seeking nominations for the bonehead-badge. :-)

This reference talks about it in the second paragraph, second sentence (though not in their table).

MSDN doesn't discuss the binary flag until halfway down the page.

OpenGroup mentions the existence of the "b" flag, but states that it "shall have no effect".

+13  A: 

Perhaps it's a binary mode issue. Try opening the file with "r+b" as the mode.

EDIT: as noted in a comment "rb" is likely a better match to your original intent since "r+b" will open it for read/write and "rb" is read-only.

Evan Teran
Pseudo +1 (I'm out of votes)
+1 for Mike F and myself. Windows bites me with +b all the time.
sixlettervariables
I would suggest trying "rb" first, as "r+b" opens the file for reading and writing, and if you're not intending to write to the file you should continue to open it as read-only.
Greg Hewgill
This is the answer. "rb" works. Another case of RTFM. First paragraph says this as well: http://www.cplusplus.com/reference/clibrary/cstdio/fopen.html
James Schek
Wow, isn't it amazing we still have to deal with these binary vs. text file issues in 2008. I bet if you look in your file at position 10624 you will see 0x1B (decimal 27) which is the end-of-file character.
David Smith
@David Smith: 0x1B is the Escape character. The old DOS end-of-file character was 0x1A (^Z).
Greg Hewgill
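
For reference, the "rb" fix discussed above might look roughly like the sketch below. It is only an illustration in plain C, not the asker's actual code: read_payload and its parameters are hypothetical names, it uses fopen and malloc rather than fopen_s and new, and it assumes the file is small enough for ftell's long return value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the question): reads everything after the
   first header_len bytes of the file at path. Returns a malloc'd buffer that
   the caller frees and stores the byte count in *out_len, or returns NULL
   on failure. */
unsigned char *read_payload(const char *path, long header_len, size_t *out_len)
{
    assert(path != NULL && out_len != NULL && header_len >= 0);

    /* "rb" instead of "r": no newline translation, and 0x1A (Ctrl-Z)
       is not treated as end-of-file on Windows. */
    FILE *stream = fopen(path, "rb");
    if (stream == NULL)
        return NULL;

    unsigned char *buffer = NULL;
    long end = -1L;
    if (fseek(stream, 0L, SEEK_END) == 0)
        end = ftell(stream);

    if (end >= header_len && fseek(stream, header_len, SEEK_SET) == 0) {
        size_t total = (size_t)(end - header_len);
        buffer = malloc(total > 0 ? total : 1);
        if (buffer != NULL) {
            size_t done = 0;
            while (done < total) {
                size_t got = fread(buffer + done, 1, total - done, stream);
                if (got == 0)   /* EOF or error: stop instead of spinning */
                    break;
                done += got;
            }
            *out_len = done;
        }
    }

    fclose(stream);
    return buffer;
}
```

Calling read_payload(name, 54, &len) on the file from the question should then yield the full 30,000 data bytes, assuming the short read really was caused by text-mode translation.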
A: 

I agree with Evan that it's probably a binary mode issue. However, I'm pretty sure the C standard does not guarantee that ftell as you're using it will return the actual length of the file. I seem to recall that it is required to return a "token" that will get you back to the same position in the file if you pass it to fseek.

Paul Tomblin
I had the same thought... According to this reference, it is the actual position: http://www.opengroup.org/onlinepubs/009695399/functions/ftell.html
James Schek
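
If you would rather not depend on what ftell returns at all, one option is to read until fread comes up short and grow the buffer as needed. The following is a minimal sketch of that idea; read_to_eof is a hypothetical helper name, and the 4096-byte starting capacity is an arbitrary assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads from the stream's current position to EOF
   without asking for the file length first. Returns a malloc'd buffer that
   the caller frees and stores the byte count in *out_len, or returns NULL
   on allocation failure or read error. */
unsigned char *read_to_eof(FILE *stream, size_t *out_len)
{
    assert(stream != NULL && out_len != NULL);

    size_t cap = 4096;   /* arbitrary starting capacity */
    size_t len = 0;
    unsigned char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        len += fread(buf + len, 1, cap - len, stream);
        if (len < cap)   /* short read means EOF or error */
            break;

        /* Buffer is full: double it and keep reading. */
        unsigned char *grown = realloc(buf, cap * 2);
        if (grown == NULL) {
            free(buf);
            return NULL;
        }
        buf = grown;
        cap *= 2;
    }

    if (ferror(stream)) {
        free(buf);
        return NULL;
    }
    *out_len = len;
    return buf;
}
```

To skip the 54-byte header first, open the file with "rb" and either fseek(stream, 54L, SEEK_SET) or fread the header into a scratch buffer before calling the helper.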
A: 

Also worth noting: with Microsoft's C runtime, simply adding binmode.obj to your link command makes binary mode the default for all file opens.

Richard Harrison