I'm using Linux and C++. I have a binary file with a size of 210732 bytes, but the size reported with seekg/tellg is 210728.

ls -la reports 210732 bytes:

-rw-rw-r-- 1 pjs pjs 210732 Feb 17 10:25 output.osr

And with the following code snippet, I get 210728:

std::ifstream handle;
handle.open("output.osr", std::ios::binary | std::ios::in);
handle.seekg(0, std::ios::end);
std::cout << "file size:" << static_cast<unsigned int>(handle.tellg()) << std::endl;

So my code is off by 4 bytes. I have confirmed that the size of the file is correct with a hex editor. So why am I not getting the correct size?

My answer: I think the problem was caused by having multiple open fstreams to the file. At least that seems to have sorted it out for me. Thanks to everyone who helped.
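One way multiple open streams could produce a short count is sketched below. This is an assumption, since the code that writes output.osr is not shown: if an ofstream to the same file is still open and has not been flushed, its last few bytes may still sit in its buffer, so a second stream sees a shorter file until the writer flushes or closes. The names sizeViaStream, measureAroundFlush and Sizes are made up for illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Size as seen through a freshly opened ifstream (seek to end, then tellg).
// Returns -1 if the file cannot be opened.
long long sizeViaStream(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return -1;
    in.seekg(0, std::ios::end);
    return static_cast<long long>(in.tellg());
}

struct Sizes {
    long long beforeFlush;  // measured while the writer still buffers data
    long long afterFlush;   // measured after the writer called flush()
};

// Hypothetical helper: writes `count` bytes through an ofstream and measures
// the file with a second stream before and after flushing the writer.
Sizes measureAroundFlush(const std::string& path, std::size_t count) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    const std::string payload(count, 'x');
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));

    Sizes s;
    s.beforeFlush = sizeViaStream(path);  // some bytes may still be in out's buffer
    out.flush();
    s.afterFlush = sizeViaStream(path);   // everything written so far is in the file
    return s;
}
```

Calling measureAroundFlush("demo.bin", 4) on a typical implementation would likely give a beforeFlush smaller than afterFlush. Once the program exits and the writer is closed, ls shows the full size, which would match the symptom in the question.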

+4  A: 
tommieb75
I noticed this was downvoted and then upvoted. Why?
tommieb75
Probably because it doesn't answer the question
anon
@Neil: Oh... He talked about opening the file and seeking to the end to get the size, which returned incorrect results. I was wondering why not use this function instead of having to open and close the file?
tommieb75
Thanks. I'm opening the file to parse it, so I can check that it contains the correct data. I have tried the above, but it also gives the wrong answer. Also, with a const char* argument, shouldn't it be stat instead of fstat?
PSJ
PSJ: ooops ... editing this to reflect accordingly...
tommieb75
+1  A: 

At least for me with G++ 4.1 and 4.4 on 64-bit CentOS 5, the code below works as expected, i.e. the length the program prints out is the same as that returned by the stat() call.


#include <iostream>
#include <fstream>
using namespace std;

int main () {
  int length;

  ifstream is;
  is.open ("test.txt", ios::binary | std::ios::in);

  // get length of file:
  is.seekg (0, ios::end);
  length = is.tellg();
  is.seekg (0, ios::beg);

  cout << "Length: " << length << "\nThe following should be zero: " 
       << is.tellg() << "\n";

  return 0;
}
janneb
Thank you. Surprisingly, this actually gives me the correct answer. I don't understand why, but it does provide me with the result I'm looking for.
PSJ
But that's exactly the same code, apart from the static_cast to unsigned int.
pm100
Yeah, I must have something somewhere that is interfering. I'm trying to figure it out.
PSJ
@pm100: Yes. I mainly wanted to verify that libstdc++ for g++ 4.1 and 4.4 on CentOS 5 x86_64 does not contain such a glaring bug. Rather, there is something fishy with the OP's system.
janneb
I think it must have been caused by having multiple fstreams open to the file.
PSJ
+1  A: 

Is it possible that ls -la is actually reporting the number of bytes the file takes up on the disk, instead of its actual size? That would explain why it is slightly higher.

Frederik Slijkerman
That was my thought too. I'm generating the file myself and I'm putting 210732 bytes into the file, also when I inspect the file with ghex2 it actually contains all the bytes.
PSJ
+1  A: 

On a flavour of Unix, why use seekg/tellg when we have the stat() call?

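#include <sys/stat.h>

// Returns the file size in bytes as reported by stat(), or 0 if stat() fails.
// Note: a failure is indistinguishable from an empty file here.
long findSize( const char *filename )
{
   struct stat statbuf;
   if ( stat( filename, &statbuf ) == 0 )
   {
      return statbuf.st_size;
   }
   else
   {
      return 0;
   }
}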

If stat() is not available, use a stream instead:

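#include <fstream>

// Returns the distance between the start and the end of the file in bytes.
long findSize( const char *filename )
{
   long l,m; 
   std::ifstream file (filename, std::ios::in|std::ios::binary ); 
   l = file.tellg();                  // position at the start of the file
   file.seekg ( 0, std::ios::end ); 
   m = file.tellg();                  // position at the end of the file
   file.close(); 
   return ( m - l );
}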
Narendra N