views:

100

answers:

2

My company's products run on a number of qualified Linux hardware/software configurations. Historically, the compiler used has been GNU C++. For purposes of this post, let's consider version 3.2.3 the baseline, as our software 'worked as expected' through that version.

With the introduction of a newer qualified platform, using GNU C++ version 3.4.4, we began to observe some performance problems which we had not seen before. After some digging, one of our engineers came up with this test program:

#include <fstream>
#include <iostream>

using namespace std;

class my_filebuf : public filebuf
{
public:

   my_filebuf() : filebuf(), d_underflows(0) {};
   virtual ~my_filebuf() {};

   virtual pos_type seekoff(off_type, ios_base::seekdir,
                            ios_base::openmode mode = ios_base::in | ios_base::out);

   virtual int_type underflow();

public:
   unsigned int d_underflows;
};

filebuf::pos_type my_filebuf::seekoff(
   off_type           off,
   ios_base::seekdir  way,
   ios_base::openmode mode
)
{
   return filebuf::seekoff(off, way, mode);
}

filebuf::int_type my_filebuf::underflow()
{
   d_underflows++;

   return filebuf::underflow();
}

int main()
{
   my_filebuf fb;
   fb.open("log", ios_base::in);
   if (!fb.is_open())
   {
      cerr << "need log file" << endl;
      return 1;
   }

   int count = 0;
   streampos pos = EOF;
   while (fb.sbumpc() != EOF)
   {
      count++;

      // calling pubseekoff(0, ios::cur) *forces* underflow
      pos = fb.pubseekoff(0, ios::cur);
   }

   cerr << "pos=" << pos << endl;
   cerr << "read chars=" << count << endl;
   cerr << "underflows=" << fb.d_underflows << endl;

   return 0;
}

We ran it against a log file of approximately 751KB chars. In the previous configurations, we got the result:

$ buftest
pos=768058
read chars=768058
underflows=0

In the newer version, the result is:

$ buftest
pos=768058
read chars=768058
underflows=768059

Comment out the pubseekoff(0, ios::cur) call and the excessive underflow() calls go away. So clearly, in newer versions of g++, calling pubseekoff() 'invalidates' the buffer, forcing a call to underflow().

I've read the standards document, and the verbiage on pubseekoff() is certainly ambiguous. What is the relationship of the underlying file pointer position to that of gptr(), for instance? Before or after a call to underflow()? Regardless of this, I find it irritating that g++ 'changed horses in midstream', so to speak. Moreover, even if a general *seekoff()* required invalidating the buffer pointers, why should the equivalent of ftell()?

Can anyone point me to a discussion thread amongst the implementors which led up to this change in behavior? Do you have a succinct description of the choices and tradeoffs involved?

Extra Credit

Clearly I really don't know what I'm doing. I was experimenting to determine if there was a way, however non portable, to bypass the invalidation in the case where offset is 0 and seekdir is ios::cur. I came up with the following hack, directly accessing the filebuf data member *_M_file* (this only wanted to compile with the 3.4.4 version on my machine):

int sc(0);
filebuf::pos_type my_filebuf::seekoff(
   off_type           off,
   ios_base::seekdir  way,
   ios_base::openmode mode
)
{
   if ((off == 0) && (way == ios::cur))
   {
      FILE *file =_M_file.file();
      pos_type pos = pos_type(ftell(file));

      sc++;
      if ((sc % 100) == 0) {
         cerr << "POS IS " << pos << endl;
      }

      return pos;
   }

   return filebuf::seekoff(off, way, mode);
}

However, the diagnostic to print out the position every hundred seekoff attempts yields 8192 every time. Huh? Since this is the FILE * member of the filebuf itself, I'd expect it's file position pointer to be in synch with any underflow() calls made by the filebuf. Why am I wrong?

Update

First, let me emphasize that I understand this later part of my post is all about non-portable hacks. Still, not understanding the nitty-gritty here. I tried calling

pos_type pos = _M_file.seekoff(0,ios::cur);

instead, and this happily progresses through the sample file, rather than getting stuck at 8192.

Final Update

Internally to my company, we've made some workarounds that reduce the performance hit enough we can live with it.

Externally, David Krauss filed a bug against GNU's libstdc++ streams, and recently, Paolo Carlini checked in a fix. The consensus was that the undesired behavior was within the scope of the Standard, but that there was a reasonable fix for the edge case I described.

So thanks, StackOverflow, David Krauss, Paolo Carlini, and all the GNU developers!

+1  A: 

Well, I don't know the exact reason for the change, but apparently the changes were done for (See the GCC 3.4 Series Changelog):

  • Streamlined streambuf, filebuf, separate synched with C Standard I/O streambuf.
  • Large File Support (files larger than 2 GB on 32-bit systems).

I suspect that large file support is the big feature that would require a change like this, because IOStreams can no longer assume it can map the whole file into memory.

Correct syncing with cstdio also is an operation which might require a greater number of flushes to the disk. You can disable that using std::sync_with_stdio.

Billy ONeal
I tried setting std::sync_with_stdio(false), but it made no difference to the results on either platform. Thanks for the suggestion anyway.
Don Wakefield
+1  A: 

The requirements of seekoff certainly are confusing, but seekoff(0, ios::cur) is supposed to be a special case that doesn't synchronize anything. So this could probably be considered a bug.

And it still happens in GCC 4.2.1 and 4.5…

The problem is that (0, ios::cur) is not special-cased in _M_seek, which seekoff uses to call fseek to obtain its return value. So long as that succeeds, _M_seek unconditionally calls _M_set_buffer(-1);, which predictably invalidates the internal buffer. The next read operation causes underflow.

Found the diff! See change -473,41 +486,26. Comment was

    (seekoff): Simplify, set _M_reading, _M_writing to false, call
    _M_set_buffer(-1) ('uncommitted').

So this wasn't done to fix a bug.

Filed bug: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45628

Potatoswatter
And it's still happening in the latest versions? So given our platform requirements, we won't get relief for a *long* time! You say "seekoff(0,ios::cur) is supposed to be a special case..." This certainly makes sense to me, but I don't recall seeing any discourse on this in the standard. Do you have a cite? Thanks.
Don Wakefield
P.S. - I had already read the section of the standard cited in the bug you filed (§27.8.1.4/11) and thought that "Next, seek..." made it unconditional, regardless of (o,ios::curr).
Don Wakefield
Also, it looks as if, from the comments in the bug posted by Paolo Carlini, that he is ultimately concluding that they *must* invalidate, due to their implementation of the relationship of the file pointer and the gptr(). So it sounds like we just aren't supposed to ever ask the position when only reading, if we want reasonable performance...
Don Wakefield
I added a comment to the bug in response to comment #5, explaining why I *thought* they could special-case it. I hope I got it right!
Don Wakefield
Don Wakefield
@Don: Well, maybe more reasons to upgrade will result in a faster upgrade process :v) . There's a lot of work to do on that code, and it also looks like there's a lot of iostreams activity in Boost development lately.
Potatoswatter