I decided to work around a bug in GNU libstdc++ by accessing an internal variable. Recalling that Johannes solved the problem on his blog, I checked that out… but couldn't comprehend the code, aside from the basic concept of getting a static initializer to do the dirty work. So, I boiled it down to this, which is pretty compact.

But, as commented, this results in little objects and accessor functions duplicated per translation unit, causing a cascade of nastiness. Is there a canonical way to do this, say a Boost best practice?

Apologies for the bad humor, but it's not gratuitous… we wouldn't want this code to be "safe for work"!

/* This hack installs a static initializer, so to avoid the ordering fiasco,
make one fresh copy per translation unit, via anonymous namespace. */
namespace {

template< typename T, T value, T &dest >
struct class_rape {
    class_rape() { dest = value; } // you've been raped in the class!
    static class_rape r;
};
template< typename T, T value, T &dest >
class_rape< T, value, dest > class_rape< T, value, dest >::r;


// Usage (cvt_[w]filebuf is a specialization of GCC basic_filebuf)

typedef bool cvt_filebuf::*cvt_fb_reading_t;
typedef bool cvt_wfilebuf::*cvt_wfb_reading_t;

/* Access these variables, or functions accessing them (applies recursively),
only in anonymous namespace or in non-header file, per one-definition rule. */
cvt_fb_reading_t cvt_filebuf_reading;
cvt_wfb_reading_t cvt_wfilebuf_reading;

template struct class_rape
    < cvt_fb_reading_t, &cvt_filebuf::_M_reading, cvt_filebuf_reading >;
template struct class_rape
    < cvt_wfb_reading_t, &cvt_wfilebuf::_M_reading, cvt_wfilebuf_reading >;

}

By the way, here is the context: http://pastie.org/1188625.
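For readers without the libstdc++ context, here is a minimal self-contained sketch of the same idea. The class `Target`, its member `reading_`, the template name `member_setter`, and the accessor `reading_of` are all hypothetical stand-ins, not part of GCC. The sketch relies on the rule that access checking is not applied to names used in an explicit instantiation, and it assumes the static initializer runs before `main`, as it does in practice with GCC.

```cpp
#include <cassert>

// Hypothetical stand-in for the library class (not part of libstdc++).
class Target {
    bool reading_ = true;               // private member we want to reach
public:
    bool reading() const { return reading_; }
};

typedef bool Target::*target_reading_t;

namespace {

// One copy per translation unit; zero-initialized before any dynamic init.
target_reading_t target_reading;

// The constructor of the static member stores `value` into `dest`.
template< typename T, T value, T &dest >
struct member_setter {
    member_setter() { dest = value; }
    static member_setter instance;
};
template< typename T, T value, T &dest >
member_setter< T, value, dest > member_setter< T, value, dest >::instance;

// Explicit instantiation: naming the private member is permitted here.
template struct member_setter
    < target_reading_t, &Target::reading_, target_reading >;

// Hypothetical accessor, kept in the anonymous namespace per the ODR note.
bool &reading_of( Target &t ) {
    assert( target_reading != nullptr ); // initializer must have run
    return t.*target_reading;
}

}
```

Calling `reading_of(t) = false;` on a `Target t` then changes the private flag without touching the class definition.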

Update

I solved the duplication issue in my answer below. So now I'm interested in a deterministic, well-defined solution that doesn't involve editing any targeted code and allows hacking multiple specializations of a template at once. (The given hack requires a new instantiation for each target template specialization.)

+3  A: 

Illegal access:

class ClassIWantToViolate
{
    // Internal State
    public:
        template<typename T> void violate() {} // Do nothing
};

Then in your code you can violate the class like this:

namespace { struct Attack {}; }

template<>
void ClassIWantToViolate::violate<Attack>()
{
     // Access to internal state here.

     // This is your own version of violate based on a local specialization
     // Thus it is unique but still has access to internal state of the class.
}
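
A compilable sketch of the same approach, under the assumption that the target class already declares the member template (it cannot be added from outside). The private member `secret_`, the `int` return type, and the helper `read_secret` are hypothetical additions for illustration only.

```cpp
#include <cassert>

class ClassIWantToViolate
{
    int secret_ = 42;                   // hypothetical internal state
    public:
        // Generic hook; returns 0 here only so the sketch has a visible result.
        template<typename T> int violate() { return 0; }
};

namespace { struct Attack {}; }         // tag type local to this translation unit

// Explicit specialization of the member template. It is a member of the
// class, so it may touch private state.
template<>
int ClassIWantToViolate::violate<Attack>()
{
    return secret_;
}

// Hypothetical helper showing the call site.
int read_secret(ClassIWantToViolate &c)
{
    assert(c.violate<int>() == 0);      // the generic version is unaffected
    return c.violate<Attack>();
}
```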
Martin York
+1 because it basically works, but this is more likely to violate the ODR than what I have. Also, replacing an implementation function will be pretty fragile across versions of targeted code, much more than just assuming a single invariant for a single variable.
Potatoswatter
This works because the C++ access specifiers are not designed to prevent malicious code; they are there to help stop you from violating OO principles (i.e. they provide a compiler-verified check that you are not violating the principles). But they are not designed to prevent stupid things like this. This basically provides a mechanism that extends the public API, so any change to the internal representation of the class will break the code you put above (thus a violation of OO). You should only use this if you are part of the public API and authentically modifying the API; otherwise things will break.
Martin York
@Martin: There's a difference between violating principles and causing UB. I'm interested in the former without the latter. If defeating the access qualifiers really always resulted in UB, they would in fact be pretty good security. My hack is evil, but it's not stupid and won't have undesirable side-effects like replacing an entire existing member function. (Adding a new member function is obviously not an option; if I could do that I would just go in there and fix the bug.)
Potatoswatter
To be specific, if this hack is placed in a header, and one source file does not include that header, the ODR is violated.
Potatoswatter
You may not be violating the ODR, but you are violating the "DDSHS" rule: <Don't Do Stupid Hacky Stuff rule>
Martin York
Well, if I know that the variable is used in this particular incorrect way in every version of the library in which it appears, and the correction is semantically valid within the library's design and the hack doesn't invoke UB, and it enables a language feature latent in the library, namely codecvt facets that return data to the user, then the stupidity and the hackishness are quantified and justified.
Potatoswatter
A: 

I would probably go with a less clever approach. Accessing private members of a class from the outside is a rare thing, at least for me (I've never done it). On my implementation, I was able to do this in a few minutes...

  • copy fstream to a local project file I called evil_fstream.h
  • change the namespace in evil_fstream.h from std to evil
  • delete all instances of private: and protected: from evil_fstream.h

Then this code compiles:

#include <fstream>
#include "evil_fstream.h"

using namespace std;

typedef evil::basic_filebuf<char, char_traits<char> > evil_filebuf_t;

int main() {

   std::basic_filebuf<char, char_traits<char> > fb;

   evil_filebuf_t* efb = (evil_filebuf_t*)&fb;

   efb->_Pcvt; // access a private member

   return 0; // report success
}
dgnorton
That is likely to fail because access qualifiers may affect the layout of the class in memory (C++03 §9/12, or http://stackoverflow.com/questions/3824213/does-order-of-members-of-objects-of-a-class-have-any-impact-on-performance/3824241#3824241).
Potatoswatter
@Potatoswatter, glad you pointed that out. I learned something new.
dgnorton
A: 

Bah, it was too late when I posted this… I should've slept on it.

I can avoid both the static initialization order fiasco and the duplication problem by making the variables truly global, and merely initializing them multiple times. Since the initialization value is the same each time, it doesn't matter when they happen. The first initialization occurs before the first access because initialization appears first in every translation unit where access might occur.

typedef bool cvt_filebuf::*cvt_fb_reading_t;
typedef bool cvt_wfilebuf::*cvt_wfb_reading_t;

/* Place accessible variables in global, non-anonymous namespace. */
cvt_fb_reading_t cvt_filebuf_reading;
cvt_wfb_reading_t cvt_wfilebuf_reading;

/* This hack installs a static initializer, so to avoid the ordering fiasco,
make one fresh copy per translation unit, via anonymous namespace. */
namespace {

template< typename T, T value, T &dest >
class class_rape { // change access qualification of hack to guarantee ODR, LOL
    class_rape() { dest = value; } // you've been raped in the class!
    static class_rape r;
};
template< typename T, T value, T &dest >
class_rape< T, value, dest > class_rape< T, value, dest >::r;

template struct class_rape
    < cvt_fb_reading_t, &cvt_filebuf::_M_reading, cvt_filebuf_reading >;
template struct class_rape
    < cvt_wfb_reading_t, &cvt_wfilebuf::_M_reading, cvt_wfilebuf_reading >;

}

/* Accessor functions go here, also outside anonymous namespace. */
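
As a rough single-file illustration of this layout, the sketch below uses a hypothetical class `Target` in place of the GCC filebuf specializations, and a hypothetical accessor `reading_of`. In a real multi-file project the global variable would presumably be declared `extern` in the header and defined in one source file; that split is an assumption and is not shown here.

```cpp
#include <cassert>

// Hypothetical stand-in for the targeted library class.
class Target {
    bool reading_ = true;
public:
    bool reading() const { return reading_; }
};

typedef bool Target::*target_reading_t;

/* Global, non-anonymous variable shared by every translation unit. */
target_reading_t target_reading;

namespace {

// One instantiation per translation unit; each stores the same value,
// so the order in which the initializers run does not matter.
template< typename T, T value, T &dest >
class member_setter {
    member_setter() { dest = value; }   // private: only `instance` uses it
    static member_setter instance;
};
template< typename T, T value, T &dest >
member_setter< T, value, dest > member_setter< T, value, dest >::instance;

template class member_setter
    < target_reading_t, &Target::reading_, target_reading >;

}

/* Hypothetical accessor, outside the anonymous namespace. */
bool &reading_of( Target &t ) {
    assert( target_reading != nullptr ); // set by the initializer above
    return t.*target_reading;
}
```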
Potatoswatter