views:

184

answers:

7

Ok, so I know that technically this is undefined behavior, but nonetheless, I've seen this more than once in production code. And please correct me if I'm wrong, but I've also heard that some people use this "feature" as a somewhat legitimate substitute for a lacking aspect of the current C++ standard, namely, the inability to obtain the address (well, offset really) of a member function. For example, this is out of a popular implementation of a PCRE (Perl-compatible Regular Expression) library:

#ifndef offsetof
#define offsetof(p_type,field) ((size_t)&(((p_type *)0)->field))
#endif

One can debate whether the exploitation of such a language subtlety in a case like this is valid or not, or even necessary, but I've also seen it used like this:

struct Result
{
   void stat()
   {
      if(this)
         // do something...
      else
         // do something else...
   }
};

// ...somewhere else in the code...

((Result*)0)->stat();

This works just fine! It avoids a null pointer dereference by testing for the existence of this, and it does not try to access class members in the else block. So long as these guards are in place, it's legitimate code, right? So the question remains: Is there a practical use case, where one would benefit from using such a construct? I'm especially concerned about the second case, since the first case is more of a workaround for a language limitation. Or is it?

PS. Sorry about the C-style casts, unfortunately people still prefer to type less if they can.

+7  A: 

The first case is not calling anything. It's taking the address. That's a defined, permitted, operation. It yields the offset in bytes from the start of the object to the specified field. This is a very, very, common practice, since offsets like this are very commonly needed. Not all objects can be created on the stack, after all.

The second case is reasonably silly. The sensible thing would be to declare that method static.

bmargulies
I'm aware of that. But is this common practice? Why would one do this, instead of just creating an object on the stack? Is is really worth the fancy syntax? Or are we super-optimizing something that I'm missing? :)
zdawg
Why do people write bad code? Only the original author would be able to explain the motivation behind this abuse of the language.
Lars
Got a reference to where it is defined and permitted? (Not that I doubt you, but it's always nice to have a proper source)
jalf
`E1->E2` is equivalent to `(*(E1)).E2`. `*E1` dereferences the pointer `E1`. Dereferencing a null pointer results in undefined behavior. Ergo, I don't see how the `offsetof` macro in the OP consists of defined, permitted behavior. Is there something I am missing? (the examples with `E1` and `E2` are from the C++ standard).
James McNellis
bmargulies
C99 does say concerning the ` I agree that given this common implementation of `offsetof`, I'm probably wrong; I just can't find where the documents say that I'm wrong. Hopefully someone else knows better. :-/
James McNellis
+1  A: 

Undefined behaviour is undefined behaviour. Do this tricks "work" for your particular compiler? well, possibly. will they work for the next iteration of it. or for another compiler? Possibly not. You pays your money and you takes your choice. I can only say that in nearly 25 years of C++ programming I've never felt the need to do any of these things.

anon
It works on GCC 4.3.3 and VS 2005. I know it's undefined behavior, but since I've seen it a few times now, I began to wonder whether it's something that people often do, and if so, why?
zdawg
+1  A: 

Although libraries targeting specific C++ implementations may do this, that doesn't mean it's "legitimate" generally.

This works just fine! It avoids a null pointer dereference by testing for the existence of this, and it does not try to access class members in the else block. So long as these guards are in place, it's legitimate code, right?

No, because although it might work fine on some C++ implementations, it is perfectly okay for it to not work on any conforming C++ implementation.

Daniel Earwicker
+2  A: 

Dereferencing a null-pointer is undefined behavior and anything can happen if you do it. Don't do it if you want a program that works.

Just because it doesn't immediately crash in one specific test case doesn't mean that it won't get you into all kinds of trouble.

sth
+3  A: 

I don't see any benefit of ((Result*)0)->stat(); - it is an ugly hack which will likely break sooner than later. The proper C++ approach would be using a static method Result::stat() .

offsetof() on the other hand is legal, as the offsetof() macro never actually calls a method or accesses a member, but only performs address calculations.

Lars
A: 

Regarding the statement:

It avoids a null pointer dereference by testing for the existence of this, and it does not try to access class members in the else block. So long as these guards are in place, it's legitimate code, right?

The code is not legitimate. There is no guarantee that the compiler and/or runtime will actually call to the method when the pointer is NULL. The checking in the method is of no help because you can't assume that the method will actually end up being called with a NULL this pointer.

Andrew Medico
+2  A: 

Everybody else has done a good job of reiterating that the behavior is undefined. But lets pretend it wasn't, and that p->member is allowed to behave in a consistent manner under certain circumstances even if p isn't a valid pointer.

Your second construct would still serve almost no purpose. From a design perspective, you've probably done something wrong if a single function can do its job both with and without accessing members, and if it can then splitting the static portion of the code into a separate, static function would be much more reasonable than expecting your users to create a null pointer to operate on.

From a safety perspective, you've only protected against a small portion of the ways an invalid this pointer can be created. There's uninitialized pointers, for starters:

Result* p;
p->stat(); //Oops, 'this' is some random value

There's pointers that have been initialized, but are still invalid:

Result* p = new Result;
delete p;
p->stat(); //'this' points to "safe" memory, but the data doesn't belong to you

And even if you always initialize your pointers, and absolutely never accidentally reuse free'd memory:

struct Struct {
    int i;
    Result r;
}
int main() {
    ((Struct*)0)->r.stat(); //'this' is likely sizeof(int), not 0
}

So really, even if it weren't undefined behavior, it is worthless behavior.

Dennis Zickefoose