views:

212

answers:

9

The following crashes with a seg-V:

// my code
int* ipt;
int bool set = false;
void Set(int* i) {
  ASSERT(i);
  ipt = i;
  set = true;
}

int Get() {
  return set ? *ipt : 0;
}

// code that I don't control.
struct S { int I, int J; }
int main() {
  S* ip = NULL;
  // code that, as a bug, forgets to set ip...
  Set(&ip->J);
  // gobs of code
  return Get();
}

This is because while i is not NULL it still isn't valid. The same problem can happen if the calling code takes the address of an array index operation from a NULL pointer.

One solution to this is to trim the low order bits:

void Set(int* i) {
  ASSERT((reinterpret_cast<size_t>(i))>>10);
  ipt = i;
  set = true;
}

But how many bits should/can I get rid of?


Edit, I'm not worried about undefined behavior as I'll be aborting (but more cleanly than a seg-v) on that case anyway.

FWIW: this is a semi-hypothetical situation. The bug that caused me to think of this was fixed before I posted, but I've run into it before and am thinking of how to work with it in the future.

Things that can be assumed for the sake of argument:

  • If Set is called with something that will seg-v, that's a bug
  • Set may be called by code that isn't my job to fix. (E.g. I file a bug)
  • Set may be called by code I'm trying to fix. (E.g. I'm adding sanity checks as part of my debuggin work.)
  • Get my be called in a way that provide no information about where Set was called. (I.e. allowing Get to seg-v isn't an effective way to debug anything.)
  • The code needn't be portable or catch 100% of bad pointers. It need only work on my current system often enough to let me find where things are going wrong.
+12  A: 

There is no portable way to test for any invalid pointer except NULL. Evaluating &ip[3] gives undefined behaviour, before you do anything with it; the only solution is to test for NULL before doing any arithmetic on the pointer.

If you don't need portability, and don't need to guarantee that you catch all errors, then on most mainstream platforms you could check whether the address is within the first page of memory; it's common to define NULL to be address zero, and to reserve the first page to trap most null pointer dereferences. On a POSIX platform, this would look something like

static size_t page_size = sysconf(_SC_PAGESIZE);
assert(reinterpret_cast<intptr_t>(i) >= page_size);

But this isn't a complete solution. The only real solution is to fix whatever is abusing null pointers in the first place.

Mike Seymour
Good, point, but useless if I only have access to `Set`. Also, virtually all systems use `NULL==0` and make that seg-v by not mapping one or more pages at the start of the address space.
BCS
Mike Seymour
@BCS: if you'd researched the issue enough to justify making such a bold assertion, then you wouldn't have a question left to ask. What you're doing is one of the most spectacularly dodgy things I've seen for quite a while, and I've seen plenty. :-) Why don't you just find an OS specific service that returns a predictable result for invalid pointers? For example - on Linux madvise() provides such a facility.
Tony
@Tony, if I can catch most cases in a few ops, I'd rather not be using a system call for sanity checking. And unless you can point out a system (that I care about) where it doesn't work, I don't care if it's not portable.
BCS
@BCS: Updated to include a non-portable partial solution.
Mike Seymour
The first step in fixing what is passing off bad pointers is to *find* what is passing off bad pointers, thus my question. (I seem to recall once having a bug where the only common functions to the code-path that set the bad pointer and the function that tripped on it was main and a function it called. With static constructors there may be *no* common code. At that point grep/printf or the like are your friends.)
BCS
@BCS: Sorry, I got the impression you were trying to work around the problem, not diagnose it. Why not just let it segfault, and catch that in your debugger?
Mike Seymour
Because there is zero guarantee that the seg-v will happen in a way that gives any information about where the pad pointer came from. I.e. because that's to late.
BCS
@BCS: OK, I hadn't noticed your changes to the code.
Mike Seymour
A: 

You're asking for undefined behavior. Your pointer isn't null, but it also almost certainly points to an invalid location. Your best bet is to check for NULL before you do any pointer math with it.

cHao
I'm not doing any pointer math, the code that calls me is.
BCS
+2  A: 

You shouldn't be doing pointer arithmetic (including array indexing) off of a null pointer at all.

And you should use 0, not NULL in c++. NULL is a feature of c, still supported but not idiomatic in c++.


In regards to the BCS's many comments and the edit. That changes the question from the rather naive one on the surface to a much deeper one. But...it is not going to be easy---in a language as permissive as c++---to protect yourself against people doing stupid things before calling your code.

dmckee
The main in the question is to show how the bug case can arise.
BCS
You should really provide either some justification or evidence when making such statements.
Tony
@Tony: Like [Do you use NULL or 0 (zero) for pointers in C++?](http://www.research.att.com/~bs/bs_faq2.html#null) where the leading answer points to Stroustrup saying he uses 0?
dmckee
Being able to distinguish between the number 0 and the address 0 is a useful idiom in both C and C++. This is a case where Stroustrup is wrong. It is useful to be able to tell them apart for more then asthetics.
Bill
@Tony: which statement are you referring to? At this point I count 8 statement I've made.
BCS
@Bill: And using NULL (which has to be defined as an integral constant expression yielding 0) has led people into error. There is no good solution in C++, so `nullptr` in the next C++ standard will be very, very useful.
David Thornley
Gah!. Somehow I copied th wrong URL into my comment to Tony above. The SO question is http://stackoverflow.com/questions/176989/do-you-use-null-or-0-zero-for-pointers-in-c, and the link above points to Stroustrup's comments.
dmckee
+1  A: 

Trying to work around undefined behavior will always be very dependant on your platform, compiler, version,etc. if it is at all possible.

Common *nixes never map the first page of the address space precisely to catch null pointer access, thus you might get away with checking if the pointer value is between 0 and 4096 (Or whatever page size your system uses).

But don't do this, you can't guard against everything that can go wrong, focus instead on getting the code right. If somone passes you an invalid pointer, chances are there's something gravely wrong anyway that a pointer validation check can't fix.

nos
The point of the check is to catch that gravely wrong situation early. With pointers being stored and passed around, there can be an arbitrary separation between where the bad pointer is passed in an where the seg-v happens. Without this type of check, finding where the ptr came from can be very hard.
BCS
A: 

One reason, among many, you cannot do this in a portable fashion is that NULL is not guaranteed to be 0. It is only specified that null pointers will compare equal to 0. You may write a 0 (or the preprocessor macro "NULL") in your code, but the compiler knows that this 0 is in a pointer context so it generates the appropriate code to compare it to a null pointer, whatever the actual implementation of a null pointer is. See here and here for more information on that. Reinterpreting a NULL pointer as an integral type may cause it to have a true value instead of false.

A. Levy
This answer is correct in c, but incorrect in c++ (which is how the question is tagged (though the code is equally "valid" and a bad idea in either)) where the standard demands that null pointers have a numeric value of 0, and the NULL macro is defined that way.
dmckee
@dmckee: The null pointer constant is written as a constant integral zero expression in a pointer context (`0` is a popular choice), and a conversion to an integral type yields `0` or `false`, but that says nothing about its bitwise representation. In other words, it doesn't have to be all-bits-zero. `int x = 0; int * ip = reinterpret_cast<int *>(x);` doesn't necessarily leave `ip` as a null pointer constant. It isn't quite clear to me whether a `reinterpret_cast<int>((int *)0)` has to be `0`, but `int * ip = 0; ip++; reinterpret_cast<int>(ip)` doesn't have to be a small integer.
David Thornley
+1  A: 

Is there any way you can exert some influence to get that bad code corrected? There is no possible way this can turn out well. Legally, just creating an invalid pointer is undefined behavior.

If Set is always going to be passed a small offset from ip, and ip will always be initialized to NULL, you are probably going to be OK with what you are doing. Most modern systems do have the null pointer constant as all bits zero, and most will do the natural thing. There is of course absolutely no guarantee that it will work on any given system with any given compiler and any given compiler options, and changing any of those might cause it to fail.

Since any use of bad pointers can cause program failure, you should consider what happens when the code triggers a memory violation.

Also, I don't know what your ASSERT macro does, but assert, in most implementations, is only activated in debug mode. If you want to push this piece of junk into production, or run in optimized mode, you might want to make sure it will still fail more gently.

David Thornley
+1  A: 

If you don't mind a really bad hack, you can force a memory access with volatile (n.b. volatile is evil). According to the GCC docs, volatile accesses must be ordered across sequence points, so you can do something like this:

int test = *(volatile int *)i;
*(volatile int *)i = test;

I don't think = is a sequence point, but the following might also work:

*(volatile int *)i = *(volatile int *)i;
tc.
In other words, us the pointer early. As long as a seg-v (in the right place) is useful, that would work.
BCS
+1  A: 

I really wouldn't recommend trying to work around a bug in somebody else's code. If you're not running everything you write through a debugger while you're developing code no amount of checks are going to help you catch all the problems. Get them to fix their code.

If you're not using a debugger, get a decent crash handler that dumps the callstack for each thread and as much additional information regarding the program state as possible. Try to figure out what could be going wrong from that.

Regularly running your code through static analysis tools can also help here.

Remember, that it might not be someone forgetting to initialise a pointer, it could be someone else overwriting that pointer through a bad memory write from somewhere completely unrelated. There are tools which can help track down such things too.

Regarding the NULL Vs 0 debate, #define NULL 0 is better for a couple of reasons:

1) You can more easily see when you're dealing with a pointer.

2) Using NULL offers no less or more safety than using 0. So why not make your code more readable?

3) When C++11 is finally released #define NULL nullptr is a lot easier to change than all those zeros. (You could go the other way and #define nullptr 0 today I suppose, but that will probably cause problems in the future if you're developing cross platform code.)

And for the record, the C++ standard explicitly states that a null pointer constant is an rvalue integer type that evaluates to zero. So please let's not have any more nonsense about null pointers not having to equal zero.

James
Now watch it become c++12, or would that be c++0xb?
BCS
Good points. See my edits.
BCS
A: 

You'd have to consider your particular operating system and hardware architecture. If you're only interested in detecting pointers that are "close to null" then you could use ASSERT(i > pageSize), assuming that the first page is always write protected in your OS.

But ... the obvious question is: Why bother? The OS will detect the null in this case and SEGV as you pointed out, which is just as good as an ASSERT, isn't it?

Dobes Vandermeer
A seg-v might be as good as a assert (assuming I use tc.'s answer) but may not be in some cases. For instance, you might not be able to recover a stack trace, you may be unable to throw an exception or any number of other things.
BCS