ansaurus

Question

How many bits to ignore when checking for NULL?

Answer 1

+12 A:

There is no portable way to test for any invalid pointer except NULL. Evaluating &ip[3] gives undefined behaviour, before you do anything with it; the only solution is to test for NULL before doing any arithmetic on the pointer.

If you don't need portability, and don't need to guarantee that you catch all errors, then on most mainstream platforms you could check whether the address is within the first page of memory; it's common to define NULL to be address zero, and to reserve the first page to trap most null pointer dereferences. On a POSIX platform, this would look something like

static size_t page_size = sysconf(_SC_PAGESIZE);
assert(reinterpret_cast<intptr_t>(i) >= page_size);

But this isn't a complete solution. The only real solution is to fix whatever is abusing null pointers in the first place.

Mike Seymour 2010-09-15 17:38:18

Good, point, but useless if I only have access to `Set`. Also, virtually all systems use `NULL==0` and make that seg-v by not mapping one or more pages at the start of the address space.

BCS 2010-09-15 17:41:58

Mike Seymour 2010-09-15 17:51:23

@BCS: if you'd researched the issue enough to justify making such a bold assertion, then you wouldn't have a question left to ask. What you're doing is one of the most spectacularly dodgy things I've seen for quite a while, and I've seen plenty. :-) Why don't you just find an OS specific service that returns a predictable result for invalid pointers? For example - on Linux madvise() provides such a facility.

Tony 2010-09-15 17:51:52

@Tony, if I can catch most cases in a few ops, I'd rather not be using a system call for sanity checking. And unless you can point out a system (that I care about) where it doesn't work, I don't care if it's not portable.

BCS 2010-09-15 18:01:14

@BCS: Updated to include a non-portable partial solution.

Mike Seymour 2010-09-15 18:15:30

The first step in fixing what is passing off bad pointers is to *find* what is passing off bad pointers, thus my question. (I seem to recall once having a bug where the only common functions to the code-path that set the bad pointer and the function that tripped on it was main and a function it called. With static constructors there may be *no* common code. At that point grep/printf or the like are your friends.)

BCS 2010-09-15 18:23:08

@BCS: Sorry, I got the impression you were trying to work around the problem, not diagnose it. Why not just let it segfault, and catch that in your debugger?

Mike Seymour 2010-09-15 18:29:07

Because there is zero guarantee that the seg-v will happen in a way that gives any information about where the pad pointer came from. I.e. because that's to late.

BCS 2010-09-15 18:35:46

@BCS: OK, I hadn't noticed your changes to the code.

Mike Seymour 2010-09-15 18:44:54

Answer 2

A:

You're asking for undefined behavior. Your pointer isn't null, but it also almost certainly points to an invalid location. Your best bet is to check for NULL before you do any pointer math with it.

cHao 2010-09-15 17:39:58

I'm not doing any pointer math, the code that calls me is.

BCS 2010-09-15 17:47:22

Answer 3

+2 A:

You shouldn't be doing pointer arithmetic (including array indexing) off of a null pointer at all.

And you should use 0, not NULL in c++. NULL is a feature of c, still supported but not idiomatic in c++.

In regards to the BCS's many comments and the edit. That changes the question from the rather naive one on the surface to a much deeper one. But...it is not going to be easy---in a language as permissive as c++---to protect yourself against people doing stupid things before calling your code.

dmckee 2010-09-15 17:40:28

The main in the question is to show how the bug case can arise.

BCS 2010-09-15 17:45:43

You should really provide either some justification or evidence when making such statements.

Tony 2010-09-15 17:47:00

@Tony: Like [Do you use NULL or 0 (zero) for pointers in C++?](http://www.research.att.com/~bs/bs_faq2.html#null) where the leading answer points to Stroustrup saying he uses 0?

dmckee 2010-09-15 17:53:49

Being able to distinguish between the number 0 and the address 0 is a useful idiom in both C and C++. This is a case where Stroustrup is wrong. It is useful to be able to tell them apart for more then asthetics.

Bill 2010-09-15 17:55:44

@Tony: which statement are you referring to? At this point I count 8 statement I've made.

BCS 2010-09-15 17:58:04

@Bill: And using NULL (which has to be defined as an integral constant expression yielding 0) has led people into error. There is no good solution in C++, so `nullptr` in the next C++ standard will be very, very useful.

David Thornley 2010-09-15 18:13:42

Gah!. Somehow I copied th wrong URL into my comment to Tony above. The SO question is http://stackoverflow.com/questions/176989/do-you-use-null-or-0-zero-for-pointers-in-c, and the link above points to Stroustrup's comments.

dmckee 2010-09-15 18:33:04

Answer 4

+1 A:

Trying to work around undefined behavior will always be very dependant on your platform, compiler, version,etc. if it is at all possible.

Common *nixes never map the first page of the address space precisely to catch null pointer access, thus you might get away with checking if the pointer value is between 0 and 4096 (Or whatever page size your system uses).

But don't do this, you can't guard against everything that can go wrong, focus instead on getting the code right. If somone passes you an invalid pointer, chances are there's something gravely wrong anyway that a pointer validation check can't fix.

nos 2010-09-15 17:47:09

The point of the check is to catch that gravely wrong situation early. With pointers being stored and passed around, there can be an arbitrary separation between where the bad pointer is passed in an where the seg-v happens. Without this type of check, finding where the ptr came from can be very hard.

BCS 2010-09-15 17:51:47

Answer 5

A:

One reason, among many, you cannot do this in a portable fashion is that NULL is not guaranteed to be 0. It is only specified that null pointers will compare equal to 0. You may write a 0 (or the preprocessor macro "NULL") in your code, but the compiler knows that this 0 is in a pointer context so it generates the appropriate code to compare it to a null pointer, whatever the actual implementation of a null pointer is. See here and here for more information on that. Reinterpreting a NULL pointer as an integral type may cause it to have a true value instead of false.

A. Levy 2010-09-15 17:55:59

This answer is correct in c, but incorrect in c++ (which is how the question is tagged (though the code is equally "valid" and a bad idea in either)) where the standard demands that null pointers have a numeric value of 0, and the NULL macro is defined that way.

dmckee 2010-09-15 18:28:46

@dmckee: The null pointer constant is written as a constant integral zero expression in a pointer context (`0` is a popular choice), and a conversion to an integral type yields `0` or `false`, but that says nothing about its bitwise representation. In other words, it doesn't have to be all-bits-zero. `int x = 0; int * ip = reinterpret_cast<int *>(x);` doesn't necessarily leave `ip` as a null pointer constant. It isn't quite clear to me whether a `reinterpret_cast<int>((int *)0)` has to be `0`, but `int * ip = 0; ip++; reinterpret_cast<int>(ip)` doesn't have to be a small integer.

David Thornley 2010-09-15 19:26:49

Answer 6

+1 A:

Is there any way you can exert some influence to get that bad code corrected? There is no possible way this can turn out well. Legally, just creating an invalid pointer is undefined behavior.

If Set is always going to be passed a small offset from ip, and ip will always be initialized to NULL, you are probably going to be OK with what you are doing. Most modern systems do have the null pointer constant as all bits zero, and most will do the natural thing. There is of course absolutely no guarantee that it will work on any given system with any given compiler and any given compiler options, and changing any of those might cause it to fail.

Since any use of bad pointers can cause program failure, you should consider what happens when the code triggers a memory violation.

Also, I don't know what your ASSERT macro does, but assert, in most implementations, is only activated in debug mode. If you want to push this piece of junk into production, or run in optimized mode, you might want to make sure it will still fail more gently.

David Thornley 2010-09-15 18:26:28

Answer 7

+1 A:

If you don't mind a really bad hack, you can force a memory access with volatile (n.b. volatile is evil). According to the GCC docs, volatile accesses must be ordered across sequence points, so you can do something like this:

int test = *(volatile int *)i;
*(volatile int *)i = test;

I don't think = is a sequence point, but the following might also work:

*(volatile int *)i = *(volatile int *)i;

tc. 2010-09-15 18:31:14

In other words, us the pointer early. As long as a seg-v (in the right place) is useful, that would work.

BCS 2010-09-15 18:48:31

Answer 8

+1 A:

I really wouldn't recommend trying to work around a bug in somebody else's code. If you're not running everything you write through a debugger while you're developing code no amount of checks are going to help you catch all the problems. Get them to fix their code.

If you're not using a debugger, get a decent crash handler that dumps the callstack for each thread and as much additional information regarding the program state as possible. Try to figure out what could be going wrong from that.

Regularly running your code through static analysis tools can also help here.

Remember, that it might not be someone forgetting to initialise a pointer, it could be someone else overwriting that pointer through a bad memory write from somewhere completely unrelated. There are tools which can help track down such things too.

Regarding the NULL Vs 0 debate, #define NULL 0 is better for a couple of reasons:

1) You can more easily see when you're dealing with a pointer.

2) Using NULL offers no less or more safety than using 0. So why not make your code more readable?

3) When C++11 is finally released #define NULL nullptr is a lot easier to change than all those zeros. (You could go the other way and #define nullptr 0 today I suppose, but that will probably cause problems in the future if you're developing cross platform code.)

And for the record, the C++ standard explicitly states that a null pointer constant is an rvalue integer type that evaluates to zero. So please let's not have any more nonsense about null pointers not having to equal zero.

James 2010-09-15 18:41:13

Now watch it become c++12, or would that be c++0xb?

BCS 2010-09-15 18:50:46

Good points. See my edits.

BCS 2010-09-15 18:51:35

Answer 9

A:

You'd have to consider your particular operating system and hardware architecture. If you're only interested in detecting pointers that are "close to null" then you could use ASSERT(i > pageSize), assuming that the first page is always write protected in your OS.

But ... the obvious question is: Why bother? The OS will detect the null in this case and SEGV as you pointed out, which is just as good as an ASSERT, isn't it?

Dobes Vandermeer 2010-09-15 18:51:43

A seg-v might be as good as a assert (assuming I use tc.'s answer) but may not be in some cases. For instance, you might not be able to recover a stack trace, you may be unable to throw an exception or any number of other things.

BCS 2010-09-15 19:53:16

ansaurus

tags:

views:

answers:

How many bits to ignore when checking for NULL?

related questions