I need to write my own memory allocation functions for the GMP library, since the default functions call abort() and leave no way I can think of to restore program flow afterwards (I have calls to mpz_init all over the place, and how to handle the failure depends on what happened around that call). However, the documentation requires that the value the function returns not be NULL.

Is there at least one range of addresses that can always be guaranteed to be invalid? It would be useful to know them all, so I could use different addresses for different error codes, or possibly even different ranges for different families of errors.

+2  A: 

No, there isn't a portable range of invalid pointer values.

You could use platform-specific definitions, or you could use the addresses of some global objects:

const void *const error_out_of_bounds = &error_out_of_bounds;
const void *const error_no_sprockets = &error_no_sprockets;

[Edit: sorry, missed that you were hoping to return these values to a library. As bdonlan says, you can't do that. Even if you find some "invalid" values, the library won't be expecting them. It is a requirement that your function must return a valid value, or abort.]

You could do something like this in globals:

void (*error_handler)(void*);
void *error_data;

Then in your code:

error_handler = some_handler;
error_data = &some_data;
mpz_init(something);

In your allocator:

if (allocated_memory_ok) return the_memory;
error_handler(error_data);
abort();

Setting up the error handler and data before calling mpz_init might be somewhat tedious, but depending on how different the behaviour is in different cases, you might be able to write some function or macro to deal with it.
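A minimal sketch of that handler-then-abort pattern, assuming the allocator would be registered through GMP's `mp_set_memory_functions` together with matching reallocate and free hooks. The names `checked_alloc`, `count_failure` and `WITH_HANDLER` are hypothetical and exist only for illustration:

```c
#include <stddef.h>
#include <stdlib.h>

/* Globals as in the answer: set these before each library call. */
static void (*error_handler)(void *) = NULL;
static void *error_data = NULL;

/* Allocation hook with the signature GMP expects for its alloc function.
 * It never returns NULL: it either succeeds or does not return at all. */
static void *checked_alloc(size_t size)
{
    void *p = malloc(size ? size : 1);
    if (p != NULL)
        return p;
    if (error_handler != NULL)
        error_handler(error_data);  /* last chance to log or clean up */
    abort();
}

/* Hypothetical example handler: counts failures in an int. */
static void count_failure(void *data)
{
    ++*(int *)data;
}

/* Hypothetical convenience macro: install handler and data, then run a call. */
#define WITH_HANDLER(handler, data, call) \
    do { error_handler = (handler); error_data = (data); call; } while (0)
```

With something like this in place, a call site could read `WITH_HANDLER(count_failure, &failures, mpz_init(x));`, which keeps the per-call setup down to one line.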

What you can't do, though, is recover and carry on running if the GMP library isn't designed to cope after an allocation fails. You're at the mercy of your tools in that respect - if the library call doesn't return on error, then who knows what broken state its internals will be left in.

But that's a fully general view, whereas GMP is open source. You can find out what actually happens in mpz_init, at least for a particular release of GMP. There might be some way to ensure in advance that your allocator has enough memory to satisfy the request(s), or there might be some way to wriggle out without doing too much damage (as bdonlan says, a longjmp).

Steve Jessop
+3  A: 

If the default memory allocation functions abort(), and GMP's code can't deal with a NULL, then GMP is likely not prepared to deal with the possibility of memory allocation failures at all. If you return a deliberately invalid address, GMP's probably going to try to dereference it, and promptly crash, which is just as bad as calling abort(). Worse, even, because the stacktrace won't point at what's really causing the problem.

As such, if you're going to return at all, you must return a valid pointer, one which isn't being used by anything else.

Now, one slightly evil option would be to use setjmp() and longjmp() to exit the GMP routines. However, this will leave GMP in an unpredictable state - you should assume that you can never call a GMP routine again after this point. It will also likely result in memory leaks... but that's probably the least of your concerns at this point.
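As a rough illustration of the setjmp()/longjmp() escape, here is a sketch in which `library_call` stands in for a GMP routine and `simulate_failure` is a hypothetical switch used only to force the failure path. All names are assumptions, not part of GMP:

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

static jmp_buf oom_escape;        /* where to land when allocation fails */
static int simulate_failure = 0;  /* hypothetical switch to force a failure */

/* GMP-style alloc hook: never returns NULL, jumps out instead. */
static void *jumping_alloc(size_t size)
{
    void *p = simulate_failure ? NULL : malloc(size ? size : 1);
    if (p == NULL)
        longjmp(oom_escape, 1);
    return p;
}

/* Stand-in for a library routine that allocates internally. */
static void library_call(void)
{
    void *p = jumping_alloc(64);
    free(p);
}

/* Returns 0 on success, -1 if the allocator bailed out. */
static int guarded_call(void)
{
    if (setjmp(oom_escape) != 0)
        return -1;  /* library state is now suspect; do not call it again */
    library_call();
    return 0;
}
```

Anything the library allocated before the jump is leaked, and its internal state is unknown afterwards, which is why the caveats above still apply.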

Another option is to have a reserved pool in the system malloc - that is, at application startup:

emergencyMemory = malloc(bignumber);

Now if malloc() fails, you do free(emergencyMemory), and, hopefully, you have enough room to recover. Keep in mind that this only gives you a finite amount of headroom - you have to hope GMP will return to your code (and that code will check and see that the emergency pool has been used) before you truly run out of memory.
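The reserve idea could look roughly like the sketch below. The names `reserve_init`, `reserve_alloc` and `emergency_used` are hypothetical, and the hook is assumed to be installed as GMP's allocation function:

```c
#include <stddef.h>
#include <stdlib.h>

static void *emergency_memory = NULL;  /* block held back for emergencies */
static int emergency_used = 0;         /* callers check this after library calls */

/* Call once at startup. Returns 0 on success, -1 if the reserve cannot be made. */
static int reserve_init(size_t bytes)
{
    emergency_memory = malloc(bytes);
    emergency_used = 0;
    return emergency_memory != NULL ? 0 : -1;
}

/* GMP-style alloc hook: on failure, release the reserve and retry once. */
static void *reserve_alloc(size_t size)
{
    void *p = malloc(size ? size : 1);
    if (p == NULL && emergency_memory != NULL) {
        free(emergency_memory);
        emergency_memory = NULL;
        emergency_used = 1;
        p = malloc(size ? size : 1);
    }
    if (p == NULL)
        abort();  /* reserve exhausted as well: nothing left to try */
    return p;
}
```

After each library call the application would test `emergency_used` and, if it is set, start unwinding or shedding load before the freed headroom runs out.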

You can, of course, also use these two methods in combination - first use the reserved pool and try to recover, and if that fails, longjmp() out, display an error message (if you can), and terminate gracefully.

bdonlan
A: 

Only guaranteed on current mainstream operating systems (with virtual memory enabled) and CPU architectures:

-1L (means all bits on in a value large enough for a pointer)

Many libraries use this to mark pointers that have been freed. With it you can easily tell whether an error comes from using a NULL pointer or a dangling reference.

Works on HP-UX, Windows, Solaris, AIX, Linux, FreeBSD, NetBSD and OpenBSD, on i386, amd64, ia64, parisc, sparc and powerpc.

I think this is enough. I don't see any reason for more than these two values (0 and -1).

Lothar
A: 

If you only return e.g. 16-bit or 32-bit aligned pointers, an odd pointer address (LSB equal to 1) will be at least "mysterious", and would create an opportunity for using my all-time favorite bogus value 0xDEADBEEF (for 32-bit pointers) or 0xDEADBEEFBAADF00D (for 64-bit pointers).

S.C. Madsen
A: 

There are several ranges you can use; they are operating system and architecture specific.

Typically most platforms will reserve the first page (usually 4K bytes in length), to catch dereferencing of null pointers (plus room for a slight offset).

You can also point into the address range reserved for the operating system. On Linux this occupies the region from 0xc0000000 to 0xffffffff (on a 32-bit system). From userspace you won't have the necessary privileges to access this region.

Another option (if you want to allocate several such values) is to allocate a page without read or write permissions using mmap or equivalent, and use offsets into this page for each distinct error value.
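A sketch of the inaccessible-page approach, assuming a POSIX system where `mmap` supports `MAP_ANONYMOUS` (some platforms spell it `MAP_ANON`). The function names are hypothetical:

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static char *sentinel_page = NULL;  /* base of the inaccessible page */
static size_t sentinel_count = 0;   /* number of distinct codes available */

/* Map one page with no access rights. Returns 0 on success, -1 on failure. */
static int sentinel_init(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    void *p = mmap(NULL, (size_t)page, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    sentinel_page = p;
    sentinel_count = (size_t)page;
    return 0;
}

/* Pointer for error code `code`, or NULL if out of range or uninitialised. */
static void *sentinel_for(size_t code)
{
    if (sentinel_page == NULL || code >= sentinel_count)
        return NULL;
    return sentinel_page + code;
}

/* Reverse lookup: error code for `p`, or -1 if `p` is not a sentinel. */
static long sentinel_code(const void *p)
{
    uintptr_t base = (uintptr_t)sentinel_page;
    uintptr_t addr = (uintptr_t)p;
    if (sentinel_page == NULL || addr < base || addr - base >= sentinel_count)
        return -1;
    return (long)(addr - base);
}
```

Dereferencing any of these pointers faults, while comparing them is harmless, so each offset can serve as a distinct error value.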

The simplest solution is just to use values immediately below 0 (-1, -2, etc.) or immediately above it (1, 2, ...). You can be very certain these addresses are on inaccessible pages.

Matt Joiner
How is returning an invalid pointer better than how the GMP built-in routines handle OOMs - that is, `abort()`ing? GMP's going to dereference whatever you return and crash either way.
bdonlan
@bdonlan: I don't understand. I've provided means to obtain unique pointers guaranteed to be invalid, which was the question.
Matt Joiner
No Matt, you can't point at OS pages. Processes have addresses in virtual memory that don't have anything in common with the addresses that the kernel sees (and reserves). So in the virtual memory of the application process, there is no such thing as reserved pages.
Jens Gustedt
@Matt, read the OP's entire question - he's looking for a way to return a value _from a GMP allocator routine_ and detect the error later. GMP's going to immediately dereference that pointer and crash, so the reserved value will never end up somewhere you can check for it.
bdonlan
@Jens Gustedt: On the contrary, a region of virtual memory *is* reserved for the kernel; dereferencing addresses in it from user mode will cause errors (Windows, I believe, has a 2G/2G split, as opposed to the 3G/1G split I mention in my answer). Take a look here: http://kerneltrap.org/node/2450 if you're not familiar with the concept. I like your suggested combined solution. Combined with the fact that `sizeof(char)` is always `1`, it's very parsimonious.
Matt Joiner
@Matt: ah, I missed that you restricted your statement to 32-bit systems. Do you see something similar for 64-bit systems?
Jens Gustedt
@Jens Gustedt: Yes, Linux on amd64 currently uses negative addresses with a slight offset for kernel space mappings, and 48-bit addresses, something like `0xffff800000000000` upwards.
Matt Joiner
A: 

A possibility is to take addresses of C library objects that are guaranteed to exist and thus will never be returned by malloc or similar. To be most portable these should be object pointers and not function pointers, but casting ((void*)main) would probably be OK on most architectures. One data pointer that comes to mind is environ, but that is POSIX only; others such as stdin are not guaranteed to be "real" variables.

To use this you could just use the following:

extern char** environ; /* guaranteed to exist in POSIX */
#define DEADBEAF ((void*)&environ)
Jens Gustedt
A: 

Since nobody has provided the correct answer, the set of non-NULL memory addresses you can safely use as error values is the same as the set of addresses you create for this purpose. Simply declare a static const char (or global const char if you need it to be globally visible) array whose size N is the number of error codes you need, and use pointers to the N elements of this array as the N error values.

If your pointer type is not char * but something else, you may need to use an object of that type instead of a char array, since converting these char pointers into another pointer type is not guaranteed to work.
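The array-of-sentinels idea might be sketched as follows. The error names and helper functions are hypothetical; only the addresses of the array elements matter, never their contents:

```c
#include <stddef.h>

/* Hypothetical error codes; ERR_COUNT is the number of sentinels needed. */
enum { ERR_OUT_OF_BOUNDS, ERR_NO_SPROCKETS, ERR_COUNT };

/* One byte per error code. No other object can share these addresses. */
static const char error_slots[ERR_COUNT] = {0};

/* Sentinel pointer for a given error code. */
static const void *error_ptr(int code)
{
    return &error_slots[code];
}

/* Reverse lookup: error code for `p`, or -1 if `p` is not a sentinel. */
static int error_code(const void *p)
{
    for (int i = 0; i < ERR_COUNT; ++i)
        if (p == (const void *)&error_slots[i])
            return i;
    return -1;
}
```

Because the lookup uses only equality comparisons, it stays well defined for any pointer passed in, including ones that point into unrelated objects.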

R..
What's with the -1?
R..
Your answer is certainly an interesting approach, and a very clean concept, but I don't see how my answer is wrong.
Matt Joiner
@Matt: see my comment to your answer.
Jens Gustedt
R..: no idea about the -1, somebody seems to be severe with us. For your solution you don't need to produce pointers to other types of objects; you just always have to cast to `void*`, because comparison is then well defined. Your approach has the disadvantage of creating new library symbols (which must then exist) and the advantage of being able to create as many as you want. Mine is a bit more restricted (you'd have to find some more symbols if you need them) but doesn't add any new symbol to the code.
Jens Gustedt
My approach does not have to create new symbols. You can make the objects static and expose their addresses via an existing interface or a new one. Or, if you don't need them except in a single translation unit (which might be likely, especially if you're using these sentinel pointers as values for a "pointer to opaque data" that you pass into an API function that uses a callback or something) then there's no need to expose them at all.
R..