I've written an API that requires a context to be initialized and thereafter passed into every API call. The caller allocates the memory for the context, and then passes it to the init function with other parameters that describe how they want later API calls to behave. The context is opaque, so the client can't really muck around in there; it's only intended for the internal use of the API functions.

The problem I'm running into is that callers are allocating the context, but not initializing it. As a result, subsequent API functions are referring to meaningless garbage as if it were a real context.

I'm looking for a way to verify that the context passed into an API function has actually been initialized. I'm not sure if this is possible. Two ideas I've thought of are:

  1. Use a pre-defined constant and store it in a "magic" field of the context to be verified at API invocation time.
  2. Use a checksum of the contents of the context, storing this in the "magic" field and verifying it at invocation time.

Unfortunately I know that either one of these options could result in a false positive verification, either because random crap in memory matches the "magic" number, or because the context happens to occupy the same space as a previously initialized context. I think the latter scenario is more likely.

Does this simply boil down to a question of probability? That I can avoid false positives in most cases, but not all? Is it worth using a system that merely gives me a reasonable probability of accuracy, or would this just make debugging other problems more difficult?
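For concreteness, the two ideas above could be combined roughly as follows. This is only a sketch under assumptions: `api_context`, `api_init`, `api_context_ok`, the `flags` field, and the constant `CTX_MAGIC` are hypothetical names, and the struct layout is shown in the open only for illustration (in an opaque API it would live in the library's private source). The checksum here is a simple FNV-1a hash; any checksum would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CTX_MAGIC 0x43545831u   /* arbitrary constant ("CTX1"), an assumption */

typedef struct {
    uint32_t magic;     /* idea 1: fixed constant written by api_init */
    uint32_t flags;     /* stand-in for the real settings */
    uint32_t checksum;  /* idea 2: checksum over the fields above */
} api_context;

/* FNV-1a hash over the bytes that precede the checksum field. */
static uint32_t ctx_checksum(const api_context *ctx)
{
    assert(ctx != NULL);  /* internal helper: callers check for NULL first */
    const unsigned char *p = (const unsigned char *)ctx;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < offsetof(api_context, checksum); i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Initialise caller-allocated memory; returns 0 on success, -1 on error. */
int api_init(api_context *ctx, uint32_t flags)
{
    if (ctx == NULL)
        return -1;
    ctx->magic = CTX_MAGIC;
    ctx->flags = flags;
    ctx->checksum = ctx_checksum(ctx);
    return 0;
}

/* Returns 1 if ctx looks initialised and unmodified, 0 otherwise. */
int api_context_ok(const api_context *ctx)
{
    return ctx != NULL
        && ctx->magic == CTX_MAGIC
        && ctx->checksum == ctx_checksum(ctx);
}
```

Every API function would call `api_context_ok` first and return an error if it fails. Any function that changes a field would also have to recompute the checksum. As the question notes, this only lowers the odds of a false positive; it cannot eliminate them.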

A: 

To sidestep the issue of a memory location of a previous context being reused, you could, in addition to freeing the context, reset it and remove the "magic" number, assuming of course that the user frees the context using your API. That way when the system returns that same block of memory for the next context request, the magic number check will fail.

codelogic
+6  A: 

Best solution, I think, is to add create()/delete() functions to your API and use create() to allocate and initialize the structure. You can put a signature at the start of the structure to verify that the pointer you are passed points to memory allocated with create(), and use delete() to overwrite the signature (or the entire buffer) before freeing the memory.

You can't completely avoid false positives in C, because memory the caller malloc'd might just "happen" to start with your signature; but make your signature reasonably long (say 8 bytes) and the odds are low. Taking allocation out of the hands of the caller by providing a create() function will go a long way, though.

And, yeah, your biggest risk is that an initialized buffer is free'd without using delete(), and a subsequent malloc happens to reuse that memory block.
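A minimal sketch of the create()/delete() approach described above, under assumptions: the names `api_create`, `api_destroy`, `api_is_valid`, `api_get_verbose`, the `verbose` field, and the signature value are all hypothetical. Only the `typedef` would appear in the public header; the struct definition would stay in the library's source file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Arbitrary 8-byte signature; the value itself is an assumption. */
#define CTX_SIGNATURE UINT64_C(0x4D794374785F7631)

struct api_context {
    uint64_t signature;  /* first member, checked on every call */
    int verbose;         /* stand-in for the real settings */
};
typedef struct api_context api_context;

/* Allocate and initialise in one step, so the caller never sees raw memory. */
api_context *api_create(int verbose)
{
    api_context *ctx = malloc(sizeof *ctx);
    if (ctx == NULL)
        return NULL;
    ctx->signature = CTX_SIGNATURE;
    ctx->verbose = verbose;
    return ctx;
}

/* Returns 1 if ctx carries the signature, 0 otherwise. */
int api_is_valid(const api_context *ctx)
{
    return ctx != NULL && ctx->signature == CTX_SIGNATURE;
}

/* Example API call: debug builds trap on a bad context. */
int api_get_verbose(const api_context *ctx)
{
    assert(api_is_valid(ctx));
    return ctx->verbose;
}

/* Wipe the signature before freeing, so a stale pointer fails the check
   if the block is handed out again. Returns 0 on success, -1 on error. */
int api_destroy(api_context *ctx)
{
    if (!api_is_valid(ctx))
        return -1;
    /* volatile store, so the compiler does not drop the write as dead */
    *(volatile uint64_t *)&ctx->signature = 0;
    free(ctx);
    return 0;
}
```

Reading the signature through a pointer that has already been freed is still undefined behaviour, so the wipe in `api_destroy` is a debugging aid, not a guarantee.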

Software Monkey
+1  A: 

You could define a new API call that takes uninitialised memory and initialises it in whatever way you need. Then, part of the client API is that the client must call the context initialisation function, otherwise undefined behaviour will result.

Greg Hewgill
A: 

See what your system does with uninitialized memory. Here is what Microsoft's compiler does: http://stackoverflow.com/questions/65724/uninitialized-memory-blocks-in-vc

Ray Tayek
This is only true for debug builds - you cannot rely on this for release code
1800 INFORMATION
+3  A: 

Your context variable is probably at the moment some kind of pointer to allocated memory. Instead of this, make it a token or handle that can be explicitly verified. Every time a context is initialised, you return a new token (not the actual context object) and store that token in an internal list. Then, when a client gives you a context later on, you check it is valid by looking in the list. If it is, the token can then be converted to the actual context and used, otherwise an error is returned.

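#include <map>
#include <utility>

// Placeholder for the library's private state.
struct InternalContext {};

// The token handed to clients; it is never dereferenced.
typedef long Context;

// Every token issued so far, mapped to its real context object.
typedef std::map<Context, InternalContext*> Contexts;
Contexts _contexts;

// Hand out a fresh token value on every call.
Context nextContext()
{
  static Context next=0;
  return next++;
}

Context initialise()
{
  Context c=nextContext();
  _contexts.insert(std::make_pair(c, new InternalContext));
  return c;
}

void doSomethingWithContext(Context c)
{
  Contexts::iterator it=_contexts.find(c);
  if (it==_contexts.end())
    throw "invalid context";
  // otherwise do stuff with the valid context variable
  InternalContext *internalContext=it->second;
  (void)internalContext;  // silence the unused-variable warning
}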

With this method, a bogus context can never cause an invalid memory access, because the library only ever dereferences contexts that it finds in its own list.

1800 INFORMATION
+3  A: 

Look at the paper by Matt Bishop on Robust Programming. The use of tickets or tokens (similar to file handles in some respects, but also including a nonce - number used once) allows your library code to ensure that the token it is using is valid. In fact, you allocate the data structure on behalf of the user, and pass back to the user a ticket which must be provided for each call to the API you define.

I have some code based closely on that system. The header includes the comments:

/*
** Based on the tickets in qlib.c by Matt Bishop ([email protected]) in
** Robust Programming.  Google terms: Bishop Robust Nonce.
** http://nob.cs.ucdavis.edu/~bishop/secprog/robust.pdf
** http://nob.cs.ucdavis.edu/classes/ecs153-1998-04/robust.html
*/

I also built an arena-based memory allocation system using tickets to identify different arenas.
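To illustrate the ticket idea, here is a small sketch. It is not Bishop's qlib code or the code mentioned above; it only shows the general shape under assumptions. All names (`ticket_t`, `ctx_open`, `ctx_get_setting`, `ctx_close`), the fixed table size, and the bit layout of the ticket are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CONTEXTS 16
#define BAD_TICKET   0u

typedef uint32_t ticket_t;  /* high 24 bits: nonce, low 8 bits: slot index */

typedef struct {
    int      in_use;
    uint32_t nonce;    /* nonce issued with the ticket that owns this slot */
    int      setting;  /* stand-in for the real context data */
} slot;

static slot     slots[MAX_CONTEXTS];
static uint32_t next_nonce = 1;  /* never 0, so no ticket equals BAD_TICKET */

/* Allocate a slot on the caller's behalf and return a ticket for it. */
ticket_t ctx_open(int setting)
{
    for (uint32_t i = 0; i < MAX_CONTEXTS; i++) {
        if (!slots[i].in_use) {
            slots[i].in_use = 1;
            slots[i].nonce = next_nonce;
            slots[i].setting = setting;
            /* Cycle through 1..0xFFFFFF; after a wrap, nonces repeat. */
            next_nonce = (next_nonce % 0xFFFFFFu) + 1;
            return (slots[i].nonce << 8) | i;
        }
    }
    return BAD_TICKET;  /* table full */
}

/* Map a ticket back to its slot, or NULL if it is stale or made up. */
static slot *lookup(ticket_t t)
{
    uint32_t i = t & 0xFFu;
    uint32_t n = t >> 8;
    if (i >= MAX_CONTEXTS || !slots[i].in_use || slots[i].nonce != n)
        return NULL;
    return &slots[i];
}

/* Example API call; returns 0 on success, -1 for an invalid ticket. */
int ctx_get_setting(ticket_t t, int *out)
{
    assert(out != NULL);
    slot *s = lookup(t);
    if (s == NULL)
        return -1;
    *out = s->setting;
    return 0;
}

/* Release the slot; the old ticket stops working immediately. */
int ctx_close(ticket_t t)
{
    slot *s = lookup(t);
    if (s == NULL)
        return -1;
    s->in_use = 0;
    return 0;
}
```

Because the nonce changes each time a slot is handed out, a ticket kept after `ctx_close` is rejected even when the same slot is reused. The client never holds a pointer, so a garbage ticket produces an error return instead of a wild memory access.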

Jonathan Leffler