views:

373

answers:

3

I just got an interesting problem to take care of, and I see no neat way to solve it.

I have two base data structures that represents a complex graph, declared something like this:

typedef struct _node_t node_t;
typedef struct _graph_t graph_t;

struct {
    /* Data fields omitted */
    node_t * pNextByLevel;
    node_t * pNextByProximity;
    node_t * pNextByRank;
} node_t;

struct {
    /* Data fields omitted */
    size_t nNodes;
    size_t nMaxNodes;
    node_t * pFirstByLevel;
    node_t * pFirstByProximity;
    node_t * pFirstByRank;
} graph_t;

The actual nodes are laid out immediately after the header, so a "graph_t" is normally created with

graph_t * pNewBuffer = calloc(1, sizeof(graph_t) + nMaxNodes * sizeof(node_t));
pNewBuffer->nMaxNodes = nMaxNodes;

and the "raw" array of nodes is accessed with

node_t * pNewBufferNodes = (node_t *) &pNewBuffer[1];

Now, there is a support function that operates on a buffer that reduces the number of nodes. It looks something like this:

status_t reduce(graph_t** ppBuffer)
{
    graph_t * pReplacement, * pOld = *ppBuffer;
    size_t nRequired; 
    node_t * oldBuffer = (node_t *) &pOld[1];

    /* complex calculation ultimately computes 'nRequired' */

    pReplacement = realloc(pOld, sizeof(graph_t) + nRequired * sizeof(node_t));

    if ( pReplacement != pOld )
    {
        int i;
        node_t * newBuffer = (node_t *) &pReplacement[1];
        ptrdiff_t offset = newBuffer - oldBuffer;

        for ( i = 0; i < requiredNodes; i++ )
        {
            newBuffer[i].pFirstByLevel += offset;
            newBuffer[i].pFirstBySimilarity += offset;
            newBuffer[i].pFirstByRank += offset;
        }
        *ppBuffer = pReplacement;
    }
}

Now, this has worked beautifully for a long time. Any errors in the above comes from the fact that I'm writing from memory, I'm just trying to explain the idea.

What baffles me right now is that when using the reduction function from a new module, the input is not "properly" aligned. When I examine the addresses, I note the following properties:

 ((char *) newBuffer - (char *) oldBuffer) % sizeof(graph_t) == 0
 ((size_t) newBuffer) % sizeof(node_t) == 0
 ((size_t) oldBuffer) % sizeof(node_t) == 0
 ((char *) newBuffer - (char *) oldBuffer) % sizeof(node_t) == sizeof(node_t) / 2

which, of course, causes a bit of problem since the "offset" value becomes incorrect, but it's not so obvious since all other use of the data structures work (there is no "real" alignment problem).

Which boils down to my question - Do you see a neat way of incrementing the pointers when the offset can not be expressed as a whole number of elements?

Bonus points for finding a way that does not resort to excessive casting :)

+2  A: 

On ptrdiff_t : "This is the type returned by the subtraction operation between two pointers. This is a signed integral type, and as such can be casted to compatible fundamental data types. A subtraction of two pointers is only granted to have a valid defined value for pointers to elements of the same array (or for the element just past the last in the array). For other values, the behavior depends on the system characteristics and compiler implementation."

As you use realloc, you are not in this case. So your offset will not be an int. That explain your problem.

The no bonus points solution is to cast your pointers into char* to compute the offset.You will end up with an offset in bytes. you can then add the byte offset using casts. To minimize the casting you can write a helper function that set the correct value to your node pointers.

If you want to use realloc, I do not see another solution, as you initial array was freed by realloc. The byte offset seems the only way.

You can calloc your reduced array, copy the nodes, then free the old array. But you lose the realloc advantage when reallocation is done in place.

Other solutions force you to change your data structure. You can allocate your nodes independantly with malloc and the reduction is simplier. You just have to free the nodes you do not need anymore. That seems the cleanest way, but you have to refactor ...

I hope it helps. Tell me if I have misunderstood...

neuro
Ah, but of course! You're quite correct about the ptrdiff_t problem.
Christoffer
+1  A: 

If you don't want to cast:

newBuffer[i].pFirstByLevel = newBuffer[i].pFirstByLevel - oldBuffer + newBuffer;            
newBuffer[i].pFirstBySimilarity = newBuffer[i].pFirstBySimilarity - oldBuffer + newBuffer;            
newBuffer[i].pFirstByRank = newBuffer[i].pFirstByRank - oldBuffer + newBuffer;
Dingo
+1  A: 

Your syntax is messed up. The structure tag name goes before the structure definition; anything after is a declaration.

Either

typedef struct _node_t {
    /* Data fields omitted */
    node_t * pNextByLevel;
    node_t * pNextByProximity;
    node_t * pNextByRank;
} node_t;

or

typedef struct _graph_t graph_t;
struct _graph_t {
    /* Data fields omitted */
    size_t nNodes;
    size_t nMaxNodes;
    node_t * pFirstByLevel;
    node_t * pFirstByProximity;
    node_t * pFirstByRank;
};

would be what you meant to write.


This is a somewhat common workaround, but requires some restructuring of your existing code.

/* same node_t as before */
typedef struct _node_t {...} node_t;
/* same graph_t as before */
typedef struct _graph_header_t {...} graph_header_t;
/* new type */
typedef struct _graph_t {
    graph_header_t header;
    node_t nodes[1];
} graph_t;

graph_t pNewBuffer = calloc(1, sizeof(graph_t) + (nMaxNodes-1) * sizeof(node_t));

It permits access to pNewBuffer->nodes[i] for 0 <= i < nMaxNodes, no casting required anywhere.

Now, this would be nicer if you could declare node_t nodes[0], avoiding the off-by-one when calculating allocation sizes, but even though some compilers are happy with it, I don't believe that it's accepted by the standard.

C99 introduces "flexible array members"

typedef struct _graph_t {
    graph_header_t header;
    node_t nodes[];
} graph_t;

which is pretty much the same thing, but defined by an actual standard. Some exceptions: flexible array members can only be placed at the end of a structure, and sizeof(pNewBuffer->nodes) is invalid (though GCC returns 0). Otherwise, sizeof(graph_t) is equal to whatever the size would be if there were the node_t[] array had zero elements.

ephemient
Yes, I'd prefer to use flexible C arrays, but this particular module needs to be C89 :(I don't get you comment about typedefs though, in reality these are declared in separate files (typedefs in header, struct declarations in C file)
Christoffer
I'm just commenting that your real code must say `struct _foo_t {...}` and not `struct {...} _foo_t` as you've written here.
ephemient
Ah, you're right.
Christoffer