views:

2295

answers:

8

I'm trying to create a function which takes an array as an argument, adds values to it (increasing its size if necessary) and returns the count of items. So far I have:

int main(int argc, char** argv) {
    int mSize = 10;
    ent a[mSize];
    int n;
    n = addValues(a,mSize);

    for(i=0;i<n;i++) {
       //Print values from a
    }
}

int addValues(ent *a, int mSize) {
    int size = mSize;

    i = 0;

    while(....) { //Loop to add items to array
        if(i>=size-1) { 
            size = size*2;
            a = realloc(a, (size)*sizeof(ent));
        }
        //Add to array
        i++;
    }
    return i;
}

This works if mSize is large enough to hold all the potential elements of the array, but if it needs resizing, I get a Segmentation Fault.

I have also tried:

int main(int argc, char** argv) {
    ...
    ent *a;
    ...
}

int addValues(ent *a, int mSize) {
    ...
    a = calloc(1, sizeof(ent);
    //usual loop
    ...
}

To no avail.

I assume this is because when I call realloc, the copy of 'a' is pointed elsewhere - how is it possible to modify this so that 'a' always points to the same location?

Am I going about this correctly? Are there better ways to deal with dynamic structures in C? Should I be implementing a linked list to deal with these?

+6  A: 

Try reworking it so a pointer to a pointer to the array is passed in, i.e. ent **a. Then you will be able to update the caller on the new location of the array.

xahtep
You need to combine this with tgamblin's answer to get the full solution.
Mark Ransom
+11  A: 

The main problem here is that you're trying to use realloc with a stack-allocated array. You have:

ent a[mSize];

That's automatic allocation on the stack. If you wanted to use realloc() on this later, you would create the array on the heap using malloc(), like this:

ent *a = (ent*)malloc(mSize * sizeof(ent));

So that the malloc library (and thus realloc(), etc.) knows about your array. From the looks of this, you may be confusing C99 variable-length arrays with true dynamic arrays, so be sure you understand the difference there before trying to fix this.

Really, though, if you are writing dynamic arrays in C, you should try to use OOP-ish design to encapsulate information about your arrays and hide it from the user. You want to consolidate information (e.g. pointer and size) about your array into a struct and operations (e.g. allocation, adding elements, removing elements, freeing, etc.) into special functions that work with your struct. So you might have:

typedef struct dynarray {
   elt *data;
   int size;
} dynarray;

And you might define some functions to work with dynarrays:

// malloc a dynarray and its data and returns a pointer to the dynarray    
dynarray *dynarray_create();     

// add an element to dynarray and adjust its size if necessary
void dynarray_add_elt(dynarray *arr, elt value);

// return a particular element in the dynarray
elt dynarray_get_elt(dynarray *arr, int index);

// free the dynarray and its data.
void dynarray_free(dynarray *arr);

This way the user doesn't have to remember exactly how to allocate things or what size the array is currently. Hope that gets you started.

tgamblin
Thanks for this! Very useful info - I'm going to see if I can rework my code as you and Javier suggest.If I have multiple dynamic arrays of different types (An array of 'ents', an array of 'foos' etc.), would it be possible to create a single set of methods to deal with them all?
Tom
You could do this with templates in C++, but there's not a great way to deal with it in C. You could do it through a set of well-crafted macros, but that doesn't make for the prettiest or most maintainable code.
tgamblin
You can have one set of methods for different types, but in order to do so, you need to use void* pointers. You'll also need to know the size of each element, and perform lots of casting between void* and the actual type. This can be very dangerous and error-prone.
Adam Rosenfield
You can try storing the size of array element as part of dynarray structure. You will still need a lot of casts, though
Arkadiy
+1  A: 

You are passing the array pointer by value. What this means is:

int main(int argc, char** argv) {
    ...
    ent *a; // This...
    ...
}

int addValues(ent *a, int mSize) {
    ...
    a = calloc(1, sizeof(ent); // ...is not the same as this
    //usual loop
    ...
}

so changing the value of a in the addValues function does not change the value of a in main. To change the value of a in main you need to pass a reference to it to addValues. At the moment, the value of a is being copied and passed to addValues. To pass a reference to a use:

int addValues (int **a, int mSize)

and call it like:

int main(int argc, char** argv) {
    ...
    ent *a; // This...
    ...
    addValues (&a, mSize);
}

In the addValues, access the elements of a like this:

(*a)[element]

and reallocate the array like this:

(*a) = calloc (...);

Skizz

Skizz
+1  A: 

this is a nice reason to use OOP. yes, you can do OOP on C, and it even looks nice if done correctly.

in this simple case you don't need inheritance nor polymorphism, just the encapsulation and methods concepts:

  • define a structure with a length and a data pointer. maybe an element size.
  • write getter/setter functions that operate on pointers to that struct.
  • the 'grow' function modifies the data pointer within the struct, but any struct pointer stays valid.
Javier
A: 

Xahtep explains how your caller can deal with the fact that realloc() might move the array to a new location. As long as you do this, you should be fine.

realloc() might get expensive if you start working with large arrays. That's when it's time to start thinking of using other data structures -- a linked list, a binary tree, etc.

slim
The original code uses the common practice of doubling the size of the array on each realloc, so I don't think it will be too expensive.
Mark Ransom
In which case each realloc is twice as expensive as the last - but assuming a constant rate of increase, took double the time to become necessary. The cost is in the implicit memcpy() to the new location.
slim
Sure, but it's amortised constant time. The average case to add to a vector of pointers is a ptr write, increment the size, and amortize 1-2 ptrs worth of memcpy. To add to a doubly-linked list, it's 1 alloc off a fast heap, 1 ptr for the payload, and 4 for the cross-links. That's more work.
Steve Jessop
Especially since memcpy on most architectures is considerably faster than the same number of non-sequential pointer writes. I wouldn't set much store by a claim that exponential arrays are slow without profiling, as long as elements are only added/removed at the end.
Steve Jessop
I agree - never trust any efficiency claims without profiling.
slim
A: 

As stated you should pass pointer to pointer to update the pointer value.
But I would suggest redesign and avoid this technique, in most cases it can and should be avoided. Without knowing what exactly you trying to achieve it's hard to suggest alternative design, but I'm 99% sure that it's doable other way. And as Javier sad - think object oriented and you will always get better code.

Ilya
+1  A: 

If you changed the variable declaration in main to be

ent *a = NULL;

the code would work more like you envisioned by not freeing a stack-allocated array. Setting a to NULL works because realloc treats this as if the user called malloc(size). Keep in mind that with this change, the prototype to addValue needs to change to

int addValues(ent **a, int mSize)

and that the code needs to handle the case of realloc failing. For example

while(....) { //Loop to add items to array
    tmp = realloc(*a, size*sizeof(ent));
    if (tmp) {
        *a = tmp;
    } else {
        // allocation failed. either free *a or keep *a and
        // return an error
    }
    //Add to array
    i++;
}

I would expect that most implementations of realloc will internally allocate twice as much memory if the current buffer needs resizing making the original code's

size = size * 2;

unnecessary.

ctuffli
A: 

Are you really required to use C? This would be a great application of C++'s "std::vector", which is precisely a dynamically-sized array (easily resizeble with a single call you don't have to write and debug yourself).

Larry Gritz