tags:

views:

140

answers:

7

Consider following function:

int get_something (int* array, int size);

It's purpose is to fill the passed array[size] with data from external resource (queryes to resource are expensive). The question is what to do, if resource has more elements, than provided array can handle? What is the best approach?

Edit: Current solution added:

Our approach, at the moment is following:

When user calls get_something() first time, with null argument we perform a full Query, allocate data in a cache (which is just a key-value storage) and return a number of items.

When user calls get_something() next time, with properly initialized buffer, we return him data from cache and clear a cache entry.

If user does not call get_something(), timeout occurs and cache for that item gets freed.

If user calls get_something() too late, and data has been cleared, we generate error state, so user knows that he has to repeat the request.

A: 

Can you do a check on how many elements the resource has? If so I'd do that then copy the array to an array as large as the resource.

or perhaps copying the array to an array double its size when you're reaching near the end?

http://www.devx.com/tips/Tip/13291

Aidanc
Yes, I can check but it is as slow as retrieving data. So total performance will decrease by /2
Shaman
+5  A: 

One option is to not modify the array at all and instead return the needed size as the return result. The caller must then call your function again with an array of at least this size.

Mark Byers
This. See the behaviour of, say, fscanf.
DeadMG
Yes and you should add that in order to query what is needed, the caller should set the input parameter array to NULL. The only problem remaining is that the determination of available items is an expensive operation, that will have to be performed twice.
Adrian Regan
@Adrian Regan: Cache the resource.
DeadMG
To avoid performing operations twice we could even retrieve the full result from the underlying resource on first call and keep it in memory. But what to do, if a caller "forgets" to call function again? This could cause numerous memory leaks.
Shaman
A: 

That depends on how your program should handle that situation, I guess.

One approach could be to fill the array to it's maximum, and return the total number of elements which are available. That way the caller could check if he needs to call the function again (see Mark Byers answer).

Logic behind that:

- Creates array with 100 items
- Calls your function and gets 150 returned
- Increases the array size or creates a second one
  and calls your function again with that array
- Repeats that unless the returned item count is
  equal or less the array size
Bobby
Repeating requests is expensive, due to function gets elements from a slow resource.
Shaman
+2  A: 

Use realloc . Reference link .

Andrei Ciobanu
Don't forget to return a pointer to the array, or use int **.
Nathon
And what if they allocate a buffer on the stack?
JeremyP
@JeremyP In my opinion that would be a poor design choice. On the stack you shouldn't keep arrays with "variable size".
Andrei Ciobanu
@Andrei Ciobanu: The array doesn't have variable size. According to the API caller tells callee what the size of the array is. That leads to the expectation that where the memory for the array came from is none of callee's damned business. If the callee is going to realloc memory, it'd be better for the callee to take complete control and allocates in the first place.
JeremyP
+1  A: 

allocate array dynamically i.e using malloc() and then, in the function, either use realloc() or free the previous list and allocate another, fill it and return the new size. For the second approach you can use the return value for returning new size but to update the callers address of array you will need to change the function to accept int** instead of int*

binW
What will happen, if caller will initialize his array not will malloc()?
Shaman
Bad things will happen.
Nathon
if array is not dynamic i.e if its global, static or automatic then there will be a problem, as you will not be able to free it.
binW
+2  A: 

My choice would be to use the same model as fread() and turn the interface into a stream of sorts.

i.e.

  • either fill the buffer up or put all the items in it and return the number of items actually read
  • maintain some sort of state so that subsequent calls only get unread items
  • return 0 once all the items have been read
  • return a negative number if an error occurs.
JeremyP
+1  A: 

Ok, your basic requirement is to Query a resource, and cache the returned data in memory, to avoid multiple accesses.

That means you will have to allocate memory within your program to store all of the data.

Problem #1 is to populate that cache. I will assume that you have that figured out, and there is some function get_resource();

problem #2 is how to design an api to allow client/user code to interact with that data.

In your example you you are using an array allocated by the client, as the cache, hoping to solve both problems with 1 buffer, but this doesn't solve the problem in all cases ( hence your posting ). So you really need to separate the 2 problems.

Model number #1 is to provide iterator / cursor functionality

iterator = get_something();  // Triggers caching of data from Resource
data = get_next_single_something( iterator );
status = release_something( iterator );

// The logic to release the data could be done automagically in get_next, 
// after returning the last single_something, if that is always the use case.

Model #2 is to return the Whole object in a malloced buffer, and let the client manage the whole thing

data_type *pData=NULL;
unsigned size = get_something( &pData ); // Triggers caching of data from Resource
process( size, pData );
free( pData );
pData=NULL;

Model #3. If you are married to the client array, you can use Model #1 to return multiple values at once, but if there are more values, then get_something() will have to build a cache, and the client will still have to iterate.

jdu.sg
I think this is the best answer, but our approach differs, to fit some other requirements, not listed here.
Shaman
Yes, I expect that - I had to make some interpretations of your requirements, Also it is difficult to go into all of the variations that impact a design. You have to start by fixing one well defined problem, then address additional requirements, and refine the solution to accommodate that.
jdu.sg