views:

409

answers:

10

Hi,

I will begin to use C for an Operating Systems course soon and I'm reading up on best practices on using C so that headaches are reduced later on.

This has always been one of my first questions regarding arrays, as they are easy to screw up.

Is it a common practice out there to bundle an array and the variable containing its length in a struct?

I've never seen it in books; they usually keep the two separate or use something like the sizeof(array[]/array[1]) kind of deal.

But with wrapping the two into a struct, you'd be able to pass the struct both by value and by reference which you can't really do with arrays unless using pointers, in which case you have to again keep track of the array length.

I am beginning to use C so the above could be horribly wrong, I am still a student.

Cheers, Kai.

+1  A: 

I don't see anything wrong with doing that, but I think the reason it is not usually done is the overhead incurred by such a structure. Most C is bare-metal code for performance reasons and as such, abstractions are often avoided.

Andrew Hare
Hmmm...is that not a bit of a generalisation?
Steve Melnikoff
+3  A: 

Sure, you can do that. Not sure if I'd call it a best practice, but it's certainly a good idea to make C's rather rudimentary arrays a bit more manageable. If you need dynamic arrays, it's almost a requirement to group the various fields needed to do the bookkeeping together.

Sometimes you have two sizes in that case: one current, and one allocated. This is a tradeoff: you get fewer allocations, and therefore some speed, and you pay for it with a bit of memory overhead.
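A minimal sketch of that two-size bookkeeping, assuming a hypothetical struct int_vec with helpers int_vec_push and int_vec_free (these names are made up for illustration, and the doubling policy and starting capacity of 8 are just one common choice):

```c
#include <stdlib.h>

/* Hypothetical growable int array: len is the number of elements in use,
   cap is the number of elements currently allocated. */
struct int_vec {
    int *data;
    size_t len;
    size_t cap;
};

/* Append one value, doubling the capacity when the array is full.
   Returns 0 on success, -1 if allocation fails (the array is left unchanged).
   Overflow of new_cap * sizeof is not checked in this sketch. */
int int_vec_push(struct int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* Release the storage and reset the bookkeeping fields. */
void int_vec_free(struct int_vec *v)
{
    free(v->data);
    v->data = NULL;
    v->len = v->cap = 0;
}
```

Starting from a zero-initialized struct (struct int_vec v = {0};), each push only reallocates when len catches up with cap, so most appends cost no allocation at all.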

Many times arrays are only used locally, and are of static size, which is why the sizeof operator is so handy to determine the number of elements. Your syntax is slightly off with that, by the way, here's how it usually looks:

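#include <stddef.h>  /* for size_t */

int array[4711];
size_t i;  /* sizeof yields a size_t, so an unsigned index avoids a signed/unsigned comparison */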

for(i = 0; i < sizeof array / sizeof *array; i++)
{
  /* Do stuff with each element. */
}

Remember that sizeof is an operator, not a function: the parentheses are only required when the operand is a type name.
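A small sketch of the two operand forms, assuming a hypothetical ARRAY_LEN macro and demo_count function (neither is part of the standard library):

```c
#include <stddef.h>

/* Element count of a true array. Must not be applied to a pointer. */
#define ARRAY_LEN(a) (sizeof (a) / sizeof *(a))

size_t demo_count(void)
{
    int array[4711];
    size_t a = sizeof array / sizeof *array;   /* expression operands: no parentheses needed */
    size_t b = sizeof(array) / sizeof(int);    /* a type name operand needs parentheses */
    return (a == b) ? ARRAY_LEN(array) : 0;    /* both forms give the element count */
}
```

Dividing by sizeof *array rather than sizeof(int) has the advantage that the expression stays correct if the element type is changed later.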

EDIT: One real-world example of exactly the kind of wrapper you describe is the GArray type provided by glib. The user-visible part of the declaration is:

typedef struct {
  gchar *data;
  guint len;
} GArray;

Programs are expected to use the provided API to access the array whenever possible, not poke these fields directly.

unwind
+6  A: 

Yes this is a great practice to have in C. It's completely logical to wrap related values into a containing structure.

I would go even further. It would also serve your purpose to never modify these values directly. Instead, write functions which act on these values as a pair inside your struct to change the length and alter the data. That way you can add invariant checks and also make it very easy to test.
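As a rough sketch of what such functions could look like, assuming a hypothetical struct int_array whose storage is owned by the caller (all names here are invented for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper: a pointer to caller-owned storage plus its length. */
struct int_array {
    int *data;
    size_t len;
};

/* Invariant: a non-empty array must have a non-null data pointer. */
static int int_array_ok(const struct int_array *a)
{
    return a != NULL && (a->len == 0 || a->data != NULL);
}

/* Bounds-checked read: returns 0 and stores the element in *out on success,
   -1 if index is out of range. */
int int_array_get(const struct int_array *a, size_t index, int *out)
{
    assert(int_array_ok(a));
    if (index >= a->len)
        return -1;
    *out = a->data[index];
    return 0;
}

/* Bounds-checked write: returns 0 on success, -1 if index is out of range. */
int int_array_set(struct int_array *a, size_t index, int value)
{
    assert(int_array_ok(a));
    if (index >= a->len)
        return -1;
    a->data[index] = value;
    return 0;
}
```

Because every access goes through these two functions, the bounds check and the invariant live in one place instead of being repeated at each call site.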

JaredPar
A: 

I've never seen it done that way, but I haven't done OS level work in over a decade... :-) Seems like a reasonable approach at first glance. Only concern would be to make sure that the size somehow stays accurate... Calculating as needed doesn't have that concern.

Brian Knoblauch
+1  A: 

I haven't seen it done in books much either, but I've been doing the same thing for a while now. It just seems to make sense to "package" those things together. I find it especially useful if you need to return an allocated array from a function, for instance.
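A minimal sketch of returning an allocated array that way, assuming a hypothetical struct int_buf and a made-up make_squares function:

```c
#include <stdlib.h>

/* Hypothetical wrapper returned by value: pointer and element count travel together. */
struct int_buf {
    int *data;   /* owned by the caller after return; release with free() */
    size_t len;  /* number of elements, 0 if allocation failed */
};

/* Build an array holding the squares 0, 1, 4, ... of the first n integers. */
struct int_buf make_squares(size_t n)
{
    struct int_buf r = { NULL, 0 };
    r.data = malloc(n * sizeof *r.data);
    if (r.data == NULL)
        return r;               /* len stays 0 on failure */
    for (size_t i = 0; i < n; i++)
        r.data[i] = (int)(i * i);
    r.len = n;
    return r;
}
```

Without the struct, the function would need an extra out-parameter for the length, and the caller would have to keep the two values paired by hand.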

Eric Petroelje
A: 

Considering you can calculate the size of the array (in bytes, that is, not the number of elements) and the compiler will replace the sizeof expression with the actual value (it's not a function; the compiler substitutes the value it 'returns'), the only reason you'd want to wrap it in a struct is for readability.

It isn't a common practice. If you did it, someone looking at your code might assume the length field was some special value, and not just the array size. That's a danger lurking in your proposal, so you'd have to be careful to comment it properly.

gbjbaanb
The problem with relying on sizeof is that it will not help with dynamically allocated arrays.
Michael Burr
...or when you pass the array to a function as a pointer.
Judge Maygarden
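To illustrate the point made in the comments above, here is a small sketch (the function names are hypothetical) showing that sizeof only sees the pointer once an array has been passed to a function:

```c
#include <stddef.h>

/* Inside the function the parameter is just an int *, so sizeof reports
   the size of a pointer, not the size of the caller's array. */
size_t size_seen_by_callee(int *p)
{
    return sizeof p;
}

/* Where the array itself is in scope, sizeof reports the whole array. */
size_t size_seen_by_owner(void)
{
    int array[100];
    return sizeof array;   /* 100 * sizeof(int) */
}
```

This is why a function that receives an array needs the length handed to it separately, whether as an extra parameter or as a struct member.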
+3  A: 

There are three ways.

  1. For a static array (not dynamically allocated and not passed as a pointer) the size is known at compile time, so you can use the sizeof operator, like this: sizeof(array)/sizeof(array[0])
  2. Use a terminator (a special value for the last array element which cannot be used as a regular array value), like null-terminated strings
  3. Use a separate value, either as a struct member or as an independent variable. It doesn't really matter, because all the standard functions that work with arrays take a separate size variable; however, joining the array pointer and size into one struct will increase code readability. I suggest using a struct to get a cleaner interface for your own functions. Please note that if you pass your struct by value, the called function will be able to change the array, but not the size variable, so passing a pointer to the struct would be a better option.
qrdl
On the last point, about changing the array if passing by value: do you mean you can change the array itself or the items pointed at (if it's an array of pointers)? Otherwise, I don't see how you can change one but not the other.
mmccoo
If the struct contains a pointer to the array and the array size (as I've mentioned), changing the array won't be a problem, because the array pointer will be passed by value, not the array itself.
qrdl
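A short sketch of the behaviour discussed in this exchange, assuming a hypothetical struct int_span holding a pointer and a size (all names are invented for illustration):

```c
#include <stddef.h>

/* Hypothetical pointer-plus-size wrapper. */
struct int_span {
    int *data;
    size_t size;
};

/* Receives a copy of the struct: the pointer is copied, the elements are not. */
void modify_copy(struct int_span s)
{
    if (s.size > 0)
        s.data[0] = 99;   /* visible to the caller: same underlying array */
    s.size = 0;           /* not visible: only the local copy changes */
}

/* Receives a pointer to the struct: both kinds of change reach the caller. */
void modify_through_pointer(struct int_span *s)
{
    if (s->size > 0)
        s->data[0] = -1;
    s->size = 0;
}
```

After modify_copy the caller sees the changed element but still has its original size; after modify_through_pointer both the element and the size field have changed.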
A: 

If you use static arrays, you have access to the size of the array using the sizeof operator. If you put the array into a struct, you can pass it to a function by value, reference and pointer. Passing an argument by reference and by pointer is the same at the assembly level (I'm almost sure of it).

But if you use dynamic arrays, you don't know the size of the array at compile time. So you can store this value in a struct, but the struct will then hold only a pointer to the array:

struct Foo {
  int *myarray;
  int size;
};

So you can pass this structure by value, but what you are really passing is a pointer to int (the pointer to the array) and an int (the size of the array).

In my opinion it won't help you much. The only plus is that you store the size and the array in one place, so it is easy to get the size of the array. If you use a lot of dynamic arrays, you can do it this way. But if you use only a few arrays, it is easier not to use structures.

klew
ANSI C does not support passing variables by reference. Perhaps you are thinking of C++?
Judge Maygarden
I was thinking about C, but it was a long time ago when I was writing in C, so it's my mistake.
klew
+1  A: 

For a public API I'd go with the array and the size value separated. That's how it is handled in most (if not all) C libraries I know. How you handle it internally is completely up to you. So using a structure plus some helper functions/macros that do the tricky parts for you is a good idea. It always gives me a headache to re-think how to insert or remove an item, so it's a good source of problems. Solving that once, generically, helps you keep the number of bugs low from the beginning. A nice implementation of dynamic and generic arrays is kvec.

quinmars
+1  A: 

I'd say it's a good practice. In fact, it's good enough that in C++ they've put it into the standard library and called it vector. Whenever you talk about arrays in a C++ forum, you'll get inundated with responses that say to use vector instead.

Michael Burr