ansaurus

Question

is the use of varargs in a C API a Good Idea(tm)

Answer 1

+7 A:

Generally, no.

Varargs throws out a lot of type-safety - you could pass pointers, floats, etc., instead of ints and it will compile without issue. Misuse of varargs, such as omitting arguments, can introduce odd crashes due to stack corruption, or reading invalid pointers.

For instance, the following call will compile and result in crashes or other odd behavior:

UpdateField(6, "Field1", 7, "Field2", "Foo");

The initial 6 is how many parameters to expect. It will convert the string pointer "Foo" to an int to put in Field2, and it will try to read and interpret two other parameters that aren't present, which will probably cause a crash here from dereferencing stack noise.

I believe the implementation of varargs in C is a mistake (given today's environment - it probably made perfect sense in 1972.) The implementation is you pass a bunch of values on the stack and then the callee will walk the stack picking up parameters, based on its interpretation of some initial control parameter. This type of implementation basically screams for you to make a mistake in what might be a very difficult to diagnose way. C#'s implementation of this, passing an array of objects with an attribute on the method, is just must saner, albeit not directly mappable into the C language.

Michael 2009-05-14 22:50:30

then pass a big string in a known format. Strings are more serialisation friendly and very portable. You just have to define a format to pass them in - XMl is one (though heavyweight), or just a simple csv would do.

gbjbaanb 2009-05-14 23:02:52

+1 to gbjbaanb - look at (x)printf - the most beautiful implementation of varags known :)

KevinDTimm 2009-05-14 23:13:55

Answer 2

+4 A:

One problem with varargs in C is that you don't know how many arguments are passed, so you need that as another parameter:

update(2, FIELD_NAME1, 10, FIELD_NAME2, 20);

update(3, FIELD_NAME1, 10, FIELD_NAME2, 20, FIELD_NAME3, 30);

Chris Lutz 2009-05-14 22:51:04

Why -1 this? It's a totally valid (and correct) response to the question. The OP has no way of knowing how many arguments are gonna be passed... AND that's a really good example of issues with using varargs.

Jason Coco 2009-05-14 23:52:07

Answer 3

+1 A:

I'd take a long hard look at any "update" functionality designed to be used externally (or even internally) which uses the same function to update many different fields in a structure. Is there a specific reason why you can't have discrete functionality for updating the fields?

McWafflestix 2009-05-14 22:55:16

Answer 4

+3 A:

Why not have one arg, an array. Even better, a pointer to an array.

struct field {
  int val;
  char* name;
};

or even...

union datatype {
  int a;
  char b;
  double c;
  float f;
// etc;
};

then

struct field {
  datatype val;
  char* name;
};

union (struct* field_name_val_pairs, int len);

ok 2 args. I lied, and thought a length param would be good.

ryansstack 2009-05-14 22:57:52

I think this is a good design, except that I would avoid the union (just give val a fixed type). This way, the compiler can help you enforce type safety. Of course, the length still has to be correct.

Matthew Flaschen 2009-05-15 08:10:07

This is good, and update_ary(int n, char **names, datatype* vals) is possible if a bit awkward at times.

dmckee 2009-05-15 13:37:35

Answer 5

+4 A:

I tend to avoid varargs except in one specific circumstance where it's very useful. Variable arguments don't really deliver all that much benefit over and above what can be done by individual function calls, especially in your case.

In terms of readability (and that's usually what I prefer over raw speed except for very specific cases), there's no real difference between the following two options (I've added a count to the varargs versions since you need either a count or sentinel to detect the end of the data):

update(2, FIELD_NAME1, 10, FIELD_NAME2, 20);
update(3, FIELD_NAME3, 10, FIELD_NAME4, 20, FIELD_NAME5, 30);
/* ========== */
update(FIELD_NAME1, 10);
update(FIELD_NAME2, 20);
update(FIELD_NAME3, 10);
update(FIELD_NAME4, 20);
update(FIELD_NAME5, 30);

In fact, as the varargs version gets longer, you'll need to split it up anyway, for formatting:

update(5,
    FIELD_NAME1, 10,
    FIELD_NAME2, 20,
    FIELD_NAME3, 10,
    FIELD_NAME4, 20,
    FIELD_NAME5, 30);

Doing it the "one call per field name" way results in simpler code in the function itself, and does not degrade the readability of the calls. In addition, it allows the compiler to properly detect certain errors that it cannot do for varargs, such as incorrect types or a mismatch between the user-supplied count and the actual count.

If you really must be able to call a single function to do it, I'd opt for:

void update (char *k1, int v1) {
    ...
}
void update2 (char *k1, int v1, char *k2, int v2) {
    update (k1, v1);
    update (k2, v2);
}
void update3 (char *k1, int v1, char *k2, int v2, char *k3, int v3) {
    update (k1, v1); /* or these two could be a single */
    update (k2, v2); /*   update2 (k1, v1, k2, v2);    */
    update (k3, v3);
}
/* and so on. */

You could even do the higher-level functions as macros if you prefer, without losing type safety.

The only place I tend to use varargs functions is when providing the same functionality as printf() - for example, I've occasionally had to write logging libraries with functions such as logPrintf() providing the same functionality. I can't think of any other time in my long (and I mean, long :-) time at the coalface that I've needed to use it.

As an aside, if you do decide to use varargs, I tend to go for sentinels rather than counts since this prevents mismatches when adding fields. You could easily forget to adjust the count and end up with:

update (2, k1, v1, k2, v2, k3, v3);

when adding, which is insidious because it silently skips k3/v3, or:

update (3, k1, v1, k2, v2);

when deleting, which is almost certainly fatal to the successful running of your program.

Having a sentinel prevents this (as long as you don't forget the sentinel, of course):

update (k1, v1, k2, v2, k3, v3, NULL);

paxdiablo 2009-05-14 23:22:46

Answer 6

+1 A:

A lot of folks here have suggested passing the # of parameters, however others rightly note that this leads to insidious bugs where the # of fields changes but the count passed to the vararg function doesn't. I solve this in a product by using null termination instead:

send_info(INFO_NUMBER,
          Some_Field,       23,
          Some_other_Field, "more data",
          NULL);

That way, when copy and paste programmers inevitably copy it, they're not likely to mess up. And more importantly, I'm not likely to mess it up.

Looking back at the original problem, you have a function that must update a structure with a lot of fields, and the structure will grow. The usual method (in the Win32 and MacOS classic APIs) of passing in data of this sort to a function is by passing in another structure (can even be the same structure as what you're updating), ie:

void update(UPDATESTRUCTURE *update_info);

to use it, you would populate the fields:

UPDATESTRUCTURE my_update = {
    UPDATESTRUCTURE_V1,
    field_1,
    field_2
};
update( &my_update );

Later on when you add new fields, you can update the UPDATESTRUCTURE definition and recompile. By putting in the version #, you can support older code that doesn't use the new fields.

A variant on the theme is to have a value for fields you don't want updated, like KEEP_OLD_VALUE (ideally this will be 0) or NULL.

UPDATESTRUCTURE my_update = {
    field_1,
    NULL,
    field_3
};
update( &my_update);

I don't include a version because I take advantage of the fact when we increase the # of fields in UPDATESTRUCTURE, the extra fields will be initialized to 0, or KEEP_OLD_VALUE.

Paul 2009-05-22 06:22:53

Such a design is not super safe, and your examples contains potential errors. Since NULL might be defined as just "0" which is of type int, and the size of int might be different than the size of pointers, there might be a mismatch. Therefore *ALWAYS* cast to a pointer for NULL arguments. See http://www.lysator.liu.se/c/c-faq/c-1.html.

hlovdal 2009-08-27 11:02:39

That's a good point: in the vararg example, I don't specify the types of Some_Field and Some_other_Field. If they are not implemented as integers, then it would be wise to cast the NULL to the same type.

Paul 2009-08-30 07:43:36

Answer 7

A:

The reasons given thus far for avoiding varargs are all good. Let me add another one that was not given yet, as it is less important, but can be encountered. The vararg mandates that the parameter be passed on the stack, thus slowing the function call. On some architecture the difference can be significative. On x86 it's not very important because of its lack of register, on SPARC, for example, it can be important. Up to 5 parameters are passed on registers and if your function uses few locals, no stack adjustment is made. If your function is a leaf function (i.e. doesn't call another function), there is no window adjustment also. The cost of the call is therefor very small. With a vararg, the normal sequence of passing parameters on the stack, stack adjustement and window management is made or your function wouldn't be able to get at the parameters. This increases the cost of the call significantly.

tristopia 2009-08-06 14:32:27

ansaurus

tags:

views:

answers:

is the use of varargs in a C API a Good Idea(tm)

related questions