views:

155

answers:

6

I have done far more C++ programming than "plain old C" programming. One thing I sorely miss when programming in plain C is type-safe generic data structures, which are provided in C++ via templates.

For the sake of concreteness, consider a generic singly linked list. In C++, it is a simple matter to define your own template class, and then instantiate it for the types you need.

In C, I can think of a few ways of implementing a generic singly linked list:

  1. Write the linked list type(s) and supporting procedures once, using void pointers to get around the type system.
  2. Write preprocessor macros taking the necessary type names, etc., to generate a type-specific version of the data structure and supporting procedures.
  3. Use a more sophisticated, stand-alone tool to generate the code for the types you need.

I don't like option 1, as it subverts the type system and would likely perform worse than a specialized, type-specific implementation. As far as I can see, using a uniform representation of the data structure for all types, and casting to and from void pointers, necessitates an indirection that an implementation specialized for the element type would avoid.
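To make the objection concrete, here is a minimal sketch of option 1. All names (`vnode`, `vlist_push`, `vlist_sum_ints`, `vlist_demo`) are hypothetical and only illustrate the idea. Each node holds a pointer to an element stored elsewhere, so reading a value costs an extra load, and the cast back to `int` is something the compiler cannot check.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical void-pointer list: one implementation for every element type. */
struct vnode {
    struct vnode *next;
    void *data;              /* points at an element stored elsewhere */
};

/* Push a pointer onto the front; returns the new head, or NULL if malloc fails. */
struct vnode *vlist_push(struct vnode *head, void *data)
{
    assert(data != NULL);    /* this sketch does not support null elements */
    struct vnode *n = malloc(sizeof *n);
    if (!n) return NULL;
    n->data = data;
    n->next = head;
    return n;
}

/* Free the nodes only; the elements belong to the caller. */
void vlist_free(struct vnode *head)
{
    while (head) {
        struct vnode *next = head->next;
        free(head);
        head = next;
    }
}

/* Sum a list that the caller promises holds ints. */
int vlist_sum_ints(const struct vnode *head)
{
    int sum = 0;
    for (; head; head = head->next)
        sum += *(const int *)head->data;   /* extra load, unchecked cast */
    return sum;
}

/* Build a three-element list over a static array and sum it. */
int vlist_demo(void)
{
    static int values[] = { 1, 2, 3 };
    struct vnode *head = NULL;
    for (size_t i = 0; i < 3; i++) {
        struct vnode *n = vlist_push(head, &values[i]);
        if (!n) { vlist_free(head); return -1; }
        head = n;
    }
    int sum = vlist_sum_ints(head);
    vlist_free(head);
    return sum;
}
```

Nothing stops a caller from pushing a pointer to a `double` onto the same list; `vlist_sum_ints` would then silently read garbage.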

Option 2 doesn't require any extra tools, but it feels somewhat clunky, and could give bad compiler errors when used improperly.
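A rough sketch of what option 2 might look like follows. The macro `DEFINE_SLIST` and everything it generates are hypothetical names chosen for illustration. Each expansion produces a distinct struct with the value stored inline, so mixing up list types becomes a compile-time diagnostic, but an error inside the macro body is reported against the single line where the macro is invoked.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical generator macro: emits a node type plus push/free functions. */
#define DEFINE_SLIST(NAME, TYPE)                                        \
    struct NAME {                                                       \
        struct NAME *next;                                              \
        TYPE value;              /* stored inline, no indirection */    \
    };                                                                  \
    static struct NAME *NAME##_push(struct NAME *head, TYPE value)      \
    {                                                                   \
        struct NAME *n = malloc(sizeof *n);                             \
        if (!n) return NULL;                                            \
        n->value = value;                                               \
        n->next = head;                                                 \
        return n;                                                       \
    }                                                                   \
    static void NAME##_free(struct NAME *head)                          \
    {                                                                   \
        while (head) {                                                  \
            struct NAME *next = head->next;                             \
            free(head);                                                 \
            head = next;                                                \
        }                                                               \
    }

/* One instantiation per element type, much like a template instantiation. */
DEFINE_SLIST(int_list, int)
DEFINE_SLIST(double_list, double)

/* Returns 1 if both lists hold what was pushed, -1 on allocation failure. */
int slist_demo(void)
{
    struct int_list *ints = int_list_push(NULL, 41);
    struct double_list *reals = double_list_push(NULL, 1.5);
    if (!ints || !reals) {
        int_list_free(ints);
        double_list_free(reals);
        return -1;
    }
    assert(ints->next == NULL);   /* a one-element list ends immediately */
    /* Assigning double_list_push(...) to `ints` would be diagnosed by the
       compiler, because the two node types are unrelated. */
    int ok = (ints->value == 41) && (reals->value == 1.5);
    int_list_free(ints);
    double_list_free(reals);
    return ok;
}
```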

Option 3 could give better compiler error messages than option 2, as the specialized data structure code would exist in expanded form that the programmer could open in an editor and inspect (unlike code generated by preprocessor macros). However, this option is the most heavyweight, a sort of "poor man's templates". I have used this approach before, using a simple sed script to specialize a "templated" version of some C code.

I would like to program my future "low-level" projects in C rather than C++, but have been frightened by the thought of rewriting common data structures for each specific type.

What experience do people have with this issue? Are there good libraries of generic data structures and algorithms in C that do not go with Option 1 (i.e. casting to and from void pointers, which sacrifices type safety and adds a level of indirection)?

+1  A: 

Your option 1 is what most old-time C programmers would go for, possibly salted with a little of option 2 to cut down on the repetitive typing, and just maybe employing a few function pointers for a flavor of polymorphism.

dmckee
I am really turned off by option 1, as one gives up the help of the type system, and also because it adds a level of indirection.
Bradford Larsen
@Bradford: If you're paranoid about typing, C is not the right language for you in the first place. C's type system is extremely simple and won't offer you a lot of help, and that's basically all there is to it. That's how it's designed.
Chuck
+4  A: 

Option 1 is the approach taken by most C implementations of generic containers that I see. The Windows driver kit and the Linux kernel use a macro to allow links for the containers to be embedded anywhere in a structure, with the macro used to obtain the structure pointer from a pointer to the link field.
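The embedded-link idea can be sketched roughly as follows. This is an illustration written for this answer, not code from either code base: `CONTAINER_OF`, `struct link`, and `struct job` are hypothetical names (the Linux kernel's macro is called `container_of`, the Windows one `CONTAINING_RECORD`).

```c
#include <assert.h>
#include <stddef.h>

/* A bare link that knows nothing about the structure containing it. */
struct link { struct link *next; };

/* Recover a pointer to the enclosing structure from a pointer to its member. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct job {
    int id;
    struct link link;    /* the link can be embedded anywhere in the struct */
};

/* Walk the links and step back out to each enclosing job. */
int job_sum_ids(struct link *head)
{
    /* The link is deliberately not the first member, so the offset matters. */
    assert(offsetof(struct job, link) != 0);
    int sum = 0;
    for (struct link *l = head; l; l = l->next)
        sum += CONTAINER_OF(l, struct job, link)->id;
    return sum;
}

/* Chain three stack-allocated jobs together and sum their ids. */
int job_demo(void)
{
    struct job a = { 1, { NULL } };
    struct job b = { 2, { &a.link } };
    struct job c = { 3, { &b.link } };
    return job_sum_ids(&c.link);
}
```

The list code only ever sees `struct link`, so nothing prevents a caller from applying `CONTAINER_OF` with the wrong structure type.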

Option 2 is the tack taken by BSD's tree.h and queue.h container implementations.
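As a small illustration of the macro style, here is a sketch using the singly linked list macros from queue.h. It assumes the header is available as `<sys/queue.h>`, as it is on the BSDs and with glibc; `struct entry`, `entry_list`, and `queue_demo` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

struct entry {
    int value;
    SLIST_ENTRY(entry) links;      /* macro expands to the embedded next pointer */
};

SLIST_HEAD(entry_list, entry);     /* declares struct entry_list */

/* Push 1, 2, 3 onto a list, sum the values, then free every node. */
int queue_demo(void)
{
    struct entry_list head = SLIST_HEAD_INITIALIZER(head);
    struct entry *e;
    int sum = 0;

    for (int i = 1; i <= 3; i++) {
        e = malloc(sizeof *e);
        if (!e) break;             /* on failure, work with what we have */
        e->value = i;
        SLIST_INSERT_HEAD(&head, e, links);
    }

    SLIST_FOREACH(e, &head, links)
        sum += e->value;

    while (!SLIST_EMPTY(&head)) {
        e = SLIST_FIRST(&head);
        SLIST_REMOVE_HEAD(&head, links);
        free(e);
    }
    assert(SLIST_EMPTY(&head));    /* every node has been released */
    return sum;
}
```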

I don't think I'd consider either of these approaches type safe. Useful, but not type safe.

Michael Burr
+1 I like the Linux `list.h` approach best.
Chris Lutz
With all the kernel-only stuff taken out: http://ccan.ozlabs.org/browse/ccan/list CCAN is full of that kind of thing: http://ccan.ozlabs.org/ There's some stuff for helping with type safety there as well.
Spudd86
A: 

Option 1, using either void * or some union-based variant, is what most C programs use, and it may give you BETTER performance than the C++/macro style of having multiple implementations for different types, as it has less code duplication, and thus less icache pressure and fewer icache misses.

Chris Dodd
+1  A: 

There's a common variation on option 1 that is more efficient, as it uses unions to store the values in the list nodes, i.e. there's no additional indirection. The downside is that the list only accepts values of certain types, and it potentially wastes some memory if the types are of different sizes.
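A minimal sketch of the union variant might look like this. The names (`unode`, `unode_push_int`, and so on) are hypothetical, and the `kind` tag is an assumption added here so the reader of a node knows which union member is live; a list that only ever holds one type per instance could drop it.

```c
#include <assert.h>
#include <stdlib.h>

enum uvalue_kind { UVALUE_INT, UVALUE_DOUBLE };

struct unode {
    struct unode *next;
    enum uvalue_kind kind;       /* assumption: tag recording the live member */
    union {
        int i;
        double d;
    } value;                     /* stored inline, sized for the largest member */
};

/* Allocate a node in front of head; returns NULL if malloc fails. */
static struct unode *unode_new(struct unode *head, enum uvalue_kind kind)
{
    struct unode *n = malloc(sizeof *n);
    if (n) { n->next = head; n->kind = kind; }
    return n;
}

struct unode *unode_push_int(struct unode *head, int i)
{
    struct unode *n = unode_new(head, UVALUE_INT);
    if (n) n->value.i = i;
    return n;
}

struct unode *unode_push_double(struct unode *head, double d)
{
    struct unode *n = unode_new(head, UVALUE_DOUBLE);
    if (n) n->value.d = d;
    return n;
}

/* Add up every element, reading whichever member the tag says is live. */
double unode_sum(const struct unode *head)
{
    double sum = 0.0;
    for (; head; head = head->next) {
        assert(head->kind == UVALUE_INT || head->kind == UVALUE_DOUBLE);
        sum += head->kind == UVALUE_INT ? head->value.i : head->value.d;
    }
    return sum;
}

void unode_free(struct unode *head)
{
    while (head) {
        struct unode *next = head->next;
        free(head);
        head = next;
    }
}

/* Store an int and a double in the same list and sum them. */
double unode_demo(void)
{
    struct unode *head = unode_push_int(NULL, 2);
    struct unode *top = head ? unode_push_double(head, 0.5) : NULL;
    if (!top) { unode_free(head); return -1.0; }
    double sum = unode_sum(top);
    unode_free(top);
    return sum;
}
```

Every node is as large as the biggest union member, which is the memory waste mentioned above.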

However, it's possible to get rid of the union by using a flexible array member instead, if you're willing to break strict aliasing. C99 example code:

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct ll_node
{
    struct ll_node *next;
    long long data[]; // use `long long` for alignment
};

extern struct ll_node *ll_unshift(
    struct ll_node *head, size_t size, void *value);

extern void *ll_get(struct ll_node *head, size_t index);

#define ll_unshift_value(LIST, TYPE, ...) \
    ll_unshift((LIST), sizeof (TYPE), &(TYPE){ __VA_ARGS__ })

#define ll_get_value(LIST, INDEX, TYPE) \
    (*(TYPE *)ll_get((LIST), (INDEX)))

struct ll_node *ll_unshift(struct ll_node *head, size_t size, void *value)
{
    struct ll_node *node = malloc(sizeof *node + size);
    if(!node) abort(); // out of memory; assert() would compile away under NDEBUG

    memcpy(node->data, value, size);
    node->next = head;

    return node;
}

void *ll_get(struct ll_node *head, size_t index)
{
    struct ll_node *current = head;
    while(current && index--)
        current = current->next;
    return current ? current->data : NULL;
}

int main(void)
{
    struct ll_node *head = NULL;
    head = ll_unshift_value(head, int, 1);
    head = ll_unshift_value(head, int, 2);
    head = ll_unshift_value(head, int, 3);

    printf("%i\n", ll_get_value(head, 0, int));
    printf("%i\n", ll_get_value(head, 1, int));
    printf("%i\n", ll_get_value(head, 2, int));

    return 0;
}
Christoph
A: 

GLib has a bunch of generic data structures in it: http://www.gtk.org/

CCAN has a bunch of useful snippets and such http://ccan.ozlabs.org/

Spudd86
A: 

I would like to program my future "low-level" projects in C rather than C++...

Why? Does your target lack a C++ compiler or C++ runtime?

John
I'm turned off by the complexity of C++. For my purposes, I will not be writing entire applications in C or C++, but core algorithmic code. C tends to be easier to call from other languages, is much simpler than C++, and makes it more apparent when memory allocation is being done. By using C rather than C++, I will miss namespaces, scoped destructors, and generic programming, but I am hoping the benefits of simplicity and understandability outweigh the loss of those features.
Bradford Larsen