ansaurus

Question

Answer 1

+5 A:

My favourite example of a well-designed C API is GTK+, which uses method #2 that you describe.

That said, another advantage of your method #1 is not just that you can allocate the object on the stack, but also that you can reuse the same instance multiple times. If that's not going to be a common use case, then the simplicity of #2 is probably an advantage.
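To make the reuse point concrete, here is a minimal sketch of method #1 as it seems to be understood in this thread: the caller owns the storage and the library only initialises it. The struct fields, the `name` parameter and the `reuse_demo` helper are made up for illustration and are not taken from the question.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical caller-allocated type: the definition is public so the
   caller can reserve storage for it (method #1). */
typedef struct {
    char name[32];
    int  uses;
} myStruct;

/* Initialise storage the caller already owns; no allocation happens here. */
void myStruct_init(myStruct *s, const char *name)
{
    assert(s != NULL && name != NULL);
    strncpy(s->name, name, sizeof s->name - 1);
    s->name[sizeof s->name - 1] = '\0';
    s->uses = 0;
}

/* Release whatever init acquired (nothing in this sketch), but not the
   storage itself, which still belongs to the caller. */
void myStruct_destroy(myStruct *s)
{
    assert(s != NULL);
    s->name[0] = '\0';
}

/* One stack instance, initialised and torn down several times. */
int reuse_demo(void)
{
    static const char *names[] = { "first", "second", "third" };
    myStruct s;                       /* lives on the stack, no malloc */
    int total = 0;

    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        myStruct_init(&s, names[i]);  /* same storage on every iteration */
        s.uses++;
        total += s.uses;
        myStruct_destroy(&s);
    }
    return total;
}
```

With method #2 the same loop would need one allocation and one deallocation per iteration.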

Of course, that's just my opinion :)

Dean Harding 2010-07-21 04:30:58

Now, this is an interesting comment. I've heard many people say exactly the opposite, that GTK+ is a terrible API. Unfortunately I've only used it a little; I'm usually up in the clouds of C++, using Gtkmm. I remember ref-counted pointers and _new and _free functions, however, which seems to match the 3rd option more. I'd be curious to hear the reasons for your opinion.

Thanatos 2010-07-21 04:52:03

The general design philosophy of GLib/Gtk seems to be "we won't use C++ on principle, so we'll hand-code all the same stuff". This approach has some advantages in the sense that it's still a pure C API, which makes it easier to use with various C-only FFIs... but from a pure C/C++ perspective, it seems rather impractical.

Pavel Minaev 2010-07-21 06:24:47

+1 for mentioning GTK+. If you are accustomed to OOP, GTK seems very natural.

Andrei Ciobanu 2010-07-21 06:26:58

Answer 2

A:

Both are acceptable - there are tradeoffs between them, as you've noted.

There are large real-world examples of both - as Dean Harding says, GTK+ uses the second method; OpenSSL is an example that uses the first.

caf 2010-07-21 04:48:38

Answer 3

+5 A:

Another disadvantage of #2 is that the caller doesn't have control over how things are allocated. This can be worked around by providing an API for the client to register his own allocation/deallocation functions (like SDL does), but even that may not be sufficiently fine-grained.

The disadvantage of #1 is that it doesn't work well when output buffers are not fixed-size (e.g. strings). At best, you will then need to provide another function to obtain the length of the buffer first so that the caller can allocate it. At worst, it is simply impossible to do so efficiently (i.e. computing the length in a separate pass is far more expensive than computing and copying in one go).
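As a rough sketch of the "ask for the length first" workaround: the function below leans on the standard guarantee that `snprintf` called with a NULL buffer and size 0 only reports the length it would have written. The names `point_format` and `point_to_string` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical method #1 API with variable-size output: writes "(x, y)"
   into buf (at most cap bytes, NUL-terminated when cap > 0) and returns
   the length the full string needs, excluding the NUL. Calling it with
   buf == NULL and cap == 0 only measures. */
int point_format(char *buf, size_t cap, int x, int y)
{
    return snprintf(buf, cap, "(%d, %d)", x, y);
}

/* Caller side: measure, allocate, then format for real. */
char *point_to_string(int x, int y)
{
    int len = point_format(NULL, 0, x, y);          /* first pass: length only */
    if (len < 0)
        return NULL;

    char *buf = malloc((size_t)len + 1);            /* caller picks the allocator */
    if (buf != NULL)
        point_format(buf, (size_t)len + 1, x, y);   /* second pass: actual copy */
    return buf;
}
```

The formatting work is done twice here, which is exactly the cost the paragraph above warns about when the first pass is expensive.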

The advantage of #2 is that it allows you to expose your datatype strictly as an opaque pointer (i.e. declare the struct but don't define it, and use pointers consistently). Then you can change the definition of the struct as you see fit in future versions of your library, while clients remain compatible at the binary level. With #1, you have to do it by requiring the client to specify the version inside the struct in some way (e.g. all those cbSize fields in the Win32 API), and then manually write code that can handle both older and newer versions of the struct to remain binary-compatible as your library evolves.
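A minimal sketch of the opaque-pointer arrangement, squeezed into one file for brevity. In a real library the first part would live in the public header and the second in a private .c file. The `counter` type and its functions are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- What the public header would contain ---- */
typedef struct counter counter;          /* declared, never defined here */

counter *counter_create(int start);
void     counter_destroy(counter *c);
int      counter_next(counter *c);

/* ---- What stays private in the library's .c file ---- */
struct counter {
    int value;
    int step;    /* imagine this field was added in a later version: clients
                    never see the layout, so compiled callers keep working */
};

counter *counter_create(int start)
{
    counter *c = malloc(sizeof *c);      /* only the library knows the size */
    if (c != NULL) {
        c->value = start;
        c->step  = 1;
    }
    return c;
}

void counter_destroy(counter *c)
{
    free(c);                             /* free(NULL) is a no-op */
}

/* Returns the current value and advances the counter. */
int counter_next(counter *c)
{
    assert(c != NULL);
    int current = c->value;
    c->value += c->step;
    return current;
}
```

Client code that only includes the header cannot write `sizeof(counter)` or declare a `counter` on the stack, which is the price of this encapsulation.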

In general, if your structs are transparent data which will not change with future minor revisions of the library, I'd go with #1. If it is a more or less complicated data object and you want full encapsulation to fool-proof it for future development, go with #2.

Pavel Minaev 2010-07-21 04:49:33

+1 for the point about abstraction and opaque pointers - this is a big advantage as it completely decouples your implementation from the calling code

Paul R 2010-07-21 05:12:46

Answer 4

+3 A:

Both are functionally equivalent. But, in my opinion, method #2 is easier to use. A few reasons for preferring 2 over 1 are:

It is more intuitive. Why should I have to call free on the object after I have (apparently) destroyed it using myStruct_Destroy?
It hides the details of myStruct from the user, who does not have to worry about its size, etc.
In method #2, myStruct_init does not have to worry about the initial state of the object.
You don't have to worry about memory leaks from the user forgetting to call free.

If your API implementation is being shipped as a separate shared library, however, method #2 is a must. To isolate your module from any mismatch in the implementations of malloc/new and free/delete across compiler versions, you should keep memory allocation and de-allocation to yourself. Note that this is more true of C++ than of C.

Rajorshi 2010-07-21 04:56:11

Both are *not* equivalent, because the latter requires dynamic allocation, and the former does not.

Tom 2010-07-21 05:06:09

Well...yeah. Should have said functionally equivalent. Updated.

Rajorshi 2010-07-21 05:20:29

Answer 5

+1 A:

The problem I have with the first method is not so much that it is longer for the caller; it's that the API is now handcuffed when it comes to expanding the amount of memory it uses, precisely because it doesn't know how the memory it received was allocated. The caller doesn't always know ahead of time how much memory it will need (imagine if you were trying to implement a vector).

Another option you didn't mention, which is going to be overkill most of the time, is to pass in a function pointer that the API uses as an allocator. This doesn't allow you to use the stack, but does allow you to do something like replace the use of malloc with a memory pool, while still keeping the API in control of when it wants to allocate.
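Here is one way the allocator-callback idea could look. It is a sketch only: the `intvec` type, the hook typedefs and the toy bump allocator are all hypothetical, and the pool never reclaims memory, so it is only suitable as an illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allocator hooks supplied by the caller. */
typedef void *(*alloc_fn)(size_t size);
typedef void  (*free_fn)(void *ptr);

/* A growable int buffer that allocates only through the hooks: the API
   decides when to grow, the caller decides where the memory comes from. */
typedef struct {
    int     *data;
    size_t   len, cap;
    alloc_fn alloc;
    free_fn  release;
} intvec;

void intvec_init(intvec *v, alloc_fn alloc, free_fn release)
{
    assert(v != NULL && alloc != NULL && release != NULL);
    v->data = NULL;
    v->len = v->cap = 0;
    v->alloc = alloc;
    v->release = release;
}

/* Returns 0 on success, -1 if the caller's allocator reports failure. */
int intvec_push(intvec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *grown = v->alloc(new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        if (v->len > 0)
            memcpy(grown, v->data, v->len * sizeof *grown);
        if (v->data != NULL)
            v->release(v->data);
        v->data = grown;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

void intvec_destroy(intvec *v)
{
    if (v->data != NULL)
        v->release(v->data);
    v->data = NULL;
    v->len = v->cap = 0;
}

/* Example hooks: a toy bump allocator over a static pool. */
static union {
    unsigned char bytes[1024];
    long double   align;          /* forces suitable alignment for the pool */
} pool;
static size_t pool_used;

void *pool_alloc(size_t size)
{
    size_t rounded = (size + 15u) & ~(size_t)15u;   /* multiples of 16 bytes */
    if (rounded < size || rounded > sizeof pool.bytes - pool_used)
        return NULL;
    void *p = pool.bytes + pool_used;
    pool_used += rounded;
    return p;
}

void pool_free(void *ptr)
{
    (void)ptr;   /* a bump allocator reclaims nothing individually */
}
```

Passing `malloc` and `free` as the hooks gives the ordinary heap-backed behaviour. Passing `pool_alloc` and `pool_free` keeps every allocation inside the static pool.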

As for which method is proper API design, it's done both ways in the C standard library. strdup() (POSIX, and part of standard C only since C23) and stdio use the second method, while sprintf and strcat use the first. Personally I prefer the second method (or third) unless 1) I know I will never need to realloc and 2) I expect the lifetime of my objects to be short, so that using the stack is very convenient.

edit: There is actually one other option, and it is a bad one with a prominent precedent: you could do it the way strtok() does, with static state. Not good, just mentioned for completeness' sake.

frankc 2010-07-21 04:59:15

Answer 6

A:

Both ways are OK. I tend to use the first, as a lot of the C I write is for embedded systems, where all the memory is either small variables on the stack or statically allocated. This way you cannot run out of memory at run time: either you have enough at the beginning or you're screwed from the start. Good to know when you have 2K of RAM :-) So all my libraries are like #1, where the memory is assumed to be already allocated.

But this is an edge case of C development.

Having said that, I'd probably still go with #1, perhaps using init and finalize/dispose (rather than destroy) for names.

Keith Nicholas 2010-07-21 05:12:53

Answer 7

+1 A:

Here are a few points to reflect on:

Case #1 mimics the memory allocation scheme of C++, with more or less the same benefits:

easy allocation of temporaries on the stack (or in static arrays and the like, if you want to write your own struct allocator replacing malloc).
easy freeing of memory if anything goes wrong in init

Case #2 hides more information about the structure used and can also be used for opaque structures, typically when the structure as seen by the user is not exactly the same as the one used internally by the library (say, there could be some more fields hidden at the end of the structure).

A mixed API between case #1 and case #2 is also common: there is an argument used to pass in a pointer to some existing structure; if it is null, the structure is allocated (and the pointer is always returned). With such an API, the free is usually the responsibility of the caller even if init performed the allocation.
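A small sketch of that mixed style, assuming a plain public struct. The `rect` type and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical public struct; names are illustrative only. */
typedef struct {
    int width;
    int height;
} rect;

/* If `storage` is non-NULL it is initialised in place; if it is NULL the
   function allocates a rect with malloc. Either way a pointer to the
   initialised object is returned, or NULL if the allocation failed. */
rect *rect_init(rect *storage, int width, int height)
{
    rect *r = storage != NULL ? storage : malloc(sizeof *r);
    if (r != NULL) {
        r->width  = width;
        r->height = height;
    }
    return r;
}

int rect_area(const rect *r)
{
    assert(r != NULL);
    return r->width * r->height;
}
```

The caller has to remember which branch was taken: an object obtained with `rect_init(NULL, ...)` must be passed to `free`, one initialised in caller-provided storage must not.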

In most cases I would probably go for case #1.

kriss 2010-07-21 05:17:19

Answer 8

+5 A:

Why not provide both, to get the best of both worlds?

Use _init and _terminate functions to use method #1 (or whatever naming you see fit).

Use additional _create and _destroy functions for the dynamic allocation. Since _init and _terminate already exist, it effectively boils down to:

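#include <stdlib.h>

/* Allocate a myStruct on the heap and initialise it; returns NULL if malloc fails. */
myStruct *myStruct_create(void)
{
    myStruct *s = malloc(sizeof(*s));
    if (s)
    {
        myStruct_init(s);
    }
    return s;
}

/* Terminate and free an object obtained from myStruct_create. */
void myStruct_destroy(myStruct *s)
{
    myStruct_terminate(s);
    free(s);
}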

If you want it to be opaque, then make _init and _terminate static and do not expose them in the API, only provide _create and _destroy. If you need other allocations, e.g. with a given callback, provide another set of functions for this, e.g. _createcalled, _destroycalled.

The important thing is to keep track of the allocations, but you have to do this anyway. You must always use the counterpart of the used allocator for deallocation.

Secure 2010-07-21 05:45:31

Answer 9

A:

I would go for (1) with one simple extension, that is to have your _init function always return the pointer to the object. Your pointer initialization then may just read:

myStruct *s = myStruct_init(malloc(sizeof(myStruct)));

As you can see, the right-hand side then only refers to the type and no longer to the variable. A simple macro then gives you (2), at least partially:

#define NEW(T) (T ## _init(malloc(sizeof(T))))

and your pointer initialization reads

myStruct *s = NEW(myStruct);

Jens Gustedt 2010-07-21 06:13:39

How do you handle a malloc failure?

Secure 2010-07-21 06:37:21

@Secure: Good point. I think `_init` functions should be made robust to being passed a `NULL` pointer and should just pass it through on return. The check for that is then left to the user of the pointer, as usual.
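A sketch of what that could look like, with the `NEW` macro from the answer. The struct fields and the `myStruct_id` and `make_and_use` helpers are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int id;
    int ready;
} myStruct;

/* Initialises *s and returns it; a NULL argument (for example a failed
   malloc) is passed straight through instead of being dereferenced. */
myStruct *myStruct_init(myStruct *s)
{
    if (s == NULL)
        return NULL;
    s->id = 0;
    s->ready = 1;
    return s;
}

/* Functions other than init may still insist on a valid object. */
int myStruct_id(const myStruct *s)
{
    assert(s != NULL);
    return s->id;
}

/* The macro from the answer: allocate and initialise in one expression. */
#define NEW(T) (T ## _init(malloc(sizeof(T))))

/* Caller side: one NULL check covers both malloc and init. */
int make_and_use(void)
{
    myStruct *s = NEW(myStruct);
    if (s == NULL)
        return -1;          /* allocation failed */
    int ready = s->ready;
    free(s);
    return ready;
}
```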

Jens Gustedt 2010-07-21 06:43:25

The other design philosophy in this regard is that most functions should expect valid pointers (with the obvious exception of deallocators) and assert() that they are not NULL. That would make your approach effectively use assert for program logic, which is a big no-go. It depends on the overall design of your program, for sure, but personally I prefer to be explicit with error handling, i.e. malloc is used separately and tested for validity before anything else is done with the pointer.

Secure 2010-07-21 07:00:05

@Secure: I would tend to just extend the convention to check pointers returned by the macro `NEW`. This is only a slight extension of such a convention since you'd have to check several functions for that already, not only `malloc` but also `realloc` and `calloc` (and maybe others that I forget).

Jens Gustedt 2010-07-21 07:23:32

Answer 10

+1 A:

Method number 2 every time.

Why? Because with method number 1 you have to leak implementation details to the caller. The caller has to know at least how big the struct is. You can't change the internal implementation of the object without recompiling any code that uses it.

JeremyP 2010-07-21 10:31:35

C API design: Who should allocate?
