I'm writing a Unicode library for C as a personal exercise. Leaving out the function bodies, this is what I have so far:

typedef struct string {
    unsigned long length;
    unsigned *data;
} string;

// really simple stuff

string *upush(string *s, unsigned c) { ... }

// UTF-8 conversion

string *ctou(char *old) { ... }

char *utoc(string *old) { ... }

upush pushes the full Unicode code point c onto the end of s, reallocating as needed.
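For illustration, here is a minimal sketch of what such a upush could look like. It assumes the string starts out zero-initialized (length 0, data NULL), grows by one element per push, and returns NULL when reallocation fails; the growth policy and the failure behaviour are assumptions, not necessarily what the library does.

```c
#include <stdlib.h>

typedef struct string {
    unsigned long length;
    unsigned *data;
} string;

/* Append code point c to s.
   Returns s on success, or NULL if reallocation fails
   (assumed behaviour; s is left unchanged in that case). */
string *upush(string *s, unsigned c)
{
    /* realloc(NULL, n) behaves like malloc(n), so an empty string works too. */
    unsigned *grown = realloc(s->data, (s->length + 1) * sizeof *grown);
    if (grown == NULL)
        return NULL;

    grown[s->length] = c;   /* store the new code point at the end */
    s->data = grown;
    s->length += 1;
    return s;
}
```

Growing by one element keeps the sketch short; a real implementation would more likely track a capacity and grow geometrically.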

I can use the type like this, for example:

string a = {0};  // zero-initialize: length 0, data NULL
upush(&a, 0x44);
puts(utoc(&a));

Instead of having to use & to pass a reference to each function, is there a way to declare the functions so that I can call them like this:

string a;
upush(a, 0x44);
puts(utoc(a));

Or, do you think it's just better to typedef string as a struct string pointer, not the struct itself, and always deal with struct string pointers?

Resolution: thanks to Dummy00001, this is now my library's general interface:

typedef struct string {
    unsigned long length;
    unsigned *data;
} string;

// really simple stuff

string *upush(string *s, unsigned c) { ... }

// UTF-8 conversions

string ctou(char *old) { ... }

char *utoc(string old) { ... }

Here is an example:

string a = ctou("Hello, world");
upush(&a, '!');
puts(utoc(a));

I also learned, thanks to Doches, that in C++ (which is what I'll write a wrapper for this library in) you can declare the prototype like this to get implicit pass-by-reference:

string *upush(string &s, unsigned c);

For those who are interested, here's my ongoing progress: http://delan.ath.cx/ulib

+1  A: 

string a allocates the struct itself in the current scope, whereas string *a is only a pointer to memory that you have to malloc and free yourself.

So the answer is: it depends. If you use the pointer strategy, you should provide create and destroy functions, and keep in mind that users will "forget" to call them.

Peter Miehle
Delan Azabani
But that is the normal way to do it in C.
Peter Miehle
+2  A: 

Instead of having to use & to pass a reference to each function, is there a way to declare the functions so that I can call them like this:
string a;
upush(a, 0x44);
puts(utoc(a));

I personally prefer the & notation and use it (even in C++) to distinguish a function that might modify its argument from one that only reads the struct as if it were const. In other words:

string a = {0};   // zero-initialize before the first upush()
upush(&a, 0x44);  // pass pointer, we modify the "a"
puts(utoc(a));    // pass "a" as a copy, utoc() doesn't modify it

In the code above, one sees immediately that upush() takes &a and thus potentially modifies it, while utoc() takes a by value and doesn't modify it.
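One caveat worth spelling out: passing the struct by value copies only length and the data pointer, not the buffer behind it. The sketch below uses three hypothetical helper functions (they are not part of the library) to show which changes the caller can and cannot observe.

```c
typedef struct string {
    unsigned long length;
    unsigned *data;
} string;

/* Receives a copy of the struct: changing the copy's fields
   is invisible to the caller. */
void shrink_copy(string s)
{
    s.length = 0;
}

/* The copy still points at the caller's buffer, so writes made
   through data are visible to the caller. */
void overwrite_first(string s, unsigned c)
{
    if (s.length > 0)
        s.data[0] = c;
}

/* Receives a pointer: changes to the fields are visible to the caller. */
void shrink_in_place(string *s)
{
    s->length = 0;
}
```

So a by-value parameter signals "the fields will not change", but it does not by itself protect the code points; a function such as utoc() still has to refrain from writing through data.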

But the final decision on how to design the interface depends largely on how you are going to manage the memory.

Or, do you think it's just better to typedef string as a struct string pointer, not the struct itself, and always deal with struct string pointers?

Make the symbol string a pointer to an opaque structure that is defined only inside the library itself. In the public interface header:

struct string_s;
typedef struct string_s *string;

In the internal header:

struct string_s {
   unsigned long length;
   unsigned *data;
};

Interface functions would accept only string, which is a pointer. Provide functions to create a new string (calloc(1, sizeof(struct string_s))) and to destroy it (free()). That way you give applications an interface that is consistent (the developer doesn't have to think about what is a pointer and what is not) and resilient to internal changes in the library (since the struct's layout is known only inside the library itself).

The downside is that one needs dynamic memory management to allocate the puny struct: malloc()/free() are fast, but it still feels like overkill.
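A minimal sketch of the create/destroy pair for the opaque handle might look like this. The function names string_new, string_length and string_free are hypothetical, and both "headers" are shown in one file only to keep the example self-contained.

```c
#include <stdlib.h>

/* Public header part: applications see only the handle type. */
struct string_s;
typedef struct string_s *string;

/* Internal header part: the layout is known only to the library. */
struct string_s {
    unsigned long length;
    unsigned *data;
};

/* Hypothetical constructor: returns an empty string, or NULL on failure. */
string string_new(void)
{
    /* sizeof *s is the size of the struct, not of the pointer. */
    string s = calloc(1, sizeof *s);
    if (s != NULL)
        s->data = NULL;  /* all-bits-zero is not guaranteed to be a null pointer */
    return s;
}

/* Hypothetical accessor: callers cannot read s->length directly. */
unsigned long string_length(string s)
{
    return s->length;
}

/* Hypothetical destructor: releases the code points and the struct itself. */
void string_free(string s)
{
    if (s != NULL) {
        free(s->data);
        free(s);
    }
}
```

Because applications never see the struct's members, fields can be added or reordered later without recompiling the callers.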

Dummy00001
It's possible to avoid doubling the number of heap allocations performed by the library by using the variable sized struct idiom. However, this would change the location of the `string` struct whenever its size changes. Functions like `upush()` would either have to return new strings or use an additional level of indirection (e.g. `upush(string **s, unsigned c)`).
bk1e
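The variable sized struct idiom mentioned in the comment above could be sketched with a C99 flexible array member, as below. The names struct ustr, ustr_new and ustr_push are hypothetical, and the one-element growth step is an assumption made to keep the example short.

```c
#include <stdlib.h>

/* Hypothetical layout: the header and the code points share one allocation. */
struct ustr {
    unsigned long length;
    unsigned data[];        /* C99 flexible array member */
};

/* Allocate an empty string (header only); returns NULL on failure. */
struct ustr *ustr_new(void)
{
    struct ustr *s = malloc(sizeof *s);
    if (s != NULL)
        s->length = 0;
    return s;
}

/* Takes struct ustr ** because realloc may move the whole object.
   Returns 0 on success, -1 on failure (*sp stays valid and unchanged). */
int ustr_push(struct ustr **sp, unsigned c)
{
    struct ustr *s = *sp;
    struct ustr *grown =
        realloc(s, sizeof *s + (s->length + 1) * sizeof s->data[0]);
    if (grown == NULL)
        return -1;

    grown->data[grown->length] = c;
    grown->length += 1;
    *sp = grown;            /* publish the possibly relocated object */
    return 0;
}
```

The single allocation saves one malloc per string, at the price of the extra level of indirection in every function that can grow it.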
+1  A: 

Unfortunately, the example you give is the right way to do pass-by-reference in C. I think what you're after is the & notation for specifying that an argument should be passed by reference rather than value, which is a C++-ism (Longer, better example from IBM).

In C++ (not C!) you can write:

typedef struct _foo_t {
  int value;
} foo;

void modify_foo(foo &f)
{
  f.value = -1;
}

and do exactly what you're asking, e.g.:

foo aFoo;
aFoo.value = 1;
modify_foo(aFoo);
printf("%d\n",aFoo.value);
Doches