I'm writing a dynamically-typed language. Currently, my objects are represented in this way:
struct Class { struct Class* class; struct Object* (*get)(struct Object*,struct Object*); };
struct Integer { struct Class* class; int value; };
struct Object { struct Class* class; };
struct String { struct Class* class; size_t length; char* characters; };
The goal is that I should be able to pass everything around as a struct Object*
and then discover the type of the object by comparing the class
attribute. For example, to cast an integer for use I would simply do the following (assume that integer
is of type struct Class*
):
struct Object* foo = bar();
// increment foo
if(foo->class == integer)
((struct Integer*)foo)->value++;
else
handleTypeError();
The problem is that, as far as I know, the C standard makes no promises about how structures are stored. On my platform this works. But on another platform struct String
might store value
before class
and when I accessed foo->class
in the above I would actually be accessing foo->value
, which is obviously bad. Portability is a big goal here.
There are alternatives to this approach:
struct Object
{
struct Class* class;
union Value
{
struct Class c;
int i;
struct String s;
} value;
};
The problem here is that the union uses up as much space as the size of the largest thing that can be stored in the union. Given that some of my types are many times as large as my other types, this would mean that my small types (int
) would take up as much space as my large types (map
) which is an unacceptable tradeoff.
struct Object
{
struct Class* class;
void* value;
};
This creates a level of redirection that will slow things down. Speed is a goal here.
The final alternative is to pass around void*
s and manage the internals of the structure myself. For example, to implement the type test mentioned above:
void* foo = bar();
// increment foo
if(*((struct Class*) foo) == integer)
(*((int*)(foo + sizeof(struct Class*))))++;
else
handleTypeError();
This gives me everything I want (portability, different sizes for different types, etc.) but has at least two downsides:
- Hideous, error-prone C. The code above only calculates a single-member offset; it will get much worse with types more complex than integers. I might be able to alleviate this a bit using macros, but this will be painful no matter what.
- Since there is no
struct
that represents the object, I don't have the option of stack allocations (at least without implementing my own stack on the heap).
Basically, my question is, how can I get what I want without paying for it? Is there a way to be portable, have variance in size for different types, not use redirection, and keep my code pretty?
EDIT: This is the best response I've ever received for an SO question. Choosing an answer was hard. SO only allows me to choose one answer so I chose the one that lead me to my solution, but you all received upvotes.