As part of answering another question, I came across a piece of code like this, which gcc compiles without complaint.

typedef struct {
    struct xyz *z;
} xyz;
int main (void) {
    return 0;
}

This is the means I've always used to construct types that point to themselves (e.g., linked lists) but I've always thought you had to name the struct so you could use self-reference. In other words, you couldn't use xyz *z within the structure because the typedef is not yet complete at that point.

But this particular sample does not name the structure and it still compiles. I thought originally there was some black magic going on in the compiler that automatically translated the above code because the structure and typedef names were the same.

But this little beauty works as well:

typedef struct {
    struct NOTHING_LIKE_xyz *z;
} xyz;

What am I missing here? This seems a clear violation since there is no struct NOTHING_LIKE_xyz type defined anywhere.

When I change it from a pointer to an actual type, I get the expected error:

typedef struct {
    struct NOTHING_LIKE_xyz z;
} xyz;

qqq.c:2: error: field `z' has incomplete type

Also, when I remove the struct, I get an error (parse error before "NOTHING ...).

Is this allowed in ISO C?


Update: A struct NOSUCHTYPE *variable; also compiles, so this is not valid only inside structures. I can't find anything in the C99 standard that allows this leniency for structure pointers.

+4  A: 

As the error message in the third case says, struct NOTHING_LIKE_xyz is an incomplete type, like void or an array of unknown size. As a struct member, an incomplete type can only appear as the type pointed to, with one exception: an array of unknown size is allowed as the last member of a struct (a flexible array member). Code cannot dereference a pointer to an incomplete type until that type has been completed (for good reason).
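A minimal sketch of what an incomplete struct type does and does not allow. The names struct widget, widget_is_null and widget_id are made up for illustration and do not come from the question.

```c
#include <stddef.h>

struct widget;                      /* incomplete: content unknown at this point */

/* Pointers to an incomplete type can be declared, passed around and compared. */
static int widget_is_null(const struct widget *w) {
    return w == NULL;
}

/* Not allowed while the type is still incomplete:
 *   struct widget w;          (an object needs a known size)
 *   sizeof(struct widget)     (the size is unknown)
 *   w->id                     (the members are unknown)
 */

struct widget { int id; };          /* same tag with content: the type is now complete */

/* From here on, the pointer can be dereferenced. */
static int widget_id(const struct widget *w) {
    return w->id;
}
```

Everything above the completing declaration only ever handles the pointer value, which is why the compiler does not need to know what the struct contains.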

Incomplete types can offer some datatype encapsulation of sorts in C... The corresponding paragraph in http://www.ibm.com/developerworks/library/pa-ctypes1/ seems like a good explanation.

Pascal Cuoq
+1 for the explanation and reference to supporting docs.
paxdiablo
+1  A: 

The 1st and 2nd cases are well-defined, because the size and alignment of a pointer are known. The C compiler only needs the size and alignment info to define a struct.

The 3rd case is invalid because the size of that actual struct is unknown.

But beware that for the 1st case to be logical, you need to give a name to the struct:

//             vvv
typedef struct xyz {
    struct xyz *z;
} xyz;

otherwise the outer struct and the *z will be considered two different structs.
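As a rough illustration of the named form in use, here is a small linked-list sketch. The names node and list_sum are hypothetical; the point is only that the tag makes the member pointer refer to the enclosing struct.

```c
#include <stddef.h>

/* The tag `node` names the struct, so `struct node *` inside it is the same type. */
typedef struct node {
    int value;
    struct node *next;
} node;

/* Sum the values by following the self-referential pointers. */
static int list_sum(const node *head) {
    int total = 0;
    for (const node *n = head; n != NULL; n = n->next) {
        total += n->value;
    }
    return total;
}
```

Because node and struct node are the same type here, assigning the address of one node to another node's next member needs no cast and produces no diagnostic.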


The 2nd case has a popular use case known as "opaque pointer" (pimpl). For example, you could define a wrapper struct as

 typedef struct {
    struct X_impl* impl;
 } X;
 // usually just: typedef struct X_impl* X;
 int baz(X x);

in the header, and then in one of the .c files,

 #include "header.h"
 struct X_impl {
    int foo;
    int bar[123];
    ...
 };
 int baz(X x) {
    return x.impl->foo;
 }

The advantage is that outside that .c file, you cannot mess with the internals of the object. It is a kind of encapsulation.

KennyTM
+1 for the "it's actually different structs" info.
paxdiablo
+1  A: 

You do have to name it. In this:

typedef struct {
    struct xyz *z;
} xyz;

z will not be able to point to the struct itself, because it refers to a completely different type, not to the unnamed struct you just defined. Try this:

int main()
{
    xyz me1;
    xyz me2;
    me1.z = &me2;   // this will not compile
}

You'll get an error about incompatible types.

R Samuel Klatchko
I get a warning out of gcc (c rather than c++) but +1 for pointing out the fact they're actually _different_ types.
paxdiablo
A: 

I was wondering about this too. Turns out that the struct NOTHING_LIKE_xyz * z is forward declaring struct NOTHING_LIKE_xyz. As a convoluted example,

typedef struct {
    struct foo * bar;
    int j;
} foo;

struct foo {
    int i;
};

void foobar(foo * f)
{
    f->bar->i;
    f->bar->j;
}

Here f->bar refers to the type struct foo, not typedef struct { ... } foo. The first line will compile fine, but the second will give an error. Not much use for a linked list implementation then.

Scott Wales
It may be forward declaring, or struct foo may not be defined at all within the compilation unit, in which case it is an incomplete type.
Pascal Cuoq
+3  A: 

The parts of the C99 standard you are after are 6.7.2.3, paragraph 7:

If a type specifier of the form struct-or-union identifier occurs other than as part of one of the above forms, and no other declaration of the identifier as a tag is visible, then it declares an incomplete structure or union type, and declares the identifier as the tag of that type.

...and 6.2.5 paragraph 22:

A structure or union type of unknown content (as described in 6.7.2.3) is an incomplete type. It is completed, for all declarations of that type, by declaring the same structure or union tag with its defining content later in the same scope.
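Applied to the shape of the code in the question, the two paragraphs play out roughly as in this sketch. The names holder, struct payload and holder_value are invented for the example.

```c
/* `struct payload` is first mentioned here, so this declares an incomplete
   type and its tag `payload` at file scope. */
typedef struct {
    struct payload *data;
} holder;

/* Declaring the same tag with its content later in the same scope completes
   the type for every earlier declaration, including the member above. */
struct payload {
    int value;
};

/* The member pointer can now be dereferenced. */
static int holder_value(const holder *h) {
    return h->data->value;
}
```

If the struct payload definition were left out, the typedef would still compile, but holder_value would not, because it needs the members of the pointed-to type.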

caf
That's what I wanted to see, being an anal-retentive language lawyer :-) Although it's para8 in my copy (but I've got the one updated to TC3 so that may explain that).
paxdiablo
+1  A: 

Well... All I can say is that your previous assumption was incorrect. Every time you use a struct X construct (by itself, or as part of a larger declaration), it is interpreted as a declaration of a struct type with the struct tag X. It can be a re-declaration of a previously declared struct type, or it can be the very first declaration of a new struct type. The new tag is declared in the scope in which it appears. In your specific example that happens to be file scope (since the C language has no "class scope", as there would be in C++).

A more interesting example of this behavior is when the declaration appears in a function prototype:

void foo(struct X *p); // assuming `struct X` has not been declared before

In this case the new struct X declaration has function-prototype scope, which ends at the end of the prototype. If you declare a file-scope struct X later

struct X;

and try to pass a pointer of struct X type to the above function, the compiler will give you a diagnostic about non-matching pointer types

struct X *p = 0;
foo(p); // different pointer types for argument and parameter

This also immediately means that in the following declarations

void foo(struct X *p);
void bar(struct X *p);
void baz(struct X *p);

each struct X declaration is a declaration of a different type, each local to its own function prototype scope.

But if you pre-declare struct X as in

struct X;
void foo(struct X *p);
void bar(struct X *p);
void baz(struct X *p);

all struct X references in all the function prototypes will refer to the same previously declared struct X type.
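A small sketch of the pre-declared form carried through to a definition and a call. The function names bump and count_of and the count member are assumptions made for illustration only.

```c
struct X;                             /* pre-declaration at file scope */

/* Both prototypes refer to the file-scope struct X declared above. */
void bump(struct X *p);
int count_of(const struct X *p);

struct X { int count; };              /* completes the type declared above */

void bump(struct X *p) {
    p->count += 1;
}

int count_of(const struct X *p) {
    return p->count;
}
```

With the first line removed, each prototype would introduce its own struct X with function-prototype scope, and the later definitions would no longer match those prototypes.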

AndreyT