tags:

views: 108

answers: 7
+2  Q: 

C++ class ordering

I'm starting to play around with C++, coming from C and Objective-C (and a bit of Java). I thought a good place to start building my skills would be to write a simple hash table from scratch, using linked lists for collisions. So I started out by writing the skeletons for each class.

class HashTable
{
   public:
     ...
   private:
     ...
};

class LinkedList
{
   public:
     ...
   private:
     Node *root;
};

class Node
{
  public:
    Node *next;
    string key;
    int value;
    Node()
    {
      ...
    }
};

The weird thing about this, and it may come as no surprise to C++ users, is that this code wouldn't compile. I would get an error like:

error: expected type-specifier before ‘Node’

with respect to the root member in the LinkedList class.

When I simply reordered the classes so that it was Node{...}; LinkedList{...}; HashTable{...}; everything worked like a well oiled ice cream truck.

Now, I'm not one to question the design of C++, but is there any reason for this limitation? If I remember correctly, Objective-C's classes are essentially turned into tables and looked up on the fly. So what's the reason for this behavior?

+4  A: 

The compiler throws the following error

error: expected type-specifier before ‘Node’

because, at the point it reaches

Node *root;

it does not yet know what Node is, since Node is defined later.

Two possible solutions:

  • Put the definition of the Node class before the LinkedList class (you already know this).

  • Forward-declare the class Node before the LinkedList class by putting this line first:

    class Node;

    This tells the compiler that a class named Node exists.

After reading PigBen's comment, it seems you are questioning the rationale for this behavior. I am not a compiler person, but I think this behavior makes parsing easier. To me, it is similar to requiring a function declaration to be available before its use.

PS: A nitpick: for a LinkedList, a variable name like head may be more suitable than root.
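A minimal sketch of the forward-declaration option, reusing the class names from the question (the member names and constructors are illustrative, and std::string stands in for the question's string):

```cpp
#include <string>

class Node;                 // forward declaration: a class named Node exists

class LinkedList
{
  public:
    LinkedList() : head(nullptr) {}
  private:
    Node *head;             // a pointer to an incomplete type is allowed
};

class Node                  // the full definition arrives later
{
  public:
    Node *next;
    std::string key;
    int value;
    Node() : next(nullptr), value(0) {}
};
```

A pointer works here because the compiler only needs to know that Node is a class, not how big it is; a by-value member like Node head; would still fail until Node is fully defined.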

ArunSaha
If I'm reading his question correctly, I think he already understands this. His question is more about why the language is designed this way.
PigBen
A: 

You've declared that your LinkedList class contains a pointer to Node, but the compiler doesn't know what a Node is because it's yet to be declared.

Just declare Node before LinkedList.

Val
+3  A: 

The reason for this behavior is historical. The file is processed sequentially. At the time it comes across the first reference to an identifier, that identifier needs to have already been declared.

The compiler does not process the whole file first.

Instead of re-ordering the class definitions, you can often get away with a forward declaration:

class Node;

class List
{
    public:
    //...
    private:
    Node *root;
    //...
};

//...
Andrew Shelansky
A: 

This style means that the parser can run through the code fewer times. If it had to first identify every declared type and then run through the code again, it would spend extra time on a second pass through the file.

Of course, as so many have pointed out, you can use a forward declaration.

JoshD
+1  A: 

Think about it the other way around: if it's reasonable to accept a class declared anywhere in the file, why not a class declared in another file that has yet to be encountered?

If you go that far, then you end up not being able to give an error until you try to link the program, which may be far away from where the problem actually occurs.

Colen
-1 IMHO, I don't think this is a valid argument. C++ requires everything used to be defined in the one file (the preprocessor handles all #includes and so isn't part of the actual C++ compilation step, hence everything ends up in one virtual file). This can be seen as a fairly logical condition, as there is nothing to tell the compiler what other files are being compiled into the one binary (furthermore, sometimes they aren't compiled into an executable/library and the "object" file is distributed instead). (Continued...)
Grant Peters
(...continued) On the other hand, the compiler knows about all the data in the single virtual file (as that is all it's given), so it's not too unreasonable to assume that a modern compiler could do a first scan to discover all types and namespaces and build up its information over a couple of passes. The only problem I can see with adding this to the standard is that it would probably force a complete rewrite of most compilers, which just isn't feasible due to the thousands of man-hours that have gone into the optimization and development of said compilers.
Grant Peters
@Colen: C# (and Java) cope perfectly well with having classes defined in any order and in any file. There is nothing intrinsically impossible about it at all; you just have to make several passes.
pm100
Oh, it's not that it's impossible, but like Grant Peters said, it's not how C++ works and would take a major change to make it so.
Colen
+2  A: 

There is no technical reason for this limitation; that is proven by the fact that compilers do what you evidently expect within the context of a single class. Still, removing the "limitation" would complicate compilers further, slow them down, increase their memory usage, and, crucially, would not be backwards compatible (as matches in a more localised scope would presumably be selected over other symbols seen earlier).

IMHO, it also makes code harder to read and understand. Being able to read from top to bottom and comprehend the code as you go is very useful, and encourages more thoughtful and structured expression of your problem solution.
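Tony's first point, that compilers already do out-of-order lookup within a single class, can be seen in a sketch like this (the class and member names are invented for illustration):

```cpp
class Counter
{
  public:
    int next() { return ++count; }  // uses 'count', declared further down
  private:
    int count = 0;                  // declared after the function that uses it
};
```

Inline member function bodies are effectively parsed as if they appeared after the complete class definition, which is why this works inside a class but not at namespace scope.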

Tony
+4  A: 

The requirement for declarations of this sort comes from two forces. The first is that it simplifies compiler design. Since types and variables have the same identifier structure, the compiler must know which it is encountering whenever it parses an identifier. There are two ways to do this. One way is to require that every identifier be declared before it may be used in other definitions. This means that the code must forward-declare any name it intends to use before giving its definition. This is a very easy way to write a compiler for an otherwise ambiguous grammar.

The other way to do this is to handle it in multiple passes. Any time an undeclared identifier is encountered, it is skipped, and the compiler tries to resolve it once it's parsed the whole file. It turns out that the grammar of C++ makes this very difficult to do correctly. Compiler writers didn't want to have to go to this trouble, and so we have forward declarations.

The other reason is that you may actually want forward declarations, so that the validity of recursive structures is determinable as an intrinsic property of the language. This is a bit more subtle. Suppose you had written a mutually recursive pair of classes:

class Bar; // forward declaration
class Foo {
    Bar myBar;
};

class Bar {
    int occupySpace;
    Foo myFoo;
};

This is obviously impossible, because the occupySpace member would appear in an infinitely nested recursion. Requiring a declaration of everything used in a definition provides a specific amount of information here. In particular, it gives the compiler enough information to form a reference to a class, but not to instantiate the class (because its size is not known). The forward declarations make this a feature of the syntax of the language, much like how lvalues being assignable is a feature of the language syntax rather than a more subtle semantic or run-time requirement.
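A hedged sketch of the resolution this implies: turning the by-value member into a pointer breaks the infinite recursion, because a pointer's size is known even while the pointed-to class is still incomplete.

```cpp
class Bar;            // forward declaration: enough to form a pointer

class Foo
{
  public:
    Bar *myBar;       // sizeof(Foo) is computable without knowing sizeof(Bar)
};

class Bar
{
  public:
    int occupySpace;
    Foo myFoo;        // Foo is complete here, so a by-value member is fine
};
```

This is exactly the shape of the question's linked list: Node holds a Node *next pointer, so the list can refer to itself without the definition ever becoming infinitely nested.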

TokenMacGuy