I'm a moderately experienced Java / C# programmer, and I've recently started learning C++. The problem is, I'm having trouble understanding how to structure the various header and code files. This seems mostly due to my lack of understanding as to how the compiler links everything together. I've tried reading some textbooks, but my preconceptions are heavily colored by my Java and C# knowledge. For example, I'm having a hard time coming to grips with the fact that methods and the like can be defined in a namespace rather than only in a class definition.

I've found plenty of C++ -> Java/C# guides, but practically nothing to go the other way. Are there any good resources out there for easing the Java/C# -> C++ transition, particularly with respect to understanding the compiling process?

+4  A: 

This is something that confused me when I first started using C as well. Books don't do a good job of describing the correct use of headers vs. code files.

The compiler works by loading each .cpp file and compiling it independently of all the others. The first step in compilation is to load all of the headers referred to by #include directives. You can think of it as doing a textual insert of the whole foo.h wherever there is a #include "foo.h".

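What are the implications of this for how to structure your files? Header files should have whatever parts of the program are needed for other .cpp files to refer to. As a general rule, implementations should not be in header files: once such a header is included from more than one .cpp file, each one compiles its own copy of the implementation and the linker typically complains about duplicate definitions (inline functions and templates are the usual exceptions). Header files should include declarations of classes, functions, and global variables (if you must use them).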

Steve Rowe
"Header files should include declarations of classes": well, in fact, you *define* classes and *declare* their member functions in header files. However, *defining* those member functions and static members is done in source files. class foo; is a declaration, while class foo {}; is a definition.
Comptrol
+4  A: 

The C++ FAQ is an excellent resource about all the idiosyncrasies of C++, but it's probably a little more advanced than you're looking for -- most of the questions (not just the answers) are mysteries even to fairly experienced C++ developers.

I think if you google for C++ tutorials, you'll be able to find something. You may also want to try learning assembly language (or at least getting a quick introduction as to how things actually happen in a microprocessor), as both C and C++ are quite close to the hardware in the way they do things. This is where their speed and power come from, but it comes at the price of some of the nicer abstractions Java offers.

I can try to answer your specific questions asked above, but I don't know how well I'll do.

One of the keys to understanding the relationship between header files and cpp files is understanding the idea of a "translation unit". A Java class file can be considered a translation unit as it is the basic unit that is compiled into a binary form. In C++, pretty much every cpp file is a translation unit (there are exceptions if you're doing weird stuff).

A header file may be included in multiple translation units (and must be included everywhere that uses whatever is declared in the header). The #include directive literally just does a text substitution -- the contents of the included file are inserted verbatim where the #include directive is. You normally want your class interface to be defined in the header file, and the implementation in the cpp file. This is because you don't want to be exposing your implementation details to other translation units that may include the header.

In C++, nothing, classes included, is really a rich object at run time; everything is just a chunk of memory that the compiler assigns meaning to. By compiling the same header information into each translation unit, the compiler ensures that all the translation units have the same understanding of what a chunk of memory represents. Because so little type information survives compilation, Java-style reflection is not available.

The second step in the C++ build process is linking, which is where the linker takes all the compiled translation units and looks for symbols (usually function calls, but also variables) used in a translation unit but not defined there. It then looks for another translation unit that defines that symbol and "links" them together, so that all calls to a particular function are directed to the translation unit that defines it.

In the case of class methods, they must be called through a class instance, which is behind the scenes just a pointer to a piece of memory. When the compiler sees these types of method calls, it outputs code that calls a function, implicitly passing the pointer, known as the this pointer, to the function as the first argument. You can have functions that do not belong to classes (not methods, as you said, because a method is properly a member function of a class and thus cannot exist without a class) because the linker has no concept of a class. It will see a translation unit that defines a function and another that calls a function and tie them together.

That ended up being a lot longer than I expected, and of course is an oversimplification, but it is accurate to the best of my knowledge and the level of detail provided... hope it helps some. At least it should give you a starting point for some googling.

rmeador
+1  A: 

I would actually recommend staying away from explanations of C++ compilers and looking at explanations of C compilers. In my experience these are explained better and avoid confusing you with OOP issues. Look for material about C separate compilation. I would have referred you to a great slide booklet from my alma mater, but it's not in English.

The main difference between C compilation and Java/C# is that the compilation does not create a resolved entity. In other words, when you compile in Java, the compiler looks for already-compiled class files for any referenced classes, and makes sure that everything is available and consistent. The underlying assumption is that when you eventually run the program, those files would also be available.

A compiled C file, on the other hand, is just a "promise". It relies on a declaration of what the dependencies would look like (in the form of function declarations), but there are no guarantees that these are defined anywhere. The most difficult mindset switch that you need to make is to think of a C file not just as that file, but rather as the aggregation of that file with everything that it includes (i.e., what the preprocessor generates). In other words, the compiler does not see header files; it sees one large file. The compiler keeps track in the generated object file of everything that "is still missing". Later on, at link time, the linker settles this by trying to fill all the blanks with material from the different object files.

Uri
+1  A: 

You might want to know why compilation and linking are separate as well (since I don't see any posts explaining it, and not knowing the underlying reasons for things causes a lot of confusion).

Linking and compiling are done separately because (and there might be more than one reason) of the need to make library calls. If you #include a standard library header or any of its ilk, the code implementing the function prototypes in those headers is part of a library that is already compiled and sitting as object code somewhere. If a single giant compilation process were used instead, you'd need to have the source for those library calls, and compiling would take longer because you'd be compiling the library code too.

Chii