Hi,

I was trying out some C++ code while working with classes, and this question occurred to me and has been bugging me a little.

I have created a header file that contains my class definition and a cpp file that contains the implementation.

If I use this class in a different cpp file, why am I including the header file instead of the cpp file that contains the class implementations?

If I include the class implementation file, then the class header file should be imported automatically, right (since I've already included the header file in the implementation file)? Isn't this more natural?

Sorry if this is a dumb question. I'm genuinely interested in knowing why most people include .h instead of .cpp files when the latter seems more natural (I know Python somewhat, so maybe that's why it seems natural to me, at least). Is it just historical, or is there a technical reason concerning program organisation, or maybe something else?

+12  A: 

Because when you're compiling another file, the compiler doesn't actually need to know about the implementation. It only needs to know the signature of each function (which parameters it takes and what it returns), the name and members of each class, what macros are #defined, and other "summary" information like that, so that it can check that you're using functions and classes correctly. The contents of different .cpp files don't get put together until the linker runs.

For example, say you have foo.h

int foo(int a, float b);

and foo.cpp

#include "foo.h"
int foo(int a, float b) { /* implementation */ }

and bar.cpp

#include "foo.h"
int bar(void) {
    int c = foo(1, 2.1f);
    return c;
}

When you compile foo.cpp, it becomes foo.o, and when you compile bar.cpp, it becomes bar.o. The two files are compiled separately, yet the definition of function foo() in foo.cpp has to agree with the usage of function foo() in bar.cpp (i.e. it takes an int and a float and returns an int). The compiler ensures this by having you include the same header file in both .cpp files: if the definition and the usage each agree with the declaration in the header, then they must agree with each other.

But the compiler doesn't actually include the implementation of foo() in bar.o. It just includes an assembly language instruction to call foo. So when it creates bar.o, it doesn't need to know anything about the contents of foo.cpp. However, when you get to the linking stage (which happens after compilation), the linker actually does need to know about the implementation of foo(), because it's going to include that implementation in the final program and replace the call foo instruction with a call 0x109d9829 (or whatever it decides the memory address of function foo() should be).

Note that the linker does not type-check the use of foo() (in bar.o) against the implementation of foo() (in foo.o). It matches symbols by name and has no idea whether foo() is being called with an int and a float parameter! (C++ compilers do encode the parameter types in the symbol name, so a mismatched parameter list usually shows up as an unresolved symbol, but that is name matching, not type checking.) It's kind of hard to do that sort of check in assembly language (at least, harder than it is to check the C++ source code), so the linker relies on knowing that the compiler has already checked that. And that's why you need the header file, to provide that information to the compiler.
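As a rough single-file sketch of the compile/link split described in this answer: the call inside bar() is checked against the declaration alone, and the definition only appears further down, where foo.cpp would supply it. The body of foo() and the check_bar() helper are made up for illustration and are not part of the original example.

```cpp
#include <cassert>

// What foo.h contributes: a declaration only, no body.
int foo(int a, float b);

// What bar.cpp contains: the call is type-checked against the
// declaration above; the body of foo() is not visible at this point.
int bar() {
    int c = foo(1, 2.5f);
    return c;
}

// What foo.cpp contains: the definition that the linker later connects
// to the call inside bar(). This body is a placeholder assumption.
int foo(int a, float b) {
    return a + static_cast<int>(b);  // 2.5f is truncated to 2
}

// Hypothetical helper: confirms the call reaches the definition.
void check_bar() {
    assert(bar() == 3);
}
```

In a real project the three parts would sit in foo.h, bar.cpp and foo.cpp, and the last step would be done by the linker rather than by the compiler reading further down the same file.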

David Zaslavsky
That cleared everything up, thank you very much for a thorough and detailed answer.
jimbo
A: 

One technical reason is compilation speed. Suppose your class uses 10 other classes (e.g. as types for member variables). Including the long .cpp files for all 10 classes would make your class compile much slower (say, 2 seconds instead of 1 second).

Another reason is hiding the implementation. Suppose you are writing a class to be used by 10 other teams in your company. Everything they have to know and learn about your class is in the .h file (the public interface). You can freely do whatever you want in the .cpp file (the implementation) and change it as often as you want; they won't care. But if you change the .h file, they may have to adjust their code that uses your class.
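A minimal sketch of that interface/implementation split, squeezed into one file for brevity. The Counter class, the file names in the comments, and the count_twice() and check_counter() helpers are all hypothetical.

```cpp
#include <cassert>

// --- would live in counter.h (hypothetical): what other teams see ---
class Counter {
public:
    void increment();
    int value() const;
private:
    int count_ = 0;
};

// --- would live in counter.cpp (hypothetical): the implementation ---
// These bodies can be rewritten without touching the header.
void Counter::increment() { ++count_; }
int Counter::value() const { return count_; }

// --- would live in a client's .cpp: relies on the interface only ---
int count_twice() {
    Counter c;
    c.increment();
    c.increment();
    return c.value();
}

// Hypothetical helper: checks the client code against the interface.
void check_counter() {
    assert(count_twice() == 2);
}
```

Note that private data members still appear in the header, so changing them (as opposed to changing a method body) does force client code to be recompiled.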

For each method body, it's your choice whether to put it in the .h file or in the .cpp file. If it's in the .h file, the compiler can inline it at the call site, which may make the code a bit faster. But compilation will be slower, the temporary .o (.obj) files may become larger (because each of them will contain the compiled method body), and the program binary (.exe) may become larger, because the function body takes up space once for every place it is inlined.
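To illustrate the choice, here is a small sketch with one method body in the header and one in the .cpp file. The Point class, the twice() function, the file names in the comments, and the check_point() helper are assumptions made up for this example.

```cpp
#include <cassert>

// --- would live in point.h (hypothetical header) ---
class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}

    // Body in the header: defined inside the class, so it is implicitly
    // inline and visible to every file that includes the header.
    int x() const { return x_; }

    // Declaration only: the body lives in the .cpp file.
    int manhattan() const;

private:
    int x_;
    int y_;
};

// A free function defined in a header needs `inline`, so that several
// .cpp files can include it without a duplicate-definition link error.
inline int twice(int v) { return 2 * v; }

// --- would live in point.cpp (hypothetical) ---
int Point::manhattan() const {
    int ax = x_ < 0 ? -x_ : x_;
    int ay = y_ < 0 ? -y_ : y_;
    return ax + ay;
}

// Hypothetical helper: exercises both kinds of method.
void check_point() {
    Point p(3, -4);
    assert(p.x() == 3);
    assert(p.manhattan() == 7);
}
```

Whether the compiler actually inlines x() or twice() is up to the compiler; putting the body in the header only makes inlining possible at each call site.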

pts
The true reason is not speed; it is that it will not even link! You would get a "symbol already defined in object" error!
Lorenzo
@Lorenzo: I don't think there is a "true reason". You are right that #including a .cpp file results in a linker error in most situations. My point is that if C++ compilation had been designed so that .cpp files had to be #included, total project compilation would be much slower.
pts
+1  A: 

The magic is done by the linker. Every .cpp file, when compiled, produces an intermediate object file containing a table of the symbols it exports (defines) and imports (uses but does not define). The linker reconciles these tables. In other words, you only have to include the header; whenever your code references something from the included class, the compiler records that reference as an imported symbol in the object file's symbol table.

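If you include the .cpp file instead, the same code gets compiled into two object files (assuming the included .cpp file is also compiled on its own, as it normally is). You will then get a linker error, because the same symbol is defined twice and the linker cannot tell which definition to use.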

Lorenzo