views: 123
answers: 4

In C programming, is there any way to determine how much a single source code file contributes to the final memory footprint?

Let us assume a simple C program consisting of source files test1.c, test2.c, test3.c, and so on. The environment is Linux and the compiler is gcc.

With objdump and readelf I can see the total footprint and how the binary is distributed across the .text, .data, and .bss segments. But is it possible to see how much binary code is generated from test1.c, how much from test2.c, and so on?
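For illustration, the whole-binary view I mean is roughly this (exact invocations may differ):

readelf -S ./a.out    # section headers and sizes of the final binary
objdump -h ./a.out    # same information, different tool

Neither output says anything about which source file the bytes came from.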

+4  A: 

No, there isn't. Most memory is allocated at run-time and cannot be deduced from examining the source files. For example, given this code:

int n;
cin >> n;
char * p = new char[n];

there is no way that examining the source can tell you how much memory will be allocated when the program is executed.

anon
Actually valgrind makes it very easy to find out where heap memory is allocated: http://valgrind.org/docs/manual/ms-manual.html. But I'm far from sure this is even what the question is asking.
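For reference, a minimal massif run looks roughly like this (the output file name contains the process id):

valgrind --tool=massif ./a.out
ms_print massif.out.<pid>    # text report of heap usage by allocation site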
Jason Orendorff
+1  A: 

That's a very strange question. Taking it at face value, you simply need to look at the .obj/.o files that are generated when you compile. Their sizes tell you roughly how much code each module contributes.
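A minimal sketch, assuming the objects from the question have been built with gcc -c:

gcc -c test1.c test2.c test3.c
size test1.o test2.o test3.o    # text/data/bss per object file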

However, this doesn't take into account any memory that's allocated when the program runs. Nor does it account for the fact that parts of the program that aren't currently running aren't necessarily kept in memory.

If you're concerned about writing lots of code and it taking up all your memory, don't worry about it. It can't happen. :)

Mike Caron
+3  A: 

The question title and the contents seem to point in different directions.

If your question is how much memory your application will require at runtime per source code file, that is undecidable in the general case. It depends on inputs that you cannot control: unless you work only with constants, you cannot know how deep the recursion will go (stack required) or how much dynamic memory you will request, as both depend on runtime information, that is, on the inputs.

If your question is how much code in the final binary comes from each of the files, that you can find out if you have enough of an interest. The zeroth approximation is checking the size of the .o files that the compiler generates. That approximation is rather rough, as the linker can remove unused symbols from the object files at link stage. Then you can get fancier and inspect the symbols in the final executable, looking for those symbols in each one of the object files. This provides much better information, but requires much more work.
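A sketch of the fancier route, using a linker map file (the names app and app.map are just placeholders):

# the map records which input object contributed each section/symbol
gcc -o app test1.o test2.o test3.o -Wl,-Map=app.map
# symbol sizes in the final binary, to cross-check against the test*.o files
nm --size-sort --print-size app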

David Rodríguez - dribeas
Your second paragraph answers exactly what I wanted to ask. Thanks a lot for this answer :)
ika
+1  A: 

No, fundamentally not.

For instance, take two source files that both contain the string "Hello, world\n". Most linkers are able to fold such string literals, so only one copy remains; how should it be accounted for? A similar thing happens even for functions: std::vector<int>::push_back(int) and std::vector<long>::push_back(long) may generate identical executable code, and the linker may keep only one instance.
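One way to see this on a GNU toolchain (a sketch; section names and merging behaviour depend on compiler flags): put the same literal in test1.c and test2.c, link, and dump the read-only strings of the result:

gcc -O2 -o demo test1.c test2.c test3.c    # demo is just a placeholder name
readelf -p .rodata demo    # "Hello, world\n" typically shows up only once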

Furthermore, consider vector<int>::push_back(int) again. It actually comes from a header, <vector>, which is included in many .cpp files. But the compiler typically doesn't record that at all: test1.o contains the code generated from everything test1.cpp includes as well.

MSalters