views:

474

answers:

4

I'm working on a large C++ library that has grown significantly in size recently. Because the library is so large, it is not obvious what caused the increase.

Do you have any suggestions for tools (MSVC or GCC) that could help determine where the growth has come from?

Edit: Things I've tried: running dumpbin on the final DLL and on the obj files, and creating a map file and ripping through it.

Edit again: objdump along with a Python script seems to have done what I want.

A: 

G'day,

If you have any previous versions of the object file lying around, can you run the size command on the old and new versions to see which segment has grown?

Couple of questions:

  • Are you on a *nix platform or a Windows platform?
  • Which compiler are you using?
  • Was the compiler recently changed?
  • Was the -g flag added recently? (obvious question 1)
  • Was the object previously stripped? (obvious question 2)
  • Was the object dynamically linked previously? (obvious question 3)

Edit: If the code is under SCM, can you check out a version of the source that gave you the smaller object? Then compare:

  1. the size of the source trees by doing a du -sk on the old source tree and then the new source tree without having built anything.
  2. the number of files by doing something like find ./tree_top \( -name '*.h' -o -name '*.cpp' \) | wc -l
  3. the location of an increased number of files by doing a find ./tree_top \( -name '*.h' -o -name '*.cpp' \) -print | sort > treelist and then doing the same for the new, larger tree. A simple sdiff of the two lists will show any large number of new files.
  4. the size of the code base; even a simple count of trailing semicolons will give you a good basic mechanism for comparison between the two.
  5. the Makefiles or build environment for the project, to see if different options or settings have crept into the build itself.
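
Steps 2 and 3 can also be done in one go on platforms without find and sdiff. The following is only a rough Python sketch: the function names `list_sources` and `new_files` and the extension list are made up for illustration, and it assumes both source trees are checked out side by side.

```python
import os

# Assumption: only .h and .cpp files matter, as in the find commands above.
SOURCE_EXTS = (".h", ".cpp")


def list_sources(tree_top):
    """Return the set of source file paths under tree_top, relative to tree_top."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(tree_top):
        for name in filenames:
            if name.endswith(SOURCE_EXTS):
                full_path = os.path.join(dirpath, name)
                found.add(os.path.relpath(full_path, tree_top))
    return found


def new_files(old_top, new_top):
    """Return a sorted list of source files that exist only in the new tree."""
    return sorted(list_sources(new_top) - list_sources(old_top))
```

Calling `new_files("old_tree", "new_tree")` (both directory names are placeholders) gives the list of added files to look at first. It says nothing about files that grew in place, so it complements the other steps rather than replacing them.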

HTH

BTW Please post your findings here as I'm sure many people are interested in what you find out.

cheers,

Rob Wells
Both platforms, msvc90 and gcc4, no, no, no, no.
Donblas
That's not really an answer
Igor Oks
@Igor, that's not really enough info to provide an answer! (-:
Rob Wells
I'm looking to find the specific places that increased final file size, not all the places that have changed. I could have added one place where we're instancing a huge number of templates, for example, that wouldn't show up with that method.
Donblas
+3  A: 

If gcc, objdump. If visual studio, dumpbin.

I'd suggest doing a diff of the output of the tool for the old (small) library, vs. the new (large) library.

KeyserSoze
I tried dumpbin. I show a pretty consistent increase in both data and code sections, so that didn't tell me much.
Donblas
Did you run dumpbin without arguments? IIRC that outputs only a summary, while dumpbin /all is quite a bit more verbose: http://msdn.microsoft.com/en-us/library/kw9f57kb(VS.71).aspx I'd suggest piping dumpbin /all to a file for both binaries and diffing them, if you haven't done that already.
KeyserSoze
Umm, /all dumps the entire object file's disassembly to disk. I get a text file with half a million lines. That's impossible to diff well. I was using /headers to get an idea of what increased.
Donblas
Sorry, I didn't try /all myself; I only read the link I provided, which indicated it wouldn't output the disassembly. With objdump, though, I had good luck dumping large amounts of information and then sorting it programmatically to find where the large sizes come from.
KeyserSoze
If the C++ function name mangling is consistent from your old library to your new one, I'd be tempted to try writing a perl script that populates a hash table with hash key of the symbol name, hash value of the size of the symbol, for the old library. Then check if symbols from the new library are in the hash, only print them if they are at least 10% bigger, and sort the output based on the size. Hopefully then you'll get a short-ish list of things to investigate manually.
KeyserSoze
objdump looks promising. Hacking up a tool to parse its output. I'll let you know if it works.
Donblas
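
For anyone wanting a starting point, the symbol-size comparison described in the comments above could look roughly like this. It is only a sketch, with several assumptions: it is in Python rather than Perl, it reads the text output of GNU `nm -S --size-sort` (address, size, type, name per line) instead of objdump because that format is simpler to parse, and the function names `parse_nm_sizes` and `grown_symbols` are invented for illustration. Unlike the original suggestion, it also reports symbols that are entirely new, since new template instantiations would show up that way.

```python
import re

# One line of `nm -S` output: <hex address> <hex size> <type letter> <name>
NM_LINE = re.compile(r"^([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+(\S)\s+(.*)$")


def parse_nm_sizes(text):
    """Map symbol name -> size in bytes; lines without a size field are skipped."""
    sizes = {}
    for line in text.splitlines():
        match = NM_LINE.match(line.strip())
        if match is None:
            continue  # e.g. undefined symbols, which have no address or size
        _address, size_hex, _kind, name = match.groups()
        # Sum sizes in case the same name appears more than once.
        sizes[name] = sizes.get(name, 0) + int(size_hex, 16)
    return sizes


def grown_symbols(old, new, min_growth=0.10):
    """Return (name, old_size, new_size) tuples, biggest byte increase first.

    A symbol is reported if it is new, or if it grew by at least min_growth
    (10% by default) relative to its old size.
    """
    report = []
    for name, new_size in new.items():
        old_size = old.get(name)
        if old_size is None:
            report.append((name, 0, new_size))  # symbol only in the new library
        elif new_size > old_size and new_size - old_size >= min_growth * old_size:
            report.append((name, old_size, new_size))
    report.sort(key=lambda row: row[2] - row[1], reverse=True)
    return report
```

To use it, run nm on both libraries, read each output file into a string, pass each through `parse_nm_sizes`, and hand the two dictionaries to `grown_symbols`. As noted above, this only works if the name mangling is consistent between the two builds.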
+1  A: 

On Linux it should be quite easy to see whether new files have been added by running a recursive diff. New files would certainly increase the library size. You can then use the size command-line tool to get the size of each of the new object files and sum them up. Then compare that sum to the increase in your library size and check how much they differ.

lothar
The size of the object files does not necessarily match the main dll size increase. If you have template instantiations in separate files, each obj file will have a copy but the linker will throw many of them away.
Donblas
@iferrorthrownewbrick That may be so, but new files almost certainly contribute "something" to the library, as they simply were not there in the last version. ;-)
lothar
A: 

KeyserSoze's answer (compare the output of objdump or dumpbin) is correct. Another approach is to tell the linker to produce a map file, and compare the map files for the old and new versions of the DLL.

  • MSVC: link.exe /MAP
  • GCC and binutils: ld -M (or gcc -Wl,-M)
bk1e