How do you account for the difference between the size in bytes of a compiled ELF file as reported by wc (relatively large) and size (sum total of sections in file - relatively small) under Linux?

Edit: For example, compile a very simple C++ program using g++ and run 'wc myexe' and 'size myexe'. wc may return, say, 500B, whilst size may return a total of 100 bytes for all sections.

Edit II: I understand that the two commands do different things; sorry, I should have said that I'm not looking for the answer 'because they're different'. I want to know what exactly accounts for the difference. Why should the wc byte count be so much larger than the total size of the sections, which, after all, make up the executable part of the code?

+2  A: 

Correct me if I'm wrong, but don't wc and size do different things? wc returns the number of characters, words and lines in a file, but size returns the bytes in each section. This, then, would account for the difference.

SteveJ
Yes, that's true but I'm looking for a deeper answer - what are the bytes returned by wc if they aren't recorded as being part of a section? Why are those extra bytes there?
obsidiank
+1  A: 

My random guess would be the ELF headers. I'm only putting that random guess here so people stop saying wc and size do different things, and point people who know the real answer in the right direction.
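To put a rough number on that guess, here is a minimal Python sketch that adds up the fixed header sizes for a 64-bit little-endian ELF file. The struct format strings mirror the field order of Elf64_Ehdr, Elf64_Phdr and Elf64_Shdr in /usr/include/elf.h; the function name and the header counts in the usage line are made up for illustration, not taken from any real binary.

```python
import struct

# ELF64 little-endian layouts, following the field order in /usr/include/elf.h.
EHDR_FMT = "<16sHHIQQQIHHHHHH"   # Elf64_Ehdr: the file header
PHDR_FMT = "<IIQQQQQQ"           # Elf64_Phdr: one program header entry
SHDR_FMT = "<IIQQQQIIQQ"         # Elf64_Shdr: one section header entry


def header_overhead(num_program_headers, num_section_headers):
    """Bytes of ELF64 bookkeeping that belong to no section at all.

    Hypothetical helper: the two counts are inputs you would read from
    the file header (e_phnum and e_shnum) of a real executable.
    """
    return (struct.calcsize(EHDR_FMT)
            + num_program_headers * struct.calcsize(PHDR_FMT)
            + num_section_headers * struct.calcsize(SHDR_FMT))


if __name__ == "__main__":
    # Assumed counts, purely illustrative: 2 program headers, 5 section headers.
    print(header_overhead(2, 5))
```

With those assumed counts the headers alone come to a few hundred bytes, so a tiny program can easily have more bookkeeping than code.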

Omnifarious
+1  A: 

I just tested both, and apparently size only shows the size of the code, data and bss sections. There are generally lots of other sections (plus headers), which you see with:

readelf --sections file

[Edit]

Among other things, here is what takes space in your file:

  • Constructors and destructors,
  • The dynamic symbol table, which lists symbols which will be resolved at runtime,
  • I believe .init and .fini contain runtime initialization and finalization.

[Edit]

For more information on the ELF format, read:
http://www.iecc.com/linker/ (chapter 3)
http://www.sco.com/developers/gabi/2003-12-17/contents.html

Bastien Léonard
Ok, thanks. So, the symbol table isn't included in 'size'? Have you got a link that describes the full contents of a file section by section, flags, tables etc.? The original file was actually some assembly, so no constructors/destructors.
obsidiank
I've updated my answer.
Bastien Léonard
+2  A: 

wc just counts the number of bytes in a file (and words and lines). It works on any normal files, not just object and executable files.

size parses the headers of an object or executable file and shows information about the segments that the author of size thought would be useful, back when size was written (hint - long before Linux was born!).

readelf and some of the other binutils programs read and parse the more modern ELF format files and show more info, including sections that size doesn't know about.

If you really want to understand what is going on under the hood, you can write your own readelf-like program, starting with /usr/include/elf.h, and parse the files for yourself. :)
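As a starting point, here is a rough sketch of such a program, written in Python with struct rather than in C against elf.h. It assumes a 64-bit little-endian ELF file, ignores the extended section numbering used by files with very many sections, and the function name and category names are invented for this example. It splits the file's bytes into the ELF header, the program header table, the section header table, sections that are loaded at run time, and sections that are not; whatever is left over is alignment padding or something the sketch does not understand.

```python
import struct

EHDR_FMT = "<16sHHIQQQIHHHHHH"   # Elf64_Ehdr, field order as in elf.h
SHDR_FMT = "<IIQQQQIIQQ"         # Elf64_Shdr, field order as in elf.h
SHT_NOBITS = 8                   # section takes no space in the file (e.g. .bss)
SHF_ALLOC = 0x2                  # section is loaded into memory at run time


def account_for_bytes(data):
    """Split the bytes of a 64-bit little-endian ELF image into categories."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF file")

    (_ident, _type, _machine, _version, _entry, _phoff, shoff, _flags,
     ehsize, phentsize, phnum, shentsize, shnum,
     _shstrndx) = struct.unpack_from(EHDR_FMT, data, 0)

    totals = {
        "elf_header": ehsize,
        "program_headers": phentsize * phnum,
        "section_headers": shentsize * shnum,
        "loaded_sections": 0,
        "unloaded_sections": 0,
    }

    # Walk the section header table and add up each section's size in the file.
    for i in range(shnum):
        (_name, sh_type, sh_flags, _addr, _offset, sh_size,
         _link, _info, _align, _entsize) = struct.unpack_from(
            SHDR_FMT, data, shoff + i * shentsize)
        if sh_type == SHT_NOBITS:
            continue  # occupies memory at run time but no bytes in the file
        key = "loaded_sections" if sh_flags & SHF_ALLOC else "unloaded_sections"
        totals[key] += sh_size

    # Anything not covered above: alignment gaps or data this sketch ignores.
    totals["padding_or_other"] = len(data) - sum(totals.values())
    return totals


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        for name, size in account_for_bytes(f.read()).items():
            print(f"{name:18} {size}")
```

Run it with the path to an executable as the only argument and compare the categories with what wc -c and size report for the same file.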

John Grieggs
great answer, thanks john :)
obsidiank