test.c:

int main()
{
    return 0;
}

I haven't used any flags (I am a newb to gcc), just the command:

gcc test.c

I have used the latest TDM build of GCC on win32. The resulting executable is almost 23KB, way too big for an empty program.

How can I reduce the size of the executable?

+8  A: 

By default, some standard libraries (e.g. the C runtime) are linked into your executable. Check out the options -nostdlib, -nostartfiles and -nodefaultlibs for details. Link options are described here.

For a real program, a second option is to try the optimization options, e.g. -Os (optimize for size).

Kirill V. Lyadvinsky
This is correct, but usually you _want_ those libraries.
Kristof Provost
That's right. I've only used these options for embedded systems.
Kirill V. Lyadvinsky
What do you recommend to start with? (I am new to GCC, but I have used C a lot in VisualCpp before)
Richard J. Terrell
If you're familiar with C, it is appropriate to start by learning the differences between gcc and VisualCpp.
Kirill V. Lyadvinsky
Exactly, Kristof. It's rather pointless. Learning how to make empty programs as small as possible doesn't necessarily translate into knowledge of how to make non-trivial programs small. All you're left with is a bunch of empty programs. Focus on getting something *worth* fine-tuning, first.
Rob Kennedy
+20  A: 

How can I reduce its size?

  • Don't do it. You're just wasting your time.
  • Use -s flag to strip symbols (gcc -s)
maykeye
+6  A: 

Actually, if your code does nothing, is it even fair that the compiler still creates an executable? ;-)

Well, on Windows any executable would still have a size, although it can be reasonably small. With the old MS-DOS system, a complete do-nothing application would just be a couple of bytes. (I think four bytes to use the 21h interrupt to close the program.) Then again, those applications were loaded straight into memory. When the EXE format became more popular, things changed a bit. Now executables had additional information about the process itself, like the relocation of code and data segments plus some checksums and version information. The introduction of Windows added another header to the format, to tell MS-DOS that it couldn't execute the executable since it needed to run under Windows. And Windows would recognize it without problems. Of course, the executable format was also extended with resource information, like bitmaps, icons and dialog forms and much, much more.

A do-nothing executable would nowadays be between 4 and 8 kilobytes in size, depending on your compiler and every method you've used to reduce its size. It would be at a size where UPX would actually result in bigger executables! Additional bytes in your executable might be added because you added certain libraries to your code. Especially libraries with initialized data or resources will add a considerable amount of bytes. Adding debug information also increases the size of the executable.

But while this all makes a nice exercise at reducing size, you could wonder if it's practical to just continue to worry about bloatedness of applications. Modern hard disks will divide files up in segments and for really large disks, the difference would be very small. However, the amount of trouble it would take to keep the size as small as possible will slow down development speed, unless you're an expert developer who is used to these optimizations. These kinds of optimizations don't tend to improve performance and considering the average disk space of most systems, I don't see why it would be practical. (Still, I do optimize my own code in similar ways but then again, I am experienced with these optimizations.)


Interested in the EXE header? It starts with the letters MZ, for "Mark Zbikowski". The first part is the old-style MS-DOS header for executables and is used as a stub to MS-DOS saying the program is not an MS-DOS executable. (In the binary, you can find the text 'This program cannot be run in DOS mode.' which is basically all it does: displaying that message.) Next is the PE header, which Windows will recognise and use instead of the MS-DOS header. It starts with the letters PE for Portable Executable. After this second header there will be the executable itself, divided in several blocks of code and data. The header contains special relocation tables which tell the OS where to load a specific block. And if you can keep this to a limit, the final executable can be smaller than 4 KB, but 90% would then be header information and no functionality.
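The two-header layout described above can be checked by hand. Here is a minimal sketch, assuming the whole file has already been read into a memory buffer. The function name find_pe_header is made up for illustration; the one format detail it relies on is that the MS-DOS header stores the file offset of the PE signature as a 32-bit little-endian value at offset 0x3C.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset, inside the MS-DOS header, of the 32-bit little-endian
   field that holds the file offset of the PE signature. */
#define E_LFANEW_OFFSET 0x3C

/* Hypothetical helper: returns the offset of the "PE\0\0" signature
   inside buf, or -1 if buf does not look like a PE executable. */
long find_pe_header(const unsigned char *buf, size_t len)
{
    uint32_t off;

    /* The MS-DOS header is 64 bytes long and starts with "MZ". */
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;

    /* Assemble the offset byte by byte so host endianness is irrelevant. */
    off = (uint32_t)buf[E_LFANEW_OFFSET]
        | (uint32_t)buf[E_LFANEW_OFFSET + 1] << 8
        | (uint32_t)buf[E_LFANEW_OFFSET + 2] << 16
        | (uint32_t)buf[E_LFANEW_OFFSET + 3] << 24;

    /* The four signature bytes must lie inside the buffer. */
    if (off > 0x7FFFFFFF || off > len || len - off < 4)
        return -1;

    if (buf[off] != 'P' || buf[off + 1] != 'E' ||
        buf[off + 2] != 0 || buf[off + 3] != 0)
        return -1;

    return (long)off;
}
```

Calling it on the bytes of a.exe should return the offset where the PE header begins. Everything before that offset is the MS-DOS header plus the stub that prints the DOS-mode message.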

Workshop Alex
As for a DOS application, a simple ret will do. That is, 1 byte.
Bruno Reis
A ret would do, but the official rule was that you had to call the "Exit" interrupt.
Workshop Alex
I've built real Windows executables (PE format) that do useful things in <4KB, using VS2005. So a do-nothing executable certainly doesn't have to be 8KB. (Why? Autorun checker for a CD, don't start a large installer EXE if app is already installed)
MSalters
The code does not do nothing - it returns zero to the environment.
Jonathan Leffler
+2  A: 

What is the purpose of this exercise?

Even with as low-level a language as C, there's still a lot of setup that has to happen before main can be called. Some of that setup is handled by the loader (which needs certain information), some is handled by the code that calls main. And then there's probably a little bit of library code that any normal program would have to have. At the least, there are probably references to the standard libraries, if they are in DLLs.

Examining the binary size of the empty program is a worthless exercise in and of itself. It tells you nothing. If you want to learn something about code size, try writing non-empty (and preferably non-trivial) programs. Compare programs that use standard libraries with programs that do everything themselves.

If you really want to know what's going on in that binary (and why it's so big), then find out the executable format, get a binary dump tool, and take the thing apart.

Michael Kohne
Given that you don't know the OP's motivations, that's simply not true. He might be interested in getting into embedded development, where code size matters a lot, for instance.
Novelocrat
Code size of the empty program is still completely irrelevant. And if he's into embedded programming where size of program matters, then anything he does fooling with a windows compiler is irrelevant.
Michael Kohne
Code size of an empty program is not irrelevant when 1. you code demos, 2. you are interested in how the compilation works and what ends up in the final executable, 3. and finally when you know that an empty program should not be ~23KB. There might be no obvious uses of something like this, but it doesn't make learning about the compiler flags irrelevant.
Richard J. Terrell
And in the end I have learnt why it was ~23KB originally.
Richard J. Terrell
Richard, why do you code empty programs as demos? And if you don't know what's in the final executable, then how do you know it shouldn't be 23 K? And if you haven't learned why it was 23 K, then perhaps it's because you never asked.
Rob Kennedy
+31  A: 

Don't follow its suggestions, but for amusement's sake, read this 'story' about making the smallest possible ELF binary.

Novelocrat
+1 That's a fun read!
Tyler McHenry
Shit, this wasn't supposed to be taken seriously. It's now the most up-voted answer I've given!
Novelocrat
The article linked in http://stackoverflow.com/questions/553029/what-is-the-smallest-possible-windows-pe-executable is also interesting.
bk1e
@Novelocrat Yeah, I upvoted because the link you posted was very interesting, not because I think the OP should do anything like this. I *hope* most of the other upvotes were for the same reason.
Tyler McHenry
+3  A: 

I like the way the DJGPP FAQ addressed this many many years ago:

In general, judging code sizes by looking at the size of "Hello" programs is meaningless, because such programs consist mostly of the startup code. ... Most of the power of all these features goes wasted in "Hello" programs. There is no point in running all that code just to print a 15-byte string and exit.

Sinan Ünür
The whole point of the empty program is to see the overhead. I'm simply interested in how the compilation works and what ends up in a compiled binary aside from the code I put there.
Richard J. Terrell
Richard, that's not at all what you asked in your question. You asked how to get rid of the overhead. You didn't ask what the overhead consisted of.
Rob Kennedy
Sure, but the method of getting rid of it tells me what was there.
Richard J. Terrell
A: 

Using GCC, compile your program using -Os rather than one of the other optimization flags (-O2 or -O3). This tells it to optimize for size rather than speed. Incidentally, it can sometimes make programs run faster than the speed optimizations would have, if some critical segment happens to fit more nicely. On the other hand, -O3 can actually induce code-size increases.

There might also be some linker flags telling it to leave out unused code from the final binary.

Novelocrat
-Os makes no difference to this code.
Norman Ramsey
Unsurprising, in this case. There's not much code that GCC is actually touching here.
Novelocrat
+9  A: 

Give up. On x86 Linux, gcc 4.3.2 produces a 5K binary. But wait! That's with dynamic linking! The statically linked binary is over half a meg: 516K. Relax and learn to live with the bloat.

And they said Modula-3 would never go anywhere because of a 200K hello world binary!


In case you wonder what's going on, the GNU C library is structured so as to include certain features whether your program depends on them or not. These features include such trivia as malloc and free, dlopen, some string processing, and a whole bucketload of stuff that appears to have to do with locales and internationalization, although I can't find any relevant man pages.

Creating small executables for programs that require minimum services is not a design goal for glibc. To be fair, it has also not been a design goal for any run-time system I've ever worked with (about half a dozen).

Norman Ramsey
+1  A: 

Run strip on the binary to get rid of the symbols. With gcc version 3.4.4 (cygming special) I drop from 10K to 4K.

You can try linking a custom run time (the part that calls main) to set up your runtime environment. All programs use the same one that comes with gcc to set up the runtime environment, but for your executable you don't need data or zeroed memory. That means you could get rid of unused library functions like memset/memcpy and reduce the CRT0 size. When looking for info on this, look at GCC in embedded environments. Embedded developers are generally the only people who use custom runtime environments.

The rest is overhead for the OS that loads the executable. You are not going to save much there unless you tune that by hand.
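To make the "part that calls main" concrete, here is a rough sketch of what a minimal startup routine typically does on a bare-metal target: copy the initial values of the data section into place, zero the bss, then call main. It is written as ordinary hosted C so it can be compiled and tried anywhere. The names struct image, sketch_crt0 and empty_main are invented for this illustration and are not taken from any real crt0.

```c
#include <stddef.h>

typedef int (*main_fn)(int argc, char **argv);

/* Hypothetical description of the memory a startup routine prepares. */
struct image {
    const unsigned char *data_init; /* initial values as stored in the file */
    unsigned char *data;            /* run-time location of initialized data */
    size_t data_len;
    unsigned char *bss;             /* run-time location of zeroed data */
    size_t bss_len;
};

/* Sketch of the startup sequence; plain loops stand in for the
   memcpy/memset calls a real runtime would normally pull in. */
int sketch_crt0(const struct image *img, main_fn entry, int argc, char **argv)
{
    size_t i;

    /* Copy the initial values of the data section into place. */
    for (i = 0; i < img->data_len; i++)
        img->data[i] = img->data_init[i];

    /* Zero the bss section. */
    for (i = 0; i < img->bss_len; i++)
        img->bss[i] = 0;

    /* Hand control to the program and pass its return value back. */
    return entry(argc, argv);
}

/* Stand-in for the empty program from the question. */
int empty_main(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return 0;
}
```

For a program like empty_main, both loops run over nothing useful, which is why a hand-written startup for it could drop them entirely.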

Gerhard
+2  A: 

What does 'size a.out' tell you about the size of the code, data, and bss segments? The majority of the code is likely to be the startup code (classically crt0.o on Unix machines), which is invoked by the o/s and does setup work (like sorting out command line arguments into argc, argv) before invoking main().

Jonathan Leffler