The title says it all. How does a C program get started?

+18  A: 

Ultimately, it is the operating system. There is usually a layer of startup code between the real entry point and the main function; it is inserted by the compiler.

Some details (specific to Windows): the PE file contains a header called IMAGE_OPTIONAL_HEADER, which has a field AddressOfEntryPoint. That field holds the address (relative to the image base) of the first byte of code in the file that will be executed.
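
To make the header lookup concrete, here is a minimal sketch in portable C that pulls AddressOfEntryPoint out of a PE image already loaded into a buffer. The function name `pe_entry_point` and the helper `read_le32` are made up for illustration. The offsets used (the `e_lfanew` field at 0x3C, a 4-byte signature, a 20-byte COFF file header, and the field sitting 16 bytes into the optional header) follow the published PE layout, but treat this as a rough reader, not a validating parser.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: extract AddressOfEntryPoint from a PE image held
 * in memory. Returns 0 and stores the value in *entry_rva on success,
 * or -1 if the buffer does not look like a PE file.
 */
int pe_entry_point(const unsigned char *image, size_t size, uint32_t *entry_rva)
{
    /* The DOS header starts with "MZ" and stores e_lfanew at offset 0x3C. */
    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return -1;

    uint32_t pe_offset = read_le32(image + 0x3C);

    /* Need: signature (4) + COFF header (20) + 16 bytes of optional
       header + the 4-byte field itself. */
    if (pe_offset > size || size - pe_offset < 4 + 20 + 16 + 4)
        return -1;
    if (memcmp(image + pe_offset, "PE\0\0", 4) != 0)
        return -1;

    *entry_rva = read_le32(image + pe_offset + 4 + 20 + 16);
    return 0;
}
```

To use it, read the whole .exe into memory and pass the buffer and its length. The value you get back is relative to the image base, not a file offset.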

Andrey
+22  A: 

The operating system calls the main() function. Actually, it usually calls something else with a strange name such as _init. The C compiler links a standard library into every application; that library provides this operating-system-defined entry point, which then calls main().

Edit: Obviously that was not detailed and correct enough for some people.

The Executable and Linkable Format (ELF), which many Unix-like OSes use, defines an entry point address. That is where the program begins to run after the OS finishes its exec() call. On a Linux system this is _init.

From objdump -d:

Disassembly of section .init:

08049f08 <_init>:
 8049f08:       55                      push   %ebp
 8049f09:       89 e5                   mov    %esp,%ebp
 8049f0b:       83 ec 08                sub    $0x8,%esp
 8049f0e:       e8 a1 05 00 00          call   804a4b4 <call_gmon_start>
 8049f13:       e8 f8 05 00 00          call   804a510 <frame_dummy>
 8049f18:       e8 d3 50 00 00          call   804eff0 <__do_global_ctors_aux>
 8049f1d:       c9                      leave  
 8049f1e:       c3                      ret    

From readelf -d:

 0x00000001 (NEEDED)                     Shared library: [libstdc++.so.6]
 0x00000001 (NEEDED)                     Shared library: [libm.so.6]
 0x00000001 (NEEDED)                     Shared library: [libgcc_s.so.1]
 0x00000001 (NEEDED)                     Shared library: [libpthread.so.0]
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 0x0000000c (INIT)                       0x8049f08
 0x0000000d (FINI)                       0x804f018
 0x00000004 (HASH)                       0x8048168
 0x00000005 (STRTAB)                     0x8048d8c
 0x00000006 (SYMTAB)                     0x804867c
 0x0000000a (STRSZ)                      3313 (bytes)
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000015 (DEBUG)                      0x0
 0x00000003 (PLTGOT)                     0x8059114
 0x00000002 (PLTRELSZ)                   688 (bytes)
 0x00000014 (PLTREL)                     REL
 0x00000017 (JMPREL)                     0x8049c58
 0x00000011 (REL)                        0x8049be0
 0x00000012 (RELSZ)                      120 (bytes)
 0x00000013 (RELENT)                     8 (bytes)
 0x6ffffffe (VERNEED)                    0x8049b60
 0x6fffffff (VERNEEDNUM)                 3
 0x6ffffff0 (VERSYM)                     0x8049a7e
 0x00000000 (NULL)                       0x0

You can see that INIT is equal to the address of _init.
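
If you want to check the entry point address that the ELF header itself records, it is the `e_entry` field, which starts at byte offset 24 in both the 32-bit and 64-bit header layouts. Below is a rough sketch in C that reads it from an image in memory. The function name `elf_entry_point` is invented for this example, and it only checks the magic bytes, class, and byte order.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: read e_entry from an ELF image held in memory.
 * Handles ELFCLASS32/ELFCLASS64 and both byte orders.
 * Returns 0 on success, -1 if the buffer does not look like ELF.
 */
int elf_entry_point(const unsigned char *image, size_t size, uint64_t *entry)
{
    if (size < 24 || image[0] != 0x7f || image[1] != 'E' ||
        image[2] != 'L' || image[3] != 'F')
        return -1;

    /* e_ident[4] is EI_CLASS: 1 = 32-bit, 2 = 64-bit. */
    if (image[4] != 1 && image[4] != 2)
        return -1;
    /* e_ident[5] is EI_DATA: 1 = little-endian, 2 = big-endian. */
    if (image[5] != 1 && image[5] != 2)
        return -1;

    size_t width = (image[4] == 2) ? 8 : 4;
    int big_endian = (image[5] == 2);
    if (size < 24 + width)
        return -1;

    /* Assemble the value starting from its most significant byte. */
    uint64_t value = 0;
    for (size_t i = 0; i < width; i++) {
        size_t idx = big_endian ? i : width - 1 - i;
        value = (value << 8) | image[24 + idx];
    }
    *entry = value;
    return 0;
}
```

`readelf -h` prints the same field as "Entry point address", so you can compare the two on a real binary.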

The code for frame_dummy and __do_global_ctors_aux lives in a set of files named crtbegin.o and crtend.o (and variants of those names). These are part of GCC. That code does various things a C program needs, such as setting up stdin, stdout, and global and static variables.
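
You can watch the startup code call into your own program before main() by using a compiler extension. The sketch below assumes GCC or Clang, since `__attribute__((constructor))` is not standard C. The names `ran_before_main`, `early_setup`, and `startup_status` are made up for the example. Exactly which startup routine ends up invoking the constructor depends on the toolchain version and platform.

```c
/* Set by the constructor below; any later code can inspect it. */
int ran_before_main = 0;

/*
 * GCC/Clang extension, not standard C: a function marked with the
 * constructor attribute is called by the startup code before main().
 */
__attribute__((constructor))
static void early_setup(void)
{
    ran_before_main = 1;
}

/* Hypothetical helper reporting whether early_setup() has already run. */
const char *startup_status(void)
{
    return ran_before_main ? "constructor already ran"
                           : "constructor has not run";
}
```

If you call `startup_status()` as the first statement of main(), you should see that the flag has already been set, even though nothing in main() touched it.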

I believe someone else's answer already described what Windows does.

Zan Lynx
`__start`, woo.
strager
It doesn't call `_init` or any other particular function. It calls the entry point address, which can be anywhere.
Andrey
+5  A: 

http://coding.derkeiler.com/Archive/C_CPP/comp.lang.c/2008-04/msg04617.html

I don't know why this was downvoted, the link has good information
Zack
@Zack Generally posting a link without some sort of summary here is frowned upon
Michael Mrozek
Good to know, thanks
Zack
+3  A: 

Note that in addition to the answers already posted, it is also possible for you to call main yourself. Generally this is a bad idea reserved for obfuscated code.

Brian
This isn't legal in C++, by the way - one more way in which C++ isn't a strict superset.
David Thornley
+2  A: 

The operating system calls main. There will be an address in the relocatable executable that points at the location of main (See the Unix ABI for more information).

But, who calls the operating system?

On the "RESET" signal (which is also asserted at power-on), the central processing unit begins fetching its instructions from some ROM at a given address (say, 0xffff).

Typically there will be some sort of jump instruction out to the BIOS, which gets the memory chips configured, the basic hard drive drivers loaded, etc, etc. Then the Boot Sector of the hard drive is read, and the next bootloader is started, which loads the file containing the basic information of how to read, say, an NTFS partition and how to read the kernel file itself. The kernel environment will be set up, the kernel loaded, and then - and then! - the kernel will be jumped to for execution.

After all that hard work has been done, the kernel can then proceed to load our software.

Paul Nathan
+3  A: 

The operating system calls a function included in the C runtime (CRT) and linked into your executable. Call this "CRT main."

CRT main does a few things. The two most important, at least in C++, are to run through an array of global C++ objects and call their constructors, and to call your main() function and give its return value to the shell.
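
As a rough illustration of that flow, and not what any real CRT actually contains, here is a toy version in plain C: walk an array of initializer functions, then call the user's main and hand back whatever it returns. Every name here (`crt_main_sketch`, `example_inits`, `example_main`, and so on) is hypothetical. A real runtime would pass the return value to exit() and would also deal with the environment, atexit handlers, and so on.

```c
#include <stddef.h>

typedef void (*init_fn)(void);
typedef int (*main_fn)(int argc, char **argv);

/*
 * Hypothetical stand-in for "CRT main": run each initializer in order,
 * then call the user's main and return its result to the caller.
 */
int crt_main_sketch(const init_fn *inits, size_t count,
                    main_fn user_main, int argc, char **argv)
{
    for (size_t i = 0; i < count; i++)
        inits[i]();
    return user_main(argc, argv);
}

/* Example data: two "constructors" and a stand-in for the user's main(). */
int initialized_objects = 0;
static void construct_a(void) { initialized_objects++; }
static void construct_b(void) { initialized_objects++; }
const init_fn example_inits[] = { construct_a, construct_b };

int example_main(int argc, char **argv)
{
    (void)argv;
    /* Both initializers have run by the time control reaches here. */
    return (initialized_objects == 2 && argc == 1) ? 0 : 1;
}
```

Calling `crt_main_sketch(example_inits, 2, example_main, 1, argv)` runs both initializers and then returns whatever `example_main` returns, which is the value the real startup code would report as the process exit status.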

The Visual C++ CRT main does a few more things, if memory serves. It configures the memory allocator, important if using the Debug CRT to help find memory leaks or bad accesses. It also calls main within a structured exception handler that catches bad memory access and other crashes and displays them.

Drew Hoskins