I don't quite understand the compiling process of the Linux kernel when I install a Linux system on my machine.

Here are some things that confused me:

  1. The kernel is written in C, so how did the kernel get compiled without a compiler installed?
  2. If the C compiler is installed on my machine before the kernel is compiled, how can the compiler itself get compiled without a compiler installed?

I have been confused about this for a couple of days. Thanks for any response.

+59  A: 

The first round of binaries for your Linux box were built on some other Linux box (probably).

The binaries for the first Linux system were built on some other platform.

The binaries for that computer can trace their roots back to an original system that was built on yet another platform.

...

Push this far enough, and you find compilers built with more primitive tools, which were in turn built on machines other than their host.

...

Keep pushing and you find computers built so that their instructions could be entered by setting switches on the front panel of the machine.

Very cool stuff.

The rule is "build the tools to build the tools to build the tools...". Very much like the tools which run our physical environment. Also known as "pulling yourself up by the bootstraps".

dmckee
+1 your answer has a flavor of nice
Ric Tokyo
Reminds me of the fact that the Gentoo Linux install uses a downloaded compiler to compile its own source, then lets the resulting compiler compile its own source *again*, as if the first result is somehow still "unclean" :-)
Wim Coenen
Not necessarily unclean. Just unoptimized. The first compiler will be optimized to work on a 386, but the recompiled version is optimized for whatever architecture you have.
Breton
GCC builds are pretty unclean. That's the reason it builds twice. The first build will include some reference to the build host, which is removed by self-hosting the build on the target.
Adam Hawes
@Adam The first build is generated by the existing compiler, which may be an old gcc or another compiler, and may include bugs unrelated to the source that you are compiling. The second build is generated entirely by the output of the first stage.
Ismael
You can add a third stage. If everything is OK, the second stage output should be equal to the output of the third stage.
Ismael
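The staged build described in the comments above can be sketched as a toy model. This is only an illustration, not how GCC is actually implemented: `Binary`, `build`, and the names `old-cc` and `new-cc` are all hypothetical. The model assumes that a compiled compiler's machine code depends only on its own source and on the code generator of whichever compiler built it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    """A compiled program in this toy model (hypothetical, for illustration)."""
    source: str   # the source code the binary was built from
    codegen: str  # the compiler source whose code generator emitted it


def build(compiler: Binary, source: str) -> Binary:
    # The output carries the fingerprint of the compiler that produced it:
    # its machine code was emitted by that compiler's code generator.
    return Binary(source=source, codegen=compiler.source)


# The pre-existing compiler on the build host (origin unknown).
host = Binary(source="old-cc", codegen="unknown")

stage1 = build(host, "new-cc")    # new compiler, emitted by the old one
stage2 = build(stage1, "new-cc")  # new compiler, emitted by itself
stage3 = build(stage2, "new-cc")  # should reproduce stage2 exactly

print(stage1 == stage2)  # stage 1 still reflects the host compiler
print(stage2 == stage3)  # the sanity check from the comment above
```

In this model, stage 1 differs from stage 2 because it was emitted by the host compiler, while stages 2 and 3 match because both were emitted by the new compiler's own code generator.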
It's not just software, it's hardware too. There is no way anything like a P4 (or even a 486) could be built without a computer.
BCS
@BCS: Oh, yes. We've reached the point where our software and hardware tools are deeply interlinked and interdependent.
dmckee
There are people who (as a hobby!) try to do the hand-tools-to-machine-shop boot: http://www.lindsaybks.com/dgjp/djgbk/series/index.html. Now *that* has the hacker nature. In spades.
dmckee
I really like the fact that this post has 42 upvotes :)
Camilo Díaz
You write really well, dmckee.
+5  A: 

The kernel doesn't compile itself -- it's compiled by a C compiler in userspace. In most CPU architectures, the CPU has a number of bits in special registers that represent what privileges the code currently running has. In x86, these are the current privilege level bits (CPL) in the code segment (CS) register. If the CPL bits are 00, the code is said to be running in security ring 0, also known as kernel mode. If the CPL bits are 11, the code is said to be running in security ring 3, also known as user mode. The other two combinations, 01 and 10 (security rings 1 and 2 respectively) are seldom used.
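As a small sketch of the bit layout described above, the snippet below reads the two CPL bits out of a CS selector value. The function names and the sample selector values are made up for illustration; real code would obtain the selector from the CPU rather than from a literal.

```python
# The low two bits of the selector held in CS encode the current
# privilege level (CPL), as described in the paragraph above.
RING_NAMES = {0: "kernel mode", 3: "user mode"}


def current_privilege_level(cs_selector: int) -> int:
    """Return the CPL (0-3) stored in the low two bits of a CS selector."""
    return cs_selector & 0b11


def describe_ring(cs_selector: int) -> str:
    cpl = current_privilege_level(cs_selector)
    return f"ring {cpl} ({RING_NAMES.get(cpl, 'seldom used')})"


# Sample selector values, chosen only to exercise each case.
for selector in (0x10, 0x33, 0x09):
    print(hex(selector), "->", describe_ring(selector))
```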

The rules about what code can and can't do in user mode versus kernel mode are rather complicated, but suffice to say, user mode has severely reduced privileges.

Now, when people talk about the kernel of an operating system, they're referring to the portions of the OS's code that get to run in kernel mode with elevated privileges. Generally, the kernel authors try to keep the kernel as small as possible for security reasons, so that code which doesn't need extra privileges doesn't have them.

The C compiler is one example of such a program -- it doesn't need the extra privileges offered by kernel mode, so it runs in user mode, like most other programs.

In the case of Linux, the kernel exists in two forms: the source code of the kernel, and the compiled executable of the kernel. Any machine with a C compiler can compile the kernel from the source code into the binary image. The question, then, is what to do with that binary image.

When you install Linux on a new system, you're installing a precompiled binary image, usually from either physical media (such as a CD or DVD) or from the network. The BIOS will load the (binary image of the) kernel's bootloader from the media or network, and then the bootloader will install the (binary image of the) kernel onto your hard disk. Then, when you reboot, the BIOS loads the kernel's bootloader from your hard disk, and the bootloader loads the kernel into memory, and you're off and running.

If you want to recompile your own kernel, that's a little trickier, but it can be done.

Adam Rosenfield
+2  A: 

Which one was there first, the chicken or the egg?

Eggs have been around since the time of the dinosaurs.

Some confuse everything by saying chickens are actually descendants of the great beasts. Long story short: the technology (egg) existed before the current product (chicken).

You need a kernel to build a kernel, i.e. you build one with the other.

The first kernel can be anything you want *(preferably something sensible that can create your desired end product ^__^)*

This tutorial from Bran's Kernel Development teaches you to develop and build a smallish kernel which you can then test with a Virtual Machine of your choice.

Meaning: you write and compile a kernel someplace, and load it on an empty (no OS) virtual machine.

What happens with those Linux installs follows the same idea with added complexity.

Ric Tokyo
+5  A: 

The term describing this phenomenon is bootstrapping. It's an interesting concept to read up on. If you think about embedded development, it becomes clear that a lot of devices that require software (say, alarm clocks, microwaves, and remote controls) aren't powerful enough to compile their own software. In fact, these sorts of devices typically don't have enough resources to run anything remotely as complicated as a compiler.

Their software is developed on a desktop machine and then copied once it's been compiled.

If this sort of thing interests you, an article that comes to mind off the top of my head is "Reflections on Trusting Trust" (pdf). It's a classic and a fun read.

+6  A: 

I think you should distinguish between:

compile, v: To use a compiler to process source code and produce executable code [1].

and

install, v: To connect, set up or prepare something for use [2].

Compilation produces binary executables from source code. Installation merely puts those binary executables in the right place to run them later. So, installation and use do not require compilation if the binaries are available. Think of “compile” and “install” as you would “cook” and “serve”, respectively.
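The distinction can be sketched in a few lines. This is only an analogy: it uses Python bytecode as a stand-in for a kernel binary, and `compile_step`, `install_step`, and `demo` are hypothetical names. The point is that the install step is just a copy and never invokes the compiler.

```python
import py_compile
import shutil
import tempfile
from pathlib import Path


def compile_step(source: Path, build_dir: Path) -> Path:
    """Compile: turn source code into a binary artifact (needs a compiler)."""
    build_dir.mkdir(parents=True, exist_ok=True)
    artifact = build_dir / (source.stem + ".pyc")
    py_compile.compile(str(source), cfile=str(artifact), doraise=True)
    return artifact


def install_step(artifact: Path, install_dir: Path) -> Path:
    """Install: copy an already-built artifact into place (no compiler used)."""
    install_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(artifact, install_dir))


def demo() -> bool:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "hello.py"
        source.write_text('print("hello")\n')
        # "Someone else" compiles once...
        artifact = compile_step(source, root / "build")
        # ...and the target only ever receives the finished binary.
        installed = install_step(artifact, root / "target")
        return installed.read_bytes() == artifact.read_bytes()


if __name__ == "__main__":
    print(demo())
```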

Now, your questions:

  1. The kernel is written in C, however how did the kernel get compiled without a compiler installed?

The kernel cannot be compiled without a compiler, but it can be installed from a compiled binary.

Usually, when you install an operating system, you install a precompiled kernel (binary executable). It was compiled by someone else. You need the source, the compiler, and all the other tools only if you want to compile the kernel yourself.

Even in “source-based” distributions like Gentoo, you start by running a compiled binary.

So, you can live your entire life without compiling kernels, because you have them compiled by someone else.

  2. If the C compiler is installed on my machine before the kernel is compiled, how can the compiler itself get compiled without a compiler installed?

The compiler cannot run if there is no kernel (OS). So you have to install a compiled kernel to run the compiler, but you do not need to compile the kernel yourself.

Again, the most common practice is to install compiled binaries of the compiler, and use them to compile anything else (including the compiler itself and the kernel).

Now, the chicken-and-egg problem. The first binary is compiled by someone else... See the excellent answer by dmckee.

jetxee