After a long discussion in this thread with dwelch, my doubt has narrowed to one exact point.

"If you are wondering how the processor gets ready to execute the first instruction. its the way the logic is designed" How is the logic designed? Can you give me more detail on this? Is there any document that explains it? Is the algorithm the same for all architectures?

A: 

It depends on the CPU on the board - and what's in the memory (ROM or one of the variants of EPROM).

One scenario is that the processor executes the code at address 0 - so the hardware has to ensure that there is some valid code for the processor to execute at that address, such as JMP os_start to jump to the start of the real operating system code.

Another scenario is that the processor treats the startup as if a specific interrupt has occurred, and looks for the interrupt handler at a well known address that isn't zero. Again, the hardware has to ensure that there is some valid code for the processor to execute at that address.

In both cases, the code jumped to is typically a boot-strap loader - a very small program that copies some key data around, or sets up some key values, and then loads a bigger program that does more work.
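The difference between the two scenarios can be sketched as a small host-side model in C. This is only an illustration: the names reset_style, read_word and first_instruction_address are made up for this sketch, memory is a plain byte array, and the second scenario is assumed to store a little-endian pointer to the handler at the well-known address (some processors instead place code there directly).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the two start-up scenarios described above. */
enum reset_style {
    START_AT_FIXED_ADDRESS,  /* execute whatever code sits at the address */
    FETCH_RESET_VECTOR       /* the address holds a pointer to the handler */
};

/* Read a little-endian 32-bit word from the simulated memory image. */
static uint32_t read_word(const uint8_t *mem, size_t size, uint32_t addr)
{
    assert((size_t)addr + 4 <= size);
    return (uint32_t)mem[addr]
         | (uint32_t)mem[addr + 1] << 8
         | (uint32_t)mem[addr + 2] << 16
         | (uint32_t)mem[addr + 3] << 24;
}

/* Return the address of the first instruction the modelled CPU executes. */
uint32_t first_instruction_address(enum reset_style style,
                                   const uint8_t *mem, size_t size,
                                   uint32_t reset_location)
{
    if (style == START_AT_FIXED_ADDRESS)
        return reset_location;                   /* e.g. a JMP placed at 0 */
    return read_word(mem, size, reset_location); /* follow the vector */
}
```

Either way, the hardware designer's job is the same: something valid has to be readable at reset_location the moment reset is released.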

Jonathan Leffler
+1  A: 

It is not possible to say in general how a system (i.e. CPU) will start up, since that is implementation specific and varies between CPUs. However, it usually looks something like this:

Once the system powers on, the CPU uses the interrupt vector table to jump to the reset code. The CPU needs to know where that vector table is to be found, so its location is usually fixed (though there are CPUs where you can change that). Assuming you're on an embedded CPU that has its code in internal flash memory, this usually means that the reset vector code is a simple jump instruction to where the actual startup code is. If your application is written in assembler, then that's probably your application itself and you're done.

If you have an application that is written in a higher-level language like C, then the C environment has to be set up. That is, various memory segments need to be set up: .bss needs to be zeroed, .data needs to be initialized with the variables' initial values, and probably more. The system most likely also needs to do some hardware setup, especially the clock. This part of the startup is usually written in assembler (usually "crt0.S"; at least parts of this kind of startup are also required for starting a C application on a PC). After that the C environment is set up and the main application can start by calling the main function. However, in a freestanding environment (which an embedded system usually is, unless you're running embedded Linux on it or something like that) there are some important differences from hosted environments. The most important one here is that a freestanding environment does not need to have a function main, nor does it need to have the usual prototype int main(int argc, char* argv[]). How exactly it looks in your system is implementation specific, but an entry function for the main program of the form void main(void) is common.

bluebrother
Thanks. Please refer to my comment on the previous answer by dwelch.
Renjith G
+4  A: 

The address in the processor's memory space where it starts executing is described in the chip/processor documentation. The process of starting execution at that address is the same for that chip no matter what the size or purpose of the board it is soldered to: embedded, single board computer, laptop motherboard, desktop motherboard, toaster, etc.

The schematic for the board will show where power-on reset goes or comes from. If power-on reset is managed by a programmable device, you may not have access to all the gory details needed to figure out what is going on, but for an embedded development board the vendor should have provided enough timing information for you to do your job. Some devices, like flash, may require a minimum period of time between power reaching some percentage of full power (say 75% of 3.3 V) and the first read. Some devices require X clock cycles on the clock input before reset is released, that sort of thing. FPGAs and similar devices that load their hardware designs from a PROM of some sort need a clock, a power-on reset, a period of time to load the design, etc. That FPGA may be the memory controller for the processor, for example, so before you can boot the processor you may need the FPGA up and running to route your flash or RAM requests; you might go so far as to have the FPGA initialize DRAM so that it looks like SRAM to the processor. The processor reset would have to be held off until all of that happens. Some voltage regulators or other devices have a power-good output that indicates power is regulated to the desired voltage and ready to use. A device that manages the reset may wait for that power-good signal before it releases reset on the processor. Where I am headed with all of this is that a number of things have to happen before a processor can boot.
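The hold-off described above can be sketched as a single decision function. This is a host-side illustration only: board_status, reset_requirements and reset_may_be_released are hypothetical names, and the actual conditions and minimums come from the datasheets of the parts on a given board.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the conditions a reset supervisor watches. */
struct board_status {
    bool     power_good;       /* regulator reports the rail is in regulation */
    bool     fpga_configured;  /* FPGA has finished loading its design */
    uint32_t clocks_seen;      /* clock cycles counted since power-up */
    uint32_t ms_since_power;   /* time since the rail crossed its threshold */
};

/* Hypothetical minimums; real values come from the device datasheets. */
struct reset_requirements {
    uint32_t min_clocks;       /* cycles required before reset is released */
    uint32_t min_flash_ms;     /* delay before the flash accepts a read */
};

/* Return true only when every precondition for booting is satisfied. */
bool reset_may_be_released(const struct board_status *s,
                           const struct reset_requirements *req)
{
    assert(s != NULL && req != NULL);
    return s->power_good
        && s->fpga_configured
        && s->clocks_seen >= req->min_clocks
        && s->ms_since_power >= req->min_flash_ms;
}
```

On a real board this logic usually lives in a supervisor chip, a CPLD, or a few discrete parts rather than in software; the function only shows which conditions get ANDed together.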

Once the chip/processor's reset is released, it is as described in the other answers.

Depending on the processor, there may be strap options on the pins of the processor that describe the boot configuration. These can affect the starting address for execution and would be described in the processor's documentation. The processor might always boot from a known address, or from a table of addresses (vector table, exception table, etc.; it goes by many names) indexed by the event that happened: reset, interrupt, wakeup from low power mode/sleep, etc. The processor/chip documentation will describe the bootup process and/or how you figure out what or where the first instructions are that execute.

Once the processor finally starts to execute instructions, then depending on the system you might need to enable a USB interface so a host can enumerate the device, or you may need to enable a PCI interface quickly so that a host can enumerate the device/board. If you have DRAM you likely have to initialize a DRAM controller, on chip or off, somewhere between the processor and the DRAM. If that DRAM or other memory has some sort of error detection or correction, you likely have to initialize that memory in order to initialize the ECC tags. That RAM has to be up and running before you can use it for your stack or to initialize .bss and .data. Some devices boot with peripherals inside the chip disabled (RS-232 ports, USB, etc.), and depending on your application you may wish to just turn them on in the boot code before main() is called. There may be LEDs on the board indicating certain things; sometimes code is there for that.

If your compiler generates .data or .bss segments (which are a software thing, by the way; the hardware has no notion of them), most programmers would prefer that those sections of memory are initialized (no doubt this is specified in a standard for the language). For .bss that means zeroing it out; for .data that means filling it in with initial values. This usually applies to C or other high-level languages; in assembler you might be managing this yourself. The linker may not care whether the objects being linked were C or assembler, and may create .data, .bss, and other sections of memory independent of the source language.

for example:

const int a=27;
int b=28;
int c;

The variable a would be placed in read-only memory alongside the executable code (the .text segment, or a separate read-only section such as .rodata, depending on the toolchain). The variable b would be placed in .data because before main() is called you as a programmer would expect that memory location to contain that value. And c would be in the .bss segment because as a programmer you would hope/expect that memory location to be zero before main() is called.
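What the startup code does for .data and .bss can be sketched in C as a host-side model. In a real crt0 the addresses come from symbols defined in the linker script and the loops are often written in assembler; here image_layout and prepare_c_environment are hypothetical names and ordinary arrays stand in for flash and RAM.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of what a linker script would normally export. */
struct image_layout {
    const unsigned char *data_load;   /* initial values stored in flash/ROM */
    unsigned char       *data_start;  /* where .data lives in RAM */
    size_t               data_size;
    unsigned char       *bss_start;   /* where .bss lives in RAM */
    size_t               bss_size;
};

/* Do what the C program expects to have happened before main() runs. */
void prepare_c_environment(const struct image_layout *img)
{
    assert(img != NULL);
    /* .data: copy the initializers (such as b = 28) from ROM into RAM. */
    memcpy(img->data_start, img->data_load, img->data_size);
    /* .bss: variables without an initializer (such as c) start at zero. */
    memset(img->bss_start, 0, img->bss_size);
}
```

Note that on real hardware this has to run from code that does not itself depend on .data or .bss being ready, which is one reason it is usually kept in a small assembler file.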

The startup code also initializes the stack pointer or pointers as the case may be, it may enable the cache and do various other things.

A lot of things CAN happen between reset and main(), and it seems like we assume that is the root of your question.

Since you mention embedded, you may wish to take over the startup code. The compiler tools will usually have startup code for the target processor, but it is unlikely to match your embedded processor's memory map, at least when using a generic compiler like gcc. A PIC C compiler, a Rabbit Semiconductor compiler, or some other compiler that targets a single family of chips requires you to specify the chip when compiling; that tells the compiler the memory map so that it can manage initializing these areas. Gcc/binutils supports linker scripts and has many pre-built startup modules for various processors. Depending on the nature of the target processor and the individual who created that open source code, you might be able to manipulate some of these things without having to write your own startup code or modify the existing one. You may wish to write your own anyway, as it might be easier to modify/write than to figure out all the knobs you have to turn to get the generic one to do what you want.

You may choose to make your startup code simpler and create programming rules for the project or at least the bootloader, that all variables must be initialized before use.

for example instead of:

const int a=27;
int b=28;
int c;

something like this:

const int a=27;
int b;
int c;
void embedded_main ( void )
{
  b = 28;
  c = 0;
...

}

If you do this, the startup code no longer has to zero .bss or copy .data from ROM to RAM before calling embedded_main(). Note that some compilers add extra (sometimes unused) code to the binary if they see a function named main(). If you understand what the compiler is doing, what that code is, and whether you need it, you can rename main() to anything else to avoid bloating your binary and consuming flash/ROM.

Not initializing the variable c in the above before using it (assuming it to be zero) is bad form and some compilers warn you about using a possibly uninitialized variable. So you should be initializing it anyway.

You can get a pretty good performance gain on boot by not preparing .bss and .data. If you are a PCI target, for example, you might need every millisecond you can get to be up and running before the host comes around to enumerate. You don't save any PROM by initializing variables at runtime: the .data segment is replaced by code in the .text segment, and depending on the processor it probably consumes more flash to do it this way. So you may gain performance and portability of the startup code, and relief from a lot of headaches in that startup code and the linker scripts, but if you are not the developer of the high-level code you call, you may get grief from those developers. It is a trade-off and depends heavily on the system and what you plan to do with it.

My ARM startup code often looks like this, for example:

.globl _start
_start:
    b   reset
    b   hang
    b   hang
    b   hang
    b   hang
    b   hang
    b   hang
    b   hang
    b   hang
    b   hang
    b   hang
    b   hang
    b   hang
    b   hang
    b   hang
    b   hang
    b   hang

reset:
    ldr sp,=0x2000C000
    bl notmain
hang: b hang

For the Cortex-M3, where the initial stack pointer value is stored at address zero, immediately before the exception vectors, you could do something like this:

.cpu cortex-m3
.thumb

.word   0x40080000  
.word   _start      
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        
.word   hang        


.thumb_func
hang:   b .

.thumb_func
.global _start
_start:
    bl notmain
    b hang
.end

Look in the source for newlib or glibc for files named crt0.S or Start.S to find boot code for the various processors supported (usually in assembler, and as a result there will be a file for each processor type supported by that compiler/library).

dwelch
Dear dwelch, excellent answer! But I have a few more points to clarify. When a processor is switched on, suppose it starts execution from 0x0 (processor manufacturer specific), goes through the interrupt vector table, and jumps to the startup code (boot loader). It then does some bare-board initialization and after that calls main (which can be os_main or app_main depending on the system).
Renjith G
And who configures the PC and the processor to start operating for the first time?
Renjith G
But my doubt is about the first part: after switch-on the processor enters at 0x0 (processor specific) and starts from there. Who configures the PC and the processor to start operating for the first time from 0x0 or any other address? And don't we need to make the processor ready to act as a processor (to fetch, execute, and update the ALU), i.e. the processor-specific initialization (ALU and other registers)? Please correct me if I am wrong.
Renjith G
How does the flash get programmed the first time? Is that the question? Sometimes the chip is in a socket, and you can remove it and put it in a fixture to write the program into it. Sometimes you use JTAG or another chip-specific interface to keep the processor from executing while programming it.
dwelch
If you are wondering how the processor gets ready to execute the first instruction, it's the way the logic is designed: the reset line keeps the processor from executing. Reset reaches into all the parts of the chip that need to be initialized before it can start to execute. Looking at the documentation for the processor, you will find that it says what the value of each register is at reset. The registers have logic that says: if reset is asserted, then set to this value, else operate based on the instructions being executed.
dwelch
Do you have a particular board and processor in mind?
dwelch
Thanks. This is exactly my question: "If you are wondering how the processor gets ready to execute the first instruction. its the way the logic is designed" How is the logic designed? Can you give me more detail on this? Is there any document that explains it? Is the algorithm the same for all architectures?
Renjith G
It is no different from software, in the sense that each individual engineer or company has a way they do things. If you and I were asked to write a program to meet a particular specification, our two solutions might be dramatically different but would perform the same task. It is the same with an Intel chip vs. a Motorola chip vs. a Texas Instruments chip. Also like software, there are design rules, and there are things that could work but that most people avoid. Gotos, for example, are discouraged in C; there are similar things in hardware.
dwelch
Log on to opencores.org and look at some of the processor projects. For MCPU, for example, the hardware logic design fits on a printed page. I looked at the openMSP430 not long ago as well. What you will see, if nothing else, is that all of the projects in that category are processors, yet their designs vary widely in implementation. But things like SRAM interfaces are going to look the same, because SRAMs are somewhat standardized: output enable, write enable, address, data, etc.
dwelch
A: 

Most CPUs will start from a reset vector (0x0, 0x100, 0x80000000, 0xfffff100, 0xfffffffc) as soon as the reset is released. It depends on the CPU, but usually the first thing is to get the memory up and running (MMU, SDRAM, ECC and bus controllers). Then you could do some additional hardware setup like clocking, power management, watchdogs or whatever, but most of this can be done once you have reached your OS or main application. Next comes the runtime setup that copies your program from flash to RAM. After that you should be ready to run your application. On smaller platforms you may only need to copy your initialized data before starting.

Gerhard
Thanks. Please see my latest comment on the previous answer by dwelch for my exact doubt.
Renjith G
+2  A: 

"If you are wondering how the processor gets ready to execute the first instruction. its the way the logic is designed" - HOW THE LOGIC IS DESIGNED? Can you give me more idea on this? Is there any document give me more details? Is the algorithm behind all architecture are same?

conceptually, the Program Counter (PC) may be thought of as having increment logic that is tied to the instruction cycle. it may be thought of as the following logic, which is executed at the end of each instruction cycle.

if (RESET)         //reset pin input is asserted
    PC = 0x0000;   // or any other predetermined value
else
    PC = PC + length(currentInstruction);

for ease of understanding, think of a processor which executes 1 instruction in each clock cycle. in addition, consider all instructions to be of the same size (4 bytes, as in ARM).

then the logic becomes even simpler

if (RESET)
    PC = 0x0000;
else
    PC = PC + 4;

now this logical decision is simple enough to be implemented as a digital logic circuit using gates, with inputs RESET and PC and output PC, clocked by the processor clock. as long as the RESET signal is asserted, there could be additional logic that disables the execution pipeline and the fetch circuit.

while the RESET is asserted, the PC is loaded with the restart address.

now the responsibility to assert and release the RESET signal lies with the reset circuitry which may choose to do so only after it receives the POWERGOOD signal and/or CLOCKSTABLE signal.

the clockgenerator outputs the CLOCKSTABLE after the clock signal is stable and usable.

the POWERGOOD signal is asserted by the power circuitry after the voltage has stabilized.

all these signals may not be present or used in a particular platform. the RESET signal is usually found on all processors.

once the RESET signal is released, the execution pipeline is enabled and the execution logic initiates a FETCH from that address.
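the behaviour described so far can be modelled in a few lines of C. this is a software sketch of the hardware logic, not a description of any real processor: cpu_model and cpu_clock are made-up names, the restart address 0x0000 and the 4-byte instruction size are the same simplifying assumptions as above, and RESET is assumed to be held until both POWERGOOD and CLOCKSTABLE are present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RESET_ADDRESS     0x0000u  /* assumed restart address */
#define INSTRUCTION_BYTES 4u       /* assumed fixed instruction size */

/* Hypothetical model of the PC register and the fetch enable. */
struct cpu_model {
    uint32_t pc;
    bool     fetch_enabled;
};

/* Model one clock edge. */
void cpu_clock(struct cpu_model *cpu, bool power_good, bool clock_stable)
{
    assert(cpu != NULL);
    /* Reset circuitry: hold RESET until power and clock are both usable. */
    bool reset = !(power_good && clock_stable);

    if (reset) {
        cpu->pc = RESET_ADDRESS;      /* PC is loaded with the restart address */
        cpu->fetch_enabled = false;   /* pipeline and fetch are disabled */
    } else if (!cpu->fetch_enabled) {
        cpu->fetch_enabled = true;    /* first cycle out of reset: fetch at PC */
    } else {
        cpu->pc += INSTRUCTION_BYTES; /* normal sequential execution */
    }
}
```

whatever garbage the PC holds at power-up does not matter, because the reset branch overwrites it on every clock until RESET is released.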

books on digital logic design will usually have this kind of info, along with examples that show how to design an ALU. the PC incrementing algorithm behind each architecture will be similar (in concept) but has to take care of the peculiarities of the processor design.

if the processor supports two different restart addresses depending upon a jumper setting, this can usually be achieved by having an additional pin on the processor which is connected to a jumper that may pull it to logic 1 or 0.

now the PC increment logic becomes,

if (RESET)
    if(RESETADDRESSJUMPER)
         PC = 0x0000;
    else
         PC = 0xfff0;
else
    PC = PC + 4;

similarly, it is possible to define any complex logic to reset the processor required by the processor architecture.

alvin