tags:
views: 365
answers: 7

So, at a SUPER low level, what does an IF() look like, and how is it handled by an x86 processor?

+13  A: 

The processor has "branch if" instructions: when a certain condition is met it branches, and otherwise it continues on to the next instruction.

So

if(A)
{
    dosomething;
}

would become

load A into register 0
if the zero flag is set (i.e., register 0 contains 0x00) then jump to endcondition
dosomething
endcondition:
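
As a concrete (and purely illustrative) x86 version of that pseudocode, assuming A is a global int and using AT&T syntax as gcc emits it, the compiler might produce something like:

    movl    A, %eax          # load A into a register
    testl   %eax, %eax       # sets the zero flag if %eax contains 0
    je      .Lendcondition   # jump over the body when A is zero
    # ... dosomething ...
.Lendcondition: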

More complex conditions ( if(A || B && C) ) become a sequence of instructions that leaves a register in a zero or non-zero state, so the branch-if instruction can jump or not jump based on the conditional flags.
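
One common way a compiler might lower that example, if(A || B && C), is as a chain of short-circuit tests and conditional branches; a hypothetical sketch (AT&T syntax, with A, B and C as global ints) could look like:

    movl    A, %eax
    testl   %eax, %eax
    jne     .Lbody           # A is non-zero: the whole condition is already true
    movl    B, %eax
    testl   %eax, %eax
    je      .Lend            # A is zero and B is zero: condition is false
    movl    C, %eax
    testl   %eax, %eax
    je      .Lend            # B is non-zero but C is zero: condition is false
.Lbody:
    # ... body of the if ...
.Lend: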

There are many conditional flags (zero, carry, negative, overflow, etc.), and some branch-if instructions also operate on more complex conditions (i.e., an instruction might actually check whether one register is equal to another, rather than simply looking at flags). Each architecture is different and makes trade-offs so the instruction set is complete, but also speedy and compact.
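
On x86 the comparison and the branch are split: cmpl sets several of those flags at once, and the conditional jump consults just the ones it needs. A hypothetical fragment (AT&T syntax, comparing two registers as signed integers):

    cmpl    %ebx, %eax       # computes %eax - %ebx, sets the zero, sign, carry and overflow flags, discards the result
    jg      .Lgreater        # signed "jump if greater": taken only when ZF is clear and SF equals OF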

As moocha points out in the comments, some architectures allow you to apply a conditional to some, many, or even all instructions, so you might not only have 'branch if' instructions, but also 'and if', 'add if', 'move if' etc.

The x86 is very, very, very complex beyond this simple explanation once you get into pipelining, out-of-order execution, caching, microcode, and all the other advanced topics. For most purposes the above explanation is sufficient. If you're writing a hand-crafted, very tightly wound algorithm, though, you'll have to take these things into account for maximum performance and throughput.

That's a topic for another question though...

Adam Davis
This does not necessarily hold true. The first counter-example that comes to mind is the ARM architecture, where you'd typically use a condition code in front of the instruction instead of a branch.
Mihai Limbășan
That's a good point - some architectures allow you to set a condition on many, if not most, instructions. Wastes a lot of space in code, though, since most of your instructions will be running without condition.
Adam Davis
Never mind, the poster clarified the question :)
Mihai Limbășan
Now add a discussion of pipelining and caching, and you've got the perfect answer.
Michael Myers
True, you can definitely go overboard with the 4b prefix :) I was picking nits, of course.
Mihai Limbășan
It's pretty close to perfect as it is now, I'd say :)
Mihai Limbășan
A: 

Basically, you have a bunch of electrons being passed around among various atoms inside of your CPU. Because of the structure of the silicon atoms in your CPU, the electrons follow certain paths, which determine the branch of execution the computer will follow.

EDIT: It seems I should explain a little less vaguely. Bear with me, I majored in computer science, not electrical engineering, so I don't have a very deep understanding of these things:

Your CPU is made of a material, usually silicon, which is called a "semiconductor". One of the great things about semiconductors is that their electrical properties can be easily varied through "doping", or applying impurities that create areas of negative or positive "charge carriers" on the material. The lines where these areas come together are known as "junctions", and electricity flows much more easily one way across these junctions than the other. This property is exploited to create diodes, which allow electricity to flow only in one direction, and transistors, which can be thought of as tiny switches that allow one electrical current to control another electrical current. These transistors and diodes are combined in a myriad of ways to create the logic gates of your CPU.

Many of the logic gates inside of your CPU are dedicated to being the "control unit", which is in charge of retrieving and decoding instructions, telling the rest of the CPU what to do, and finally getting the next instruction. On the x86, the control unit is actually running "microcode" which tells it how to deal with branching, pipelining, and so on. You really need to be incredibly specific about a particular line of processors to get into how the x86 ISA is implemented on a particular microarchitecture.

Adam Jaskiewicz
But it seems to be pretty empty of substance (by virtue of being vague): "By virtue of the structure of the silicon atoms ..." ????
Charles Bretana
I agree, I had voted it +1.
ashawley
+1 because it is technically correct; however, this is a bit "lower level" than I was looking for.
Unkwntech
I'm not sure I'd call it technically correct; transistors are made of very slightly impure silicon, and it's the combination of the silicon and impurities that cause the electrons to flow as they do. It's also hopelessly vague and gleefully jumps many layers of abstraction at one bound.
David Thornley
I've made it less vague. It still gleefully jumps many layers of abstraction, but it uses two or three bounds, now.
Adam Jaskiewicz
+2  A: 

It's a branch instruction, dependent on the specific machine architecture. The compiler figures out how to set up a memory location or register to test for a specific low-level condition -- like branch-if-not-equal or branch-if-not-zero -- does that test, then jumps (or doesn't, if the condition fails) to another part of memory. Obviously, if you have a complex condition, the code may need to evaluate many different sub-conditions and may involve several branch instructions.

tvanfosson
+1  A: 

Generally the CPU has what is called a program counter (the instruction pointer, on x86), which holds the memory address of the next machine language opcode to be executed... and numerous other registers to hold data.

Generally, after the CPU executes each opcode, it simply advances the program counter past that instruction to the next position in memory, which should hold the next opcode in the compiled program.

One opcode (actually there are several), however, allows the CPU to "branch" by "comparing" the values in two other CPU registers: if one is greater than the other, it copies one memory address into the program counter, whereas if the other is larger, it copies a second, different memory address into the program counter.

That's about as "low" level as it can be put without talking about relays and transistors...

Charles Bretana
+5  A: 

It's fairly easy to look at the output of a C compiler (use the -S switch on gcc) to see what a given snippet of C generates when compiled. Be careful when using optimisation on toy programs, though: the optimiser will often optimise away conditionals that always go one way or the other (see this article on microbenchmarks for a more detailed explanation).

For example, a trivial C program:

#include <stdio.h>

int main (int argc, char **argv) {
    int ii = 10;
    int jj = 20;
    if (jj > ii) {
        puts ("jj > ii \n");
    }
    return 0;
}

compiles to the following assembly language:

    .file "foo.c"
    .section .rodata
.LC0:
    .string "jj > ii \n"
    .text
.globl main
    .type main, @function
main:
    leal 4(%esp), %ecx
    andl $-16, %esp
    pushl -4(%ecx)
    pushl %ebp
    movl %esp, %ebp
    pushl %ecx
    subl $20, %esp
    movl $10, -8(%ebp)
    movl $20, -12(%ebp)
    movl -12(%ebp), %eax
    cmpl -8(%ebp), %eax
    jle .L2
    movl $.LC0, (%esp)
    call puts
.L2:
    movl $0, %eax
    addl $20, %esp
    popl %ecx
    popl %ebp
    leal -4(%ecx), %esp
    ret
    .size main, .-main
    .ident "GCC: (Ubuntu 4.3.2-1ubuntu12) 4.3.2"
    .section .note.GNU-stack,"",@progbits

For a brief dissection of what's going on:

  • The first section (.rodata) declares a constant with the string 'jj > ii \n'.

  • The second section is initialising the contents of the ii and jj variables on the stack.

  • The bit from cmpl -8(%ebp), %eax is doing the actual comparison; the jle instruction is skipping over the call to 'puts', which is effectively the logic of the 'if' statement reversed.

  • After the label '.L2' the system is tidying up the top of the stack and returning from the call.

ConcernedOfTunbridgeWells
A: 

Here's a pretty good overview of how such structures might compile on the x86 architecture: http://en.wikibooks.org/wiki/X86_Disassembly/Branches#If-Then

There are ways to sometimes avoid a branch (which often has strongly negative performance implications, since a mispredicted branch breaks the pipeline). For example, the i686 instruction set onwards (everything from the Pentium Pro to the current day) has a conditional move instruction, so a compiler might turn this:

if (a==0) {
    b= 1;
}

to something like this:

mov    ecx, 1      ; the value we want b to get
cmp    eax, 0      ; assumes a is in eax; sets the zero flag when a == 0
cmovz  ebx, ecx    ; copy into b only when the zero flag is set (cmov cannot take an immediate)

with no branch, as long as your compiler is set up to target i686+ (and it feels like it; compilers are complex and inscrutable). SET[condition] is another, similar, conditional instruction.
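
As a hypothetical illustration of SETcc (AT&T syntax, assuming a is in %eax and the result b ends up in %ebx), a compiler could turn b = (a == 0); into:

    cmpl    $0, %eax         # compare a with zero
    sete    %bl              # %bl = 1 if the comparison was equal, 0 otherwise
    movzbl  %bl, %ebx        # zero-extend the byte so b holds a full 32-bit 0 or 1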

Lucky old ARM programmers get to specify any instruction conditionally, which cuts down on branches a lot.

bobince
A: 

Although the majority of if statements will end up as conditional branches, for very simple cases where neither branch has any side effects an optimising compiler may generate code which evaluates both and computes the result arithmetically, rather than executing only one. This can be a win on pipelined architectures, where the average cost of computing both branches is lower than the average cost of a branch once prediction misses are taken into account.

For example, the code:

int x;

if ( y < 5 )
  x = 5;
else
  x = y;

can be compiled as if it were written:

y -= 5;
int r = y < 0; // r is 1 if y < 5, 0 otherwise
r -= 1;        // r is 0x00000000 if y < 5, 0xffffffff otherwise
x = y & r;     // x is 0 if y < 5, (y-5) otherwise
x += 5;        // x is 5 if y < 5, y otherwise

which can be converted to machine code without any branches.
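
A hypothetical x86 rendering of that branch-free sequence (AT&T syntax, with y arriving in %eax and x left in %eax) could be:

    subl    $5, %eax         # y -= 5
    movl    %eax, %edx
    sarl    $31, %edx        # arithmetic shift copies the sign bit: 0xffffffff if y < 5, else 0
    notl    %edx             # now 0 if y < 5, 0xffffffff otherwise (the r of the C version)
    andl    %edx, %eax       # 0 if y < 5, y - 5 otherwise
    addl    $5, %eax         # 5 if y < 5, y otherwise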

Pete Kirkham