tags:

views:

350

answers:

8

Just out of curiosity. If I have something like:

if(x < 0)
    x = 0;
if(x > some_maximum)
    x = some_maximum;

return x;

Is there a way to not branch? This is c++.

Addendum: I mean no branch instructions in the assembly. It's a MIPS architecture.

+2  A: 

Depends on your architecture. For ARM, at least, the compiler would probably emit conditionally executed instructions and the resulting machine code wouldn't contain a branch. I can't think of a good way to make that explicit in the C program though.

Carl Norum
Unless ARM's conditional instructions are much different from those used by Intel processors, it is my understanding that a `?:` is explicitly required for translation to conditional moves.
danben
ARM is much different than intel - it will use conditional run opcodes on any compare to prevent branching.
Michael Dorgan
+1 @Michael, that's certainly been my experience.
Carl Norum
@Michael Dorgan: this would still depend on the compiler, wouldn't it?
danben
Conditional executed opcodes are the bread and butter of the ARM architecture. They were designed to prevent just this sort of branching. While I guess there is a possibility of finding a compiler that won't do this, it would be a very, VERY poor compiler.
Michael Dorgan
@danben: Of course, it depends on the compiler. Any compiler is entitled to emit arbitrarily awful code. But given the way ARM's conditional opcodes are designed, a "good" ARM compiler will use them in preference to a conditional branch over a short run of code.
Steve Jessop
+2  A: 

Using the ternary operator :)

return x < 0 ? 0 : x > some_maximum ? : some_maximum : x;
Fede
That will still cause branch instructions to show up in the executable if the architecture doesn't support a non-branching way to do it, right?
Carl Norum
@Carl Norum: yes, but it's his best shot (see my answer for more details).
danben
A: 
x = min(max(x,0),100);

The branching is hidden away nicely inside functions with normal names.

Suggesting to create a clip_by template.

Pavel Radzivilovsky
+13  A: 

This is going to be compiler- and processor-dependent, but if you use ?: it can be translated to a conditional move (at least on Intel-based processors) which does not use a branch.

x = x < 0 ? 0 : x; x = x > max ? max : x;

This can use the CMOV instruction (see http://www.intel.com/software/products/documentation/vlin/mergedprojects/analyzer_ec/mergedprojects/reference_olh/mergedProjects/instructions/instruct32_hh/vc35.htm), whose purpose is to avoid branching (and thus branch prediction penalties).

Edit: this thread may be of interest to you. Benchmarks show that conditional moves will give you speed gains only on branches that are not very predictable, whereas highly predictable branches (such as that of a long-running loop) prefer the standard approach.

danben
+1 for the info link
Fede
(Why) Can't the original code also be compiled to use CMOV? Assuming `x` is not volatile, of course.
Steve Jessop
... Actually, GCC with -O4 seems determined (with the OP's code and x an int) to use a cmovge for one of the limits, and a branch for the other. With your code they're both cmovs. Don't know why.
Steve Jessop
A: 
x =   ((int)(x > some_maximum)) * some_maximum 
    + ((int)(x > 0 && x <= some_maximum)) * x;
bobah
This doesn't account for the top of the range or the fact that `some_maximum` is a constant value. (Likely you are taking `some_maximum` as the value to restrict).
danben
That always loses the original value of x.
Georg Fritzsche
@Georg Fritzsche - yes, thanks, fixed the example
bobah
+17  A: 

There are bit-tricks to find the minimum or maximum of two numbers, so you could use those to find min(max(x, 0), some_maximum). From here:

y ^ ((x ^ y) & -(x < y)); // min(x, y)
x ^ ((x ^ y) & -(x < y)); // max(x, y)

As the source states though, it's probably faster to do it the normal way, despite the branch

Michael Mrozek
+1 for actually answering the question B-)
Brian Postow
Well, perhaps not completely answering the question: "On some machines, evaluating (x < y) as 0 or 1 requires a branch instruction". Also I note that the site describes the obvious approach (using a ternary operator) as possibly faster depending on the target architecture...
Matthieu M.
@Matthieu I noted that already
Michael Mrozek
+1  A: 

If it's possible to limit to powers of 2 (non inclusive), then just go with

int newx = x & ((highest power of 2) - 1)

not quite sure to handle the (if x < 0 case) or the generic (x < n case)

Dearmash
+1  A: 

For future problems like this, the bit hack page might be useful: http://graphics.stanford.edu/~seander/bithacks.html.

Since the bithack for min and max was already posted, here is a different one:

// CHAR_BIT is number of bits per byte.
// sign = 1 if x < 0, sign = 0 otherwise (according to the page above)
int sign = (int)((unsigned int)((int)x) >> (sizeof(int) * CHAR_BIT - 1));

int y = (1-sign)*x; // if x < 0, then y = 0, else y = x.

// Depending on arch, the below _might_ cause a branch.
// (on x64 it does not cause a branch, not sure about MIPS)

int z = !(y/some_maximum); // if 0 <= y < some_maximum, z = 1, else z = 0

int ret = z*y + (1-z)*some_maximum; // if z =1, then ret = y; else ret = some_maximum.

return ret;

I just tried it out, and it worked for the few test cases i had.

Here is the assembly code from my computer (intel arch) which shows no branches.

int cap(int x)
{
00F013A0  push        ebp  
00F013A1  mov         ebp,esp 
00F013A3  sub         esp,0FCh 
00F013A9  push        ebx  
00F013AA  push        esi  
00F013AB  push        edi  
00F013AC  lea         edi,[ebp-0FCh] 
00F013B2  mov         ecx,3Fh 
00F013B7  mov         eax,0CCCCCCCCh 
00F013BC  rep stos    dword ptr es:[edi] 

    int some_maximum = 100;

00F013BE  mov         dword ptr [some_maximum],64h 

    // CHAR_BIT is number of bits per byte. 
    // sign = 1 if x < 0, sign = 0 otherwise (according to the page above) 
    int sign = (int)((unsigned int)((int)x) >> (sizeof(int) * CHAR_BIT - 1)); 

00F013C5  mov         eax,dword ptr [x] 
00F013C8  shr         eax,1Fh 
00F013CB  mov         dword ptr [sign],eax 

    int y = (1-sign)*x; // if x < 0, then y = 0, else y = x. 

00F013CE  mov         eax,1 
00F013D3  sub         eax,dword ptr [sign] 
00F013D6  imul        eax,dword ptr [x] 
00F013DA  mov         dword ptr [y],eax 

    // Depending on arch, the below _might_ cause a branch. 
    // (on x64 it does not cause a branch, not sure about MIPS) 

    int z = !(y/some_maximum); // if 0 <= y < some_maximum, z = 1, else z = 0 

00F013DD  mov         eax,dword ptr [y] 
00F013E0  cdq              
00F013E1  idiv        eax,dword ptr [some_maximum] 
00F013E4  neg         eax  
00F013E6  sbb         eax,eax 
00F013E8  add         eax,1 
00F013EB  mov         dword ptr [z],eax 

    int ret = z*y + (1-z)*some_maximum; // if z =1, then ret = y; else ret = some_maximum. 

00F013EE  mov         eax,dword ptr [z] 
00F013F1  imul        eax,dword ptr [y] 
00F013F5  mov         ecx,1 
00F013FA  sub         ecx,dword ptr [z] 
00F013FD  imul        ecx,dword ptr [some_maximum] 
00F01401  add         eax,ecx 
00F01403  mov         dword ptr [ret],eax 

    return ret; 

00F01406  mov         eax,dword ptr [ret] 
}
00F01409  pop         edi  
00F0140A  pop         esi  
00F0140B  pop         ebx  
00F0140C  mov         esp,ebp 
00F0140E  pop         ebp  
00F0140F  ret              
Moron