views:

293

answers:

8

I understand how to do it for powers of 2 so that's not my question.

For example, if I want to find 5% of a number using a bit shift instead of an integer divide, how would I calculate that?

So instead of (x * 20 / 19), I could do (x * 100 >> 11). Now this isn't right, but it's close, and I arrived at it using trial and error. How would I determine the most precise shift possible?

Thanks.

A: 

Well generally:

  • Factor the number: decompose N into 2^k * rest, then use a bit shift for the power of two. Example: 20 = 2^2 * 5, so to multiply by twenty, you'd multiply by 5 and then shift left by 2 (<< 2).
  • To use bit shifting on numbers that aren't powers of two, observe the following for odd l: a * l = a * (l - 1) + a. Now l - 1 is even and thus contains a power of two as a factor, to which the bit shifting 'trick' applies.
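The two bullets above can be sketched in C for the factor 20. This is only an illustration of the decomposition; the function name is made up, and unsigned arithmetic is assumed so that overflow simply wraps.

```c
#include <stdint.h>

/* Multiply by 20 = 2^2 * 5 using only shifts and one add.
 * 5 is odd, so 5 * x = 4 * x + x = (x << 2) + x.
 * Hypothetical helper name, not from the original answer. */
uint32_t mul20_shift_add(uint32_t x) {
    uint32_t times5 = (x << 2) + x;  /* 4x + x = 5x */
    return times5 << 2;              /* 5x * 4 = 20x */
}
```

That is two shifts and one add in total, which is the instruction count the comments below argue about.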

Division can be constructed similarly.

hroptatyr
That makes no sense. Multiplication by 5 includes any cost of shifting `<< 2`. The object here is to multiply by any rational number in just one or two instructions with no division, not to decompose the number and use an indefinite number of insns.
Potatoswatter
Who said that? The OP wants to know how to turn integer multiplication into bit-shifting, I've just described the general procedure.
hroptatyr
Oh and btw, never judge before you've measured, I've just found that an `imul` would be 3 cycles on my CPU whereas my solution with a `shl` and an `add` takes 2 cycles.
hroptatyr
An `shl` and an `add` only accomplishes multiplication by 5. You still need another insn to shift again. The compiler should be smart enough to figure it out and not produce the `imul` if it's really inferior, although for portability it might not be specialized to your chip, and the higher instruction count might cause other congestion.
Potatoswatter
Anyway, the question isn't so much about replacing the multiplication as the division, which you don't address at all. That requires getting the high-order result of multiplication, which can't be represented using C operators. (At least, not obtaining the full width of an integer register.) It's a fixed-point math trick.
Potatoswatter
+10  A: 

Best approach is to let the compiler do it for you. You simply write

a/b

in your language of choice, and the compiler generates the bit twiddling.

EDIT (I hope you don't mind, I'm adding reinforcement to your answer):

#include <stdio.h>

int main(int argc, char **argv) {
  printf("%d\n", argc/4);
}

Obviously, the fastest thing to do is argc>>2 (plus a small fix-up so that negative values round toward zero). Let's see what happens:

        .file   "so3.c"
        .section        .rodata
.LC0:
        .string "%d\n"
        .text
.globl main
        .type   main, @function
main:
        pushl   %ebp
        movl    %esp, %ebp
        andl    $-16, %esp
        subl    $16, %esp
        movl    8(%ebp), %eax
        movl    %eax, %edx
        sarl    $31, %edx
        shrl    $30, %edx
        leal    (%edx,%eax), %eax
        sarl    $2, %eax
        movl    %eax, %edx
        movl    $.LC0, %eax
        movl    %edx, 4(%esp)
        movl    %eax, (%esp)
        call    printf
        leave
        ret
        .size   main, .-main
        .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
        .section        .note.GNU-stack,"",@progbits

yup, there it is, sarl $2, %eax

EDIT 2 (Sorry to pile on, but 20/19 is a bit more complicated…)

I just substituted argc*20/19 for argc/4 and this is the math that comes out:

0000000100000f07        shll    $0x02,%edi
0000000100000f0a        movl    $0x6bca1af3,%edx
0000000100000f0f        movl    %edi,%eax
0000000100000f11        imull   %edx
0000000100000f13        sarl    $0x03,%edx
0000000100000f16        sarl    $0x1f,%edi
0000000100000f19        subl    %edi,%edx

So, the process is

  • Multiply input by 4 (shll)
  • Load (movl 0x...) and multiply by (imull) a fixed-point fraction obtaining a 64-bit result (this is 32-bit code)
  • Divide high-order 32 bits of result by 8 (sarl), note how this handles negative numbers
  • Shift the (already multiplied) input right by 31 bits (sarl $0x1f, %edi) to obtain its sign as either 0 or -1
  • Round the high-order result toward zero by adding 1 (subtracting -1) if the input was negative.
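The instruction sequence above can be mirrored in C, as a rough sketch. It covers only the division by 19, using the constant 0x6bca1af3 from the listing (that value is 2^35 / 19 rounded up). The function name is made up, and the code assumes that >> on a negative signed value is an arithmetic shift, which holds for gcc but is implementation-defined in C.

```c
#include <stdint.h>

/* Signed division by 19 without a divide instruction.
 * 0x6bca1af3 == ceil(2^35 / 19), taken from the compiler output above.
 * Assumption: right-shifting a negative value is an arithmetic shift. */
int32_t div19_magic(int32_t n) {
    /* 32 x 32 -> 64-bit multiply, like imull leaving edx:eax */
    int64_t product = (int64_t)n * INT64_C(0x6bca1af3);
    /* keep the high 32 bits (>> 32), then divide by 8 (>> 3) */
    int32_t q = (int32_t)(product >> 35);
    /* the shift rounds toward -infinity; add 1 for negative inputs
     * so the result rounds toward zero like C's / operator */
    q += (n < 0);
    return q;
}
```

For example, div19_magic(-19) first yields -2 from the shift and is then corrected to -1.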
High Performance Mark
+1 - working out the bits by hand is a chore, and the best way to learn the process is to look at compiled output.
Potatoswatter
I added compiler output to demonstrate just how right you are!
TokenMacGuy
+1 Love the assembler code!
Larry K
@Potatoswatter: I feel baaaddd earning so much rep from your efforts. Not very bad, it won't keep me awake at nights, but a little bit bad :-)
High Performance Mark
@Mark: Meh, if I'd gone to the trouble of describing it in general terms, that would be more helpful. No point in letting rep actually decide anything.
Potatoswatter
+1  A: 

Suppose you want to approximate 5% of x by multiplying by y and shifting by n. Since 5% is 1/20, and a>>n = a/2^n, you want to solve

x/20 ≈ x*y/2^n (the symbol "≈" means "approximately equal")

which simplifies to

y ≈ 2^n/20

So if n=11, then

y ≈ 2^n/20 = 2048/20 = 102 + 8/20

So we can set y=102, which is actually better than the 100 you found by trial and error.

Generally, we can play with n to see whether we can get a better answer.

I've worked this out for the fraction 1/20, but you should be able to work this out for any fraction p/q by following the same method.
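Here is a small C sketch of that method for a general fraction p/q. The helper names are invented for illustration, n is assumed to be at most 32 so the intermediate value fits in 64 bits, and rounding y to the nearest integer is one choice among several.

```c
#include <stdint.h>

/* y = round(p * 2^n / q): the multiplier to pair with a right shift by n.
 * Assumes q != 0 and n <= 32. Hypothetical helper name. */
uint32_t nearest_multiplier(uint32_t p, uint32_t q, unsigned n) {
    uint64_t scaled = (uint64_t)p << n;
    return (uint32_t)((scaled + q / 2) / q);
}

/* Approximate x * p / q as (x * y) >> n, using a 64-bit product
 * so the multiplication cannot overflow. */
uint32_t scale_shift(uint32_t x, uint32_t y, unsigned n) {
    return (uint32_t)(((uint64_t)x * y) >> n);
}
```

With n = 11 this gives y = 102, and scale_shift(1000, 102, 11) returns 49 against the exact 50. With n = 16 it gives y = 3277, and the same input comes out as 50, which shows how a larger n tightens the approximation.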

brainjam
+3  A: 

Suppose you have the expression a = b / c. As hroptatyr mentioned, multiplication is quite fast (and it's much faster than division). So the basic idea is to transform the division into a multiplication: a = b * (1/c).

Now, we still need a division to compute the reciprocal 1/c, so this works only if c is known a priori. While that is enough for floating-point computation, for integers we have to use another trick: as the reciprocal of c we use the value some_big_number / c, so that we finally compute a2 = b * (some_big_number / c), which is equal to some_big_number * b/c. Because we're interested in the value of b/c, we have to divide the final result by some_big_number. If it's chosen to be a power of 2, then the final division is fast.

ex:

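// we'll compute 1/20 of the input
unsigned divide_by_20(unsigned n){
    // 16.16 fixed-point reciprocal of 20, rounded up: (65536 + 19) / 20 = 3277.
    // It's a constant expression, so the compiler folds it at compile time.
    const unsigned reciprocal = (0x10000 + 20 - 1) / 20;
    // Matches n / 20 only for n <= 16398; beyond that the rounding error in
    // the reciprocal shows up (and n * reciprocal can overflow 32 bits).
    return (n * reciprocal) >> 16;
}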

EDIT: a good part of this method is that you can choose the rounding behaviour of the division by choosing the correction (in this case 20 - 1 rounds the reciprocal up, so that the truncated result matches n / 20 as long as n is small enough).
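To make the precision limits concrete, here is a sketch that puts the 16-bit reciprocal next to a wider variant. Both function names are made up. The constant 0xCCCCCCCD (2^34 / 5 rounded up) combined with a shift by 36 is a swapped-in choice rather than something from this answer, and it relies on a 64-bit intermediate product.

```c
#include <stdint.h>

/* 16-bit reciprocal: 3277 == ceil(65536 / 20).
 * Exact for n <= 16398; the first wrong result is at n = 16399. */
uint32_t div20_q16(uint32_t n) {
    return (n * 3277u) >> 16;
}

/* Wider reciprocal: 0xCCCCCCCD == ceil(2^34 / 5), and
 * n / 20 == (n * 0xCCCCCCCD) >> 36 for every 32-bit unsigned n.
 * Needs the full 64-bit product. */
uint32_t div20_wide(uint32_t n) {
    return (uint32_t)(((uint64_t)n * UINT64_C(0xCCCCCCCD)) >> 36);
}
```

For n = 16399 the first function returns 820 while the true quotient is 819; the second returns 819.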

ruslik
For signed values, divide by 65536 instead of shifting by 16, the compiler will convert to a shift and fix-up.
ergosys
+2  A: 

You can't do everything with shifts; you will instead need to use 'magic' divisors (see Hacker's Delight). Magic division works by multiplying a number by another suitably large number, rolling it over in such a way as to yield the answer of the division (mul/imul is faster than div/idiv). These magic constants are only unique for each prime; multiples require a shift, e.g. unsigned division by 3 can be represented (on 32 bit) as x * 0xAAAAAAAB, division by 6 would be (x * 0xAAAAAAAB) >> 1, division by 12 would shift by 2, 24 by 3, etc. (it's the geometric series 3 * (2 ^ x), where 0 <= x < 32).
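One way to read this in C, as a sketch: take the full 64-bit product and shift it right. The base shift of 33 for division by 3 is an assumption of this reading (0xAAAAAAAB is 2^33 / 3 rounded up); the extra shifts of 1 and 2 for 6 and 12 are the ones described above. The function names are made up.

```c
#include <stdint.h>

/* 0xAAAAAAAB == ceil(2^33 / 3). All three functions share the constant
 * and differ only in the shift count. */
#define MAGIC_3 UINT64_C(0xAAAAAAAB)

uint32_t div3(uint32_t x)  { return (uint32_t)(((uint64_t)x * MAGIC_3) >> 33); }
uint32_t div6(uint32_t x)  { return (uint32_t)(((uint64_t)x * MAGIC_3) >> 34); }
uint32_t div12(uint32_t x) { return (uint32_t)(((uint64_t)x * MAGIC_3) >> 35); }
```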

Necrolis
+2  A: 

It makes no sense because what you are trying to do does not optimise the resulting process!!!

Hey, I did not read anywhere in your question that you had intention to optimise.

Electrical Engg people never stop being curious regardless of "usefulness". We are like compulsive obsessive hoarders of items of whom you read in the news where they stack their attics, cellars, bedrooms and living rooms up with junk which they believe would come in handy one day. At least that was the case when I was in Engg school a little less than 30 years ago. I encourage you to continue in your quest to hoard up "useless" knowledge that appears to have little possibilities of optimising your life or life-style. Why depend on the compiler when you can do it by hand-coded algorithm?! Yah? Be a little adventurous, you know. Ok enuf dissing people who express disdain at your pursuit of knowledge.

Recall in your middle-school, the way you were taught to do your division? 437/24, e.g.

  _____
24|437


   018
  -----
24|437
   24
  -----
   197
   192
  -----
     5

The number which is subject to division, 437, is called the dividend. 24 is the divisor, the result 18 is the quotient, and 5 is the remainder. Like when you file your taxes, you need to fill in profits you had gained from stock "dividends", which is a misnomer. What you fill into the tax form is a multiple of the quotient of a single huge chunk of dividend. You did not receive the dividend, but portions of dividend - otherwise, it would mean you owned 100% of the stock.

     ___________
11000|110110101



      000010010
     -----------
11000|110110101 
      11000
     ----------
      000110101 remainder=subtract divisor from dividend
       11000000 shift divisor right and append 0 to quotient until
        1100000 divisor is not greater than remainder.
         110000 Yihaa!
     ----------
         000101 remainder=subtract shifted divisor from remainder
          11000 shift divisor right and append 0 to quotient until
           1100 divisor is not greater than remainder.
     ----------
               oops, cannot shift anymore.

The above, as you might already know, is TRUE division. Which is achieved by subtracting by a shifted divisor.
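The binary long division above can be written as a loop, roughly like this. It is a plain shift-and-subtract sketch with invented names; returning quotient 0 for a zero divisor is just a placeholder convention.

```c
#include <stdint.h>

typedef struct {
    uint32_t quotient;
    uint32_t remainder;
} divmod_result;

/* Long division in binary: line the divisor up under the dividend,
 * then subtract and shift right one position at a time. */
divmod_result shift_subtract_divide(uint32_t dividend, uint32_t divisor) {
    divmod_result r = {0, dividend};
    if (divisor == 0) {
        return r;  /* placeholder: division by zero is not handled */
    }
    /* Shift the divisor left as far as it still fits under the dividend.
     * 64 bits so the shifted value cannot overflow. */
    uint64_t shifted = divisor;
    int shift = 0;
    while ((shifted << 1) <= dividend) {
        shifted <<= 1;
        shift++;
    }
    /* One quotient bit per position, most significant first. */
    for (; shift >= 0; shift--) {
        r.quotient <<= 1;
        if (shifted <= r.remainder) {
            r.remainder -= (uint32_t)shifted;
            r.quotient |= 1;
        }
        shifted >>= 1;
    }
    return r;
}
```

Called with 437 and 24 it produces quotient 18 and remainder 5, the same as the worked example.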

What you want is to achieve the same thing by simply shifting the dividend. That, unfortunately, cannot be done unless the divisor is a power of 2 (2, 4, 8, 16). Which is an obvious fact of binary arithmetic. Or, at least I am not aware of any method that can do it without approximation and interpolative techniques.

Therefore, you have to use a combination of dividend shift and true division. e.g.

24 = 2 x 2 x 2 x 3

First, divide 437 by 8 using binary shift to get 110110 (54) and then use true division to divide by 3:

   010010
  --------
11|110110
   11
   -------
     011
      11
     -----
        0

which works out to 010010 = 18.

Voila.

How do you determine 24 = 2^3 x 3?

By shifting 11000 rightwards until you hit a 1.

Which means, you could shift the dividend the same number of times as you would shift the divisor until the divisor hits a 1.
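As a sketch of that idea in C (the function name is made up, and it is for unsigned values only): strip the trailing zero bits from the divisor, shift the dividend by the same count, and leave a true division by the odd part.

```c
#include <stdint.h>

/* Split divisor into 2^k * odd, shift the dividend right by k,
 * then do a real division by the odd part.
 * Returns 0 for a zero divisor as a placeholder convention. */
uint32_t divide_split(uint32_t dividend, uint32_t divisor) {
    if (divisor == 0) {
        return 0;
    }
    unsigned k = 0;
    while ((divisor & 1u) == 0) {  /* shift right until we hit a 1 */
        divisor >>= 1;
        k++;
    }
    return (dividend >> k) / divisor;
}
```

For 437 and 24 this shifts by 3 to get 54 and then divides by 3, giving 18. For unsigned values, truncating in two steps gives the same result as one truncating division.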

Therefore, obviously, this method would not work if a divisor is odd. e.g., it will not work for divisor 25, but it will work a little for divisor 50.

Maybe there are predictive methods that could interpolate a divisor like 13 to be between 2^3=8 and 2^4=16. If there are, I am not familiar with them.

What you need to explore is using a number series. For example dividing by 25:

 1    1    1     1     1
__ = __ - ___ - ___ + ___ -  ... until the precision you require.
25   16   64    128   256

where the general form of the series is

1    1      b1              bn
_ = ___ + _______ + ... + ______
D   2^k   2^(k+1)         2^(k+n)

where bn is either -1, 0 or +1.

I hope my binary manipulation above has no errors or typos. If it does, a thousand apologies.

Blessed Geek
+2  A: 

If you are interested in the math behind it, read Hacker's Delight by Henry S. Warren.

If you are interested in optimized code, just write what is most easy to read by humans. For example:

int five_percent(int x) {
  return x / 20;
}

When you compile this function using g++ -O2, it will not do an actual division but some magic multiplication, bit-shifting and correction instead.

Roland Illig
+1  A: 

If you're curious how the optimizer does it (perhaps because you're trying to decompile some optimized code) take a look at http://www.flounder.com/multiplicative_inverse.htm. Comes with sample code you can play with.

Kate Gregory