views:

336

answers:

3

hi

I am working on GPU device which has very high division integer latency, several hundred cycles. I am looking to optimize divisions.

All divisions by denominator which is in a set { 1,3,6,10 }, however numerator is a runtime positive value, roughly 32000 or less. due to memory constraints, lookup table may not be a good option.

Can you think of alternatives? I have thought of computing float point inverses, and using those to multiply numerator.

Thanks

PS. thank you people. bit shift hack is a really cool. to recover from roundoff, I use following C segment:

// q = m/n
q += (n*(j +1)-1) < m;
+5  A: 

The standard embedded systems hack for this is to convert an integer division by N into a fixed-point multiplication by 1/N.

Assuming 16 bits, 0.33333 can be represented as 21845 (decimal). Multiply, giving a 32-bit integer product, and shift down 16 bits.

You will almost certainly encounter some roundoff (truncation) error. This may or may not be something you can live with.

It MIGHT be worthwhile to look hard at your GPU and see if you can hand-code a faster integer division routine, taking advantage of your knowledge of the restricted range of the numerator.

John R. Strohm
truncation would be a problem, I need actual values. However I think I can cope with it by checking for roundoff and incrementing result if found
aaa
thank you. This is very useful for me.
aaa
+7  A: 
a/b=a*(1/b)
x=(1<<16)/b
a/b=(a*x)>>16

can you build a lookup table for the denominators? since you said 15 bit numerators, you could use 17 for the shifts if everything is unsigned 32 bit:

a/b=a*((1<<17)/b)>>17

The larger the shift the less the rounding error. You can do a brute force check to see how many times, if any, this is actually wrong.

drawnonward
yes, that is small enough. thanks
aaa
i do get roundoff error, but I have a way to recover correct result.Thank you
aaa
For roundoff error, you could try the classic add half before divide, which in this case would be a/b=(a*((1<<16)/b)+(1<<15))>>16
drawnonward
+3  A: 

The book, "Hacker's Delight" by Henry Warren, has a whole chapter devoted to integer division by constants, including techniques that transform an integer division to a multiply/shift/add series of operations.

This page calculates the magic numbers for the multiply/shift/add operations:

Michael Burr