views:

548

answers:

3

Assuming you have a machine instruction udive that does a special case 64 by 32 unsigned division by taking a (32bit dividend << 32) / 32bit divisor, we can do a full 64 by 32 division using the following:

// assume: a / b guaranteed not to overflow
a = 64bit dividend, a.h & a.l are hi & lo 32bits respectively
b = 32bit divisor

q1 = udive(a.h, b) // (a.h << 32) / b
r1 = -(q1 * b)  // remainder of the above, shortcut since a.h & 0xffffffff == 0
q2 = a.l / b  // a.l / b using regular unsigned division
r2 = a.l - (q2 * b) // remainder of the above
q = q1 + q2
r = r1 + r2

// r < r2, r overflowed and is >32bits, implies r > b since b is 32bits
// r >= b, quotient too small by 1, adjust
if (r < r2) or (r >= b)
 q = q + 1
return q

However the signed case is giving me problems. Assuming an equivalent sdive instruction that does the signed version of udive, I can't quite work out how to deal with the remainders and whatnot.

A: 

I think your unsigned code would be easier to read if it was explicit about which variables were 32 bit, which were 64 bit and whether the comparisons were signed or unsigned.

The book Hacker's Delight is generally good for this sort of low-level arithmetic stuff. I don't have a copy to hand at the moment, but its code for doing 64/64->64 given 64/32->32 is online: http://www.hackersdelight.org/HDcode/newCode/divDouble.c

That does the signed case by simply taking the absolute values of the inputs, doing an unsigned division and then flipping the sign of the result bit if the inputs had different sign. This suggests to me that this is probably the best approach (it's certainly easier to prove correct than the alternative). You might need to special case the dividend being the minimum possible integer if it doesn't just fall out in the wash.

A: 

Hi, thanks for answering. I've looked at Hacker's Delight. If you look at the divdi3 function in HD there's a call to DIVS, the 64/32->32 instruction, when the divisor is 32-bit and the result is known not to overflow. The machine I'm on doesn't have a general purpose 64/32->32 instruction, it has the special case mentioned above. The above function for 64/32->32 is my implementation for DIVU in the unsigned case, so I'm trying to work out something similar for DIVS.

I could just forget about the DIVS path, but that's the overwhelmingly common case and I want to hit it as much as possible, it's just a tricky implementation.

Sorry for the unclear pseudo code, but everything is unsigned and mostly 32bit.

DIVU(uint64 a, uint32 b) {
// assume: a / b guaranteed not to overflow
// a = 64bit dividend, a.h & a.l are hi & lo 32bits respectively
// b = 32bit divisor

uint32 q1 = udive(a.h, b)      // (a.h << 32) / b
uint32 r1 = -(q1 * b)          // remainder of the above, shortcut for (a.h << 32) - (q1 * b) since (a.h << 32) & 0xffffffff == 0
uint32 q2 = a.l / b            // a.l / b using regular unsigned division
uint32 r2 = a.l - (q2 * b)     // remainder of the above
uint32 q = q1 + q2
uint32 r = r1 + r2

// r < r2, r overflowed and is >32bits, implies r > b since b is 32bits
// r >= b, quotient too small by 1, adjust
if (r < r2) or (r >= b)
        q = q + 1
return q
}
A: 

You can ignore divdi3's special optimisation case of calling divs; the thing I wanted to draw attention to was the fact that when divdi3 needs to do full-strength division it does it by calling udivdi3 rather than by having a signed-division equivalent to the udivdi3 algorithm.

Looking in Knuth vol 2 I see that he also basically says that you do signed division by the process of taking absolute values, doing an unsigned divide and then fixing up the sign bit afterwards. This makes intuitive sense to me, because signed 2s complement numbers don't have the convenient property that a == (a.h * 2^32) + a.l that unsigned numbers do, so it's not as easy to assemble 64 bit operations by operating on the two halves separately.

The before-and-after fiddling shouldn't be that many extra cycles over the unsigned divide...

PS: what weirdo CPU is this, anyway? :-)