I am currently writing a fast 32.32 fixed-point math library. I succeeded at making adding, subtraction and multiplication work correctly, but I am quite stuck at division.
A little reminder for those who can't remember: a 32.32 fixed-point number is a number having 32 bits of integer part and 32 bits of fractional part.
The best algorithm I came up with needs 96-bit integer division, which is something compilers usually don't have built-ins for.
Anyway, here it goes:
G = 2^32
notation: x is the 64-bit fixed-point number, x1 is its low nibble and x2 is its high
G*(a/b) = ((a1 + a2*G) / (b1 + b2*G))*G // Decompose this
G*(a/b) = (a1*G) / (b1*G + b2) + (a2*G*G) / (b1*G + b2)
As you can see, the (a2*G*G)
is guaranteed to be larger than the regular 64-bit integer. If uint128_t's were actually supported by my compiler, I would simply do the following:
((uint128_t)x << 32) / y)
Well they aren't and I need a solution. Thank you for your help.