This is (AFAIK) a specific question within this general topic.
Here's the situation:
I have an embedded system (a video game console) based on a 32-bit RISC microcontroller (a variant of NEC's V810). I want to write a fixed-point math library. I read this article, but the accompanying source code is written in 386 assembly, so it's neither directly usable nor easily modifiable.
The V810 has built-in integer multiply/divide, but I want to use the 18.14 format mentioned in the above article. This requires dividing a 64-bit int by a 32-bit int, and the V810 only does (signed or unsigned) 32-bit/32-bit division (which produces a 32-bit quotient and a 32-bit remainder).
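To make the requirement concrete, here is a minimal reference sketch of the operation I want, written with a 64-bit intermediate. The names `fix18_14` and `fix_div_ref` are my own placeholders, not from the article. It assumes the compiler supplies `int64_t` division through a software helper, which is exactly the cost I'd like to avoid on the V810; it is only meant as a baseline to check a faster routine against.

```c
#include <stdint.h>

/* Hypothetical type name: 18 integer bits (including sign), 14 fraction bits. */
typedef int32_t fix18_14;

/* Reference 18.14 divide: pre-scale the dividend by 2^14, then divide.
 * The scaled dividend needs up to 46 bits, hence the 64-bit intermediate.
 * Assumes b != 0 and that the true quotient fits in 18.14. */
fix18_14 fix_div_ref(fix18_14 a, fix18_14 b)
{
    /* Multiply instead of shifting so negative dividends stay well-defined. */
    int64_t scaled = (int64_t)a * 16384;
    return (fix18_14)(scaled / b);
}
```

For example, 1.0 is stored as 16384 and 2.0 as 32768, so dividing them should give 8192 (0.5).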
So, my question is: how do I simulate a 64-bit/32-bit divide with a 32-bit/32-bit one (to allow for the pre-shifting of the dividend)? Or, to look at the problem another way, what's the best way to divide one 18.14 fixed-point number by another using standard 32-bit arithmetic/logic operations? ("Best" meaning fastest, smallest, or both.)
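One approach I can sketch myself, without claiming it is the fastest: do a single 32-bit/32-bit divide to get the integer part and remainder, then produce the 14 fraction bits by shift-and-subtract on the remainder. The names `fix18_14`, `fix_udiv` and `fix_div` are hypothetical. The sketch assumes a nonzero divisor and a quotient that fits in 18.14; overflow is not detected.

```c
#include <stdint.h>

typedef int32_t fix18_14;          /* hypothetical type name */
#define FIX_FRAC_BITS 14

/* Unsigned core: returns floor((a << 14) / b) using only 32-bit operations.
 * Assumes b != 0. High quotient bits are silently lost on overflow. */
static uint32_t fix_udiv(uint32_t a, uint32_t b)
{
    uint32_t q = a / b;            /* integer part: one hardware divide */
    uint32_t r = a % b;            /* remainder, always < b */
    int i;

    for (i = 0; i < FIX_FRAC_BITS; i++) {
        uint32_t carry = r >> 31;  /* bit that r << 1 would shift out */
        r <<= 1;
        q <<= 1;
        /* If the carry is set, the true value is r + 2^32, which is >= b;
         * unsigned wraparound makes r - b come out right in that case. */
        if (carry || r >= b) {
            r -= b;
            q |= 1;
        }
    }
    return q;
}

/* Signed wrapper: divide magnitudes, then restore the sign
 * (the result truncates toward zero, like C integer division). */
fix18_14 fix_div(fix18_14 a, fix18_14 b)
{
    uint32_t ua = (a < 0) ? 0u - (uint32_t)a : (uint32_t)a;
    uint32_t ub = (b < 0) ? 0u - (uint32_t)b : (uint32_t)b;
    uint32_t uq = fix_udiv(ua, ub);

    return ((a ^ b) < 0) ? (fix18_14)(0u - uq) : (fix18_14)uq;
}
```

This costs one hardware divide plus 14 loop iterations, so whether it beats the alternatives would have to be measured. When the remainder is below 2^18, the loop could instead be replaced by a second hardware divide of `r << 14` by `b`, since that shift cannot overflow 32 bits.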
Algebra, (V810) assembly, and pseudo-code are all fine. I will be calling the code from C.
Thanks in advance!
EDIT: Somehow I missed this question... However, it will still need some modification to be super-efficient (it has to be faster than the floating-point divide provided by the V810, though it may already be...), so feel free to do my work for me in exchange for reputation points ;) (and credit in my library documentation, of course).