strength-reduction

add vs mul (IA32-Assembly)

I know that add is faster than mul. I want to know how to use add instead of mul in the following code to make it more efficient. Sample code: mov eax, [ebp + 8] #eax = x1 mov ecx, [ebp + 12] #ecx = x2 mov edx, [ebp + 16] #e...
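Since the sample code above is cut off, here is only a generic sketch of the idea in C rather than a rewrite of that code. It shows two common ways a multiply can be replaced by adds: carrying a product through a loop as a running sum, and rewriting multiplication by a known constant as shifts plus an add. The names fill_mul, fill_add, mul10 and check_equivalence are made up for illustration. Whether this is actually faster depends on the CPU and the compiler, which often does this transformation on its own, so measure before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Baseline: one multiply per element. */
void fill_mul(uint32_t *out, size_t count, uint32_t stride)
{
    for (size_t i = 0; i < count; i++)
        out[i] = (uint32_t)i * stride;
}

/* Strength-reduced: i * stride is carried as a running sum,
   so each iteration needs one add instead of one multiply. */
void fill_add(uint32_t *out, size_t count, uint32_t stride)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < count; i++) {
        out[i] = acc;
        acc += stride;   /* unsigned wraparound matches the multiply */
    }
}

/* Multiplication by the constant 10 as two shifts and one add:
   10 * x = 8 * x + 2 * x. */
uint32_t mul10(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* Hypothetical self-check: both loops must produce identical output. */
void check_equivalence(uint32_t stride)
{
    uint32_t a[64], b[64];
    fill_mul(a, 64, stride);
    fill_add(b, 64, stride);
    for (size_t i = 0; i < 64; i++)
        assert(a[i] == b[i]);
}
```

The running-sum form only applies when the multiplier advances by a fixed step each iteration; the shift form only applies when one operand is a constant known in advance.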

How can I strength reduce division by 2^n + 1?

I need to perform some integer divisions in the hot path of my code. I've already determined via profiling and cycle counting that the integer divisions are costing me. I'm hoping there's something I can do to strength reduce the divisions into something cheaper. In this path, I am dividing by 2^n+1, where n is variable. Essentially I w...
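One possible direction, sketched in C under assumptions the excerpt does not confirm (32-bit unsigned dividend, 1 <= n <= 31): since 1/(2^n + 1) = 2^-n - 2^-2n + 2^-3n - ..., the quotient can be estimated with shifts and adds, then corrected with a remainder check. The function name div_pow2_plus1 is made up for illustration. This is not necessarily cheaper than a hardware divide, especially for small n where the series has many terms, so it would need to be measured on the actual target.

```c
#include <assert.h>
#include <stdint.h>

/* Computes x / (2^n + 1) using only shifts, adds and subtracts.
   Assumption: 1 <= n <= 31. */
uint32_t div_pow2_plus1(uint32_t x, unsigned n)
{
    assert(n >= 1 && n <= 31);
    int64_t d = ((int64_t)1 << n) + 1;

    /* Estimate: (x >> n) - (x >> 2n) + (x >> 3n) - ...
       Each term is the previous one shifted right by n again. */
    int64_t q = 0;
    int64_t term = (int64_t)(x >> n);
    int subtract = 0;
    while (term != 0) {
        if (subtract)
            q -= term;
        else
            q += term;
        subtract = !subtract;
        term >>= n;
    }

    /* Remainder of the estimate: r = x - q * d, with q * d = (q << n) + q. */
    int64_t r = (int64_t)x - (q << n) - q;

    /* Fix-up: truncation in the shifts can leave q slightly off. */
    while (r < 0)  { q--; r += d; }
    while (r >= d) { q++; r -= d; }

    return (uint32_t)q;
}
```

For example, div_pow2_plus1(100, 3) divides by 9 and returns 11, and div_pow2_plus1(1000, 1) divides by 3 and returns 333 after one fix-up step. The fix-up loops make the result exact regardless of how rough the estimate is; the estimate only determines how many fix-up iterations run.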