This is the only improvement I can see for squaring an N-bit number over multiplying two N-bit numbers together. It is not asymptotically better in the big-O sense commonly used in computer science, but if we read "asymptotically" literally as the complexity actually approached, multiplicative constants included, then it does fit that definition. Anyway, it's all I can see to do, so take it or leave it.
Let's say that we have two N-bit numbers, x and y. We can multiply them together (x*y) with the shift-and-add method in A*N^2 + O(N) operations, where A is a constant. The second term, the O(N) term, can be disregarded for large enough N, so the number of operations is essentially A*N^2.
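For concreteness, here is a minimal sketch of the shift-and-add method as I am using it above; the function name and the Python framing are just for illustration:

```python
def shift_and_add_multiply(x: int, y: int, n_bits: int) -> int:
    """Multiply two n_bits-bit numbers by the schoolbook shift-and-add method.

    For each set bit i of y, add (x << i) into the running result.  Each of
    the n_bits iterations does O(n_bits) bit-level work, which is where the
    A*N^2 operation count comes from.
    """
    result = 0
    for i in range(n_bits):
        if (y >> i) & 1:
            result += x << i
    return result
```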
Now we calculate x^2. If we define a to have only the upper N/2 bits of x set in it and b to have only the lower N/2 bits of x set in it, then
x = a + b
x^2 = (a + b)^2 = a^2 + b^2 + 2*a*b
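As a sanity check, here is a small sketch of that split (the helper name square_by_halves is mine, purely for illustration):

```python
def square_by_halves(x: int, n_bits: int) -> int:
    """Square an n_bits-bit x via x = a + b, where a keeps only the upper
    N/2 bits of x (left in position) and b keeps only the lower N/2 bits."""
    half = n_bits // 2
    b = x & ((1 << half) - 1)   # lower N/2 bits of x
    a = x - b                   # upper N/2 bits of x, still in place
    # x^2 = a^2 + b^2 + 2*a*b; each product works on half-width significant
    # digits (the trailing zeros of a only contribute shifts).
    return a * a + b * b + 2 * a * b
```

For example, square_by_halves(0b11010110, 8) returns 45796, the same as 214**2.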
However, remember that we can multiply two N-bit numbers in A*N^2 operations. To compute a*a we only have to do A*(N/2)^2 = A*N^2/4 operations, since a is effectively an N/2-bit number followed by zeros, and the trailing zeros only contribute a cheap shift. The same goes for b*b and a*b. If we ignore the O(N) operations, then x^2 = (a + b)^2 is calculated in
A*N^2/4 + A*N^2/4 + A*N^2/4 = (3/4)*A*N^2
operations, which beats the standard A*N^2 for multiplying two arbitrary N-bit numbers by a saving of A*N^2/4. We can improve on this further by applying the same split to a^2 and b^2, although at some point it stops being beneficial to keep recursing (a rough sketch follows below). This is not an enormous improvement, but it's all that I can find. You can decide for yourselves whether it counts.
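Here is one possible sketch of that recursive splitting, assuming a hypothetical cutoff below which plain multiplication is used; the function name, the cutoff value, and the shifting details are my own choices, not part of the argument above:

```python
def square_recursive(x: int, n_bits: int, cutoff: int = 8) -> int:
    """Square x by recursively splitting the two half-size squares.

    Only the squares recurse; the a*b cross term is still an ordinary
    multiply, so the constant-factor saving shrinks at each level and a
    cutoff is used below which plain multiplication is cheaper.
    """
    if n_bits <= cutoff:
        return x * x
    half = n_bits // 2
    b = x & ((1 << half) - 1)          # lower half of x
    a_hi = x >> half                   # upper half of x, shifted down
    # Recurse on the half-width squares and shift the results back into place.
    a_sq = square_recursive(a_hi, n_bits - half) << (2 * half)
    b_sq = square_recursive(b, half)
    cross = a_hi * b                   # cross term: an ordinary multiply
    # x^2 = a_hi^2 * 2^(2*half) + b^2 + 2 * a_hi * b * 2^half
    return a_sq + b_sq + (cross << (half + 1))
```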