Multiplying two binary numbers takes n^2 time, yet squaring a number can be done more efficiently somehow. (with n being the number of bits) How could that be?
Or is it not possible? This is insanity!
Multiplying two binary numbers takes n^2 time, yet squaring a number can be done more efficiently somehow. (with n being the number of bits) How could that be?
Or is it not possible? This is insanity!
Do you mean multiplying a number by a power of 2? This is usually quicker than multiplying any two random numbers since the result can be calculated by simple bit shifting. However, bear in mind that modern microprocessors dedicate lots of brute force silicon to these types of calculations and most arithmetic is performed with blinding speed compared to older microprocessors
The square root of 2n is 2n / 2 or 2n >> 1, so if your number is a power of two everything is totally simple once you know the power. To multiply is even simplier: 24 * 28 is 24+8. There's no sense in this statements you've done.
I believe you may be referring to exponentiation by squaring . This technique isn't used for multiplying, but for raising to a power x^n, where n may be large. Rather than multiply x times itself N times, one performs a series of squaring and adding operations which can be mapped to the binary representation of N. The number of multiplication operations (which are more expensive than additions for large numbers) is reduced from N to log(N) with respect to the naive exponentiation algorithm.
I have it!
2 * 2
is more expensive than
2 << 1
(The caveat being it only works for one case.)
There is apparently a shortcut for squaring binary numbers: Squaring circuit for binary numbers
It doesn't give a proof, but does give a fairly technical explanation of how to square binary numbers.
On the basis that such an invention exists, I would expect that it is indeed more efficient to square a binary number than to multiple two arbitrary binary numbers. However this does not constitute a proof!
I think that you are completely wrong in your statements
Multiplying two binary numbers takes n^2 time
Multiplying two 32bit numbers take exactly one clock cycle. On a 64 bit processor, I would assume that multiplying two 64 bit numbers take exactly 1 clock cycle. It wouldn't even surprise my that a 32bit processor can multiply two 64bit numbers in 1 clock cycle.
yet squaring a number can be done more efficiently somehow.
Squaring a number is just multiplying the number with itself, so that is just a simple multiplication. There is no "square" operation in the CPU.
Maybe you are confusing "squaring" with "multiplying by a power of 2". Multiplying by 2 can be implemeted by shifting all the bits one position to the "left". Multiplying by 4 is shifting all the bits two positions to the "left". By 8, 3 positions. But this trick only applies to a power of two.
Squaring an n digit number may be faster than multiplying two random n digit numbers. Googling I found this article. It is about arbitrary precision arithmetic but it may be relevant to what your asking. In it the authors say this:
In squaring a large integer, i.e. X^2 = (xn-1, xn-2, ... , x1, x0)^2 many cross-product terms of the form xi * xj and xj * xi are equivalent. They need to be computed only once and then left shifted in order to be doubled. An n-digit squaring operation is performed using only (n^2 + n)/2 single-precision multiplications.
If you have a binary number A, it can (always, proof left to the eager reader) be expressed as (2^n + B), this can be squared as 2^2n + 2^(n+1)B + B^2. We can then repeat the expansion, until such a point that B equals zero. I haven't looked too hard at it, but intuitively, it feels as if you should be able to make a squaring function take fewer algorithmical steps than a general-purpose multiplication.
If you assume fixed length to the word size of the machine and that the number to be squared is in memory, a squaring operation requires only one load from memory, so could be faster.
For arbitrary length integers, multiplication is typically O(N²) but there are algorithms which reduce this for large integers.
If you assume the simple O(N²) approach to multiply a by b, then for each bit in a you have to shift b and add it to an accumulator if that bit is one. For each bit in a you need 3N shifts and additions.
Note that
( x - y )² = x² - 2 xy + y²
Hence
x² = ( x - y )² + 2 xy - y²
If each y is the largest power of two not greater than x, this gives a reduction to a lower square, two shifts and two additions. As N is reduced on each iteration, you may get an efficiency gain ( the symmetry means it visits each point in a triangle rather than a rectangle ), but it's still O(N²).
There may be another better symmetry to exploit.
There exist algorithms more efficient than O(N^2) to multiply two numbers (see Karatsuba, Pollard, Schönhage–Strassen, etc.)
The two problems "multiply two arbitrary N-bit numbers" and "Square an arbitrary N-bit number" have the same complexity.
We have
4*x*y = (x+y)^2 - (x-y)^2
So if squaring N-bit integers takes O(f(N)) time, then the product of two arbitrary N-bit integers can be obtained in O(f(N)) too. (that is 2x N-bit sums, 2x N-bit squares, 1x 2N-bit sum, and 1x 2N-bit shift)
And obviously we have
x^2 = x * x
So if multiplying two N-bit integers takes O(f(N)), then squaring a N-bit integer can be done in O(f(N)).
Any algorithm computing the product (resp the square) provides an algorithm to compute the square (resp the product) with the same asymptotic cost.
As noted in other answers, the algorithms used for fast multiplication can be simplified in the case of squaring. The gain will be on the constant in front of the f(N), and not on f(N) itself.
First of all great question! I wish there were more questions like this.
So it turns out that the method I came up with is O(n log n) for general multiplication in the arithmetic complexity only. You can represent any number X as
X = x_{n-1} 2^{n-1} + ... + x_1 2^1 + x_0 2^0
Y = y_{m-1} 2^{m-1} + ... + y_1 2^1 + y_0 2^0
where
x_i, y_i \in {0,1}
then
XY = sum _ {k=0} ^ m+n r_k 2^k
where
r_k = sum _ {i=0} ^ k x_i y_{k-i}
which is just a straight forward application of FFT to find the values of r_k for each k in (n +m) log( n + m) time.
Then for each r_k you must determine how big the overflow is and add it up accordingly. For squaring a number this means O(n log n) arithmetic operations.
You can add up the r_k values more efficiently using the Schönhage–Strassen algorithm to obtain a O(n log n log log n) bit operation bound.
The exact answer to your question is already posted by Eric Bainville.
However, you can get a much better bound than O(n^2) for squaring a number simply because there exist much better bounds for multiplying integers!