views:

527

answers:

4

My understanding is that many public key cryptographic algorithms these days depend on large prime numbers to make up the keys, and it is the difficulty in factoring the product of two primes that makes the encryption hard to break. It is also my understanding that one of the reasons that factoring such large numbers is so difficult, is that the sheer size of the numbers used means that no CPU can efficiently operate on the numbers, since our minuscule 32 and 64 bit CPUs are no match for 1024, 2048 or even 4096 bit numbers. Specialized Big Integer math libraries must be used in order to process those numbers, and those libraries are inherently slow since a CPU can only hold (and process) small chunks (like 32 or 64 bits) at one time.

So...

Why can't you build a highly specialized custom chip with 2048 bit registers, and giant arithmetic circuits, much in the same way that we scaled from 8 to 16 to 32 to 64-bit CPUs, just build one a LOT larger? This chip wouldn't need most of the circuitry on conventional CPUs, after all it wouldn't need to handle things like virtual memory, multithreading or I/O. It wouldn't even need to be a general-purpose processor supporting stored instructions. Just the bare minimum to perform the necessary arithmetical calculations on ginormous numbers.

I don't know a whole lot about IC design, but I do remember learning about how logic gates work, how to build a half adder, full adder, then link together a bunch of adders to do multi-bit arithmetic. Just scale up. A lot.

Now, I'm fairly certain that there is a very good reason (or 17) that the above won't work (since otherwise one of the many people smarter than I am would have already done it) but I am interested in knowing why it won't work.

(Note: This question may need some re-working, as I'm not even sure yet if the question makes sense)

+2  A: 

Its because this speedup would be only in O(n), but the complexity of factoring the number is something like O(2^n) (with respect to the number of bits). So if you made this überprocessor and factorized the numbers 1000 times faster, I would only have to make the numbers 10 bits larger and we would be back on the start again.

cube
Actually, the complexity of factoring is not quite exponential (O(2^n)); there are sub-exponential algorithms. But it's still very slow. See http://en.wikipedia.org/wiki/Integer_factorization#Difficulty_and_complexity for all the gory math :-).
sleske
+2  A: 

As indicated above, the primary problem is simply how many possibilities you have to go through to factor a number. That being said, specialized computers do exist to do this sort of thing.

The real progress for this sort of cryptography is improvements in number factoring algorithms. Currently, the fastest known general algorithm is the general number field sieve.

Historically, we seem to be able to factor numbers twice as large each decade. Part of that is faster hardware, and part of it is simply a better understanding of mathematics and how to perform factoring.

Christopher
+3  A: 

What @cube said, and the fact that a giant arithmetic logic unit would take more time for the logic signals to stabilize, and include other complications in digital design. Digital logic design includes something that you take for granted in software, namely that signals through combinational logic take a small but nonzero time to propagate and settle. A 32x32 multiplier needs to be designed carefully. A 1024x1024 multiplier would not only take a huge amount of physical resources in a chip, but it also would be slower than a 32x32 multiplier (though perhaps faster than a 32x32 multiplier computing all the partial products needed to perform a 1024x1024 multiply). Plus it's not only the multiplier that's the bottleneck: you've got memory pathways. You'd have to spend a bunch of time gathering the 1024 bits from a memory circuit that's only 32 bits wide, and storing the resulting 2048 bits back into the memory circuit.

Almost certainly it's better to get a bunch of "conventional" 32-bit or 64-bit systems working in parallel: you get the speedup w/o the hardware design complexity.

edit: if anyone has ACM access (I don't), perhaps take a look at this paper to see what it says.

Jason S
+1  A: 

I can't comment on the feasibility of an approach exactly like the one you described, but people do similar things very frequently using FPGAs:

Mark Rushakoff