views:

530

answers:

8

I need a very, very fast one-to-one algorithm. The algorithm doesn't need to be unbreakable. Reasonably strong is enough but it must be lightning fast. I will be implementing it in hardware. Area is a concern, too, so it shouldn't use too much logic.

It should be a function f_N(x) whose input is an N-bit number and whose output is an N-bit number. N is a constant, probably between 20-70. The function must be one-to-one. (ie invertible, meaning that decryption is possible. Decryption speed is irrelevant.)

I need to encrypt in under 3ns, which is about 333M inputs per second. DES, for example, does about 50Mbits per second. I need 333M inputs per second.

So far I've been using a Feistel cipher with about 6 rounds. That seems to take about 3ns.

Suggestions?

More notes

There have been some questions so I'll explain. I need to put keys into a hash table. The standard method is to hash the input key and use the result as an index into a table. Each row in the table must store the original key. Information theory tells us that the rows of the table don't actually need to be as wide as the input key, but rather as wide as the input key less the number of bits in the address of the table. For example:

  • input: x (N bits)
  • hash: x%128 (8 bits)
  • verifier: floor(x/128) (N-8 bits)

It would be silly on a CPU where integers are usually the same width but I'm doing it in hardware.

x%128 is an easy hash to break. In fact, if the input keys only differ in the first few bits, you will have broken the hash on accident. I want a hash that won't be broken on accident and might even be difficult to break on purpose. I also tried an LFSR. LFSRs are fast but two LFSRs of equal length generate hash results that are correlated linearly. (If f(x) and g(x) give the same hash for two different polynomials, f(x+1) and g(x+1) are easily correlated.)

So, I need a function with N-bit input and V-bit,H-bit output (V+H=N) where it is difficult to find two inputs of length N such that both will output the same H. Encryption fits the bill in that it leaves the output the same length as the input and it is difficult to reverse. Something other than encryption might work, too, though it seems like what I want is almost the very definition of encryption.

Sorry about not explaining all this up-front. Hope that this clarifies things.

+5  A: 

I am wondering if you are not concerned about the strength of the encryption then perhaps you don't need to be encrypting at all. The most important part of an encryption algorithm is is it strength. If the encryption is weak then you are not doing any good by encrypting in the first place.

Andrew Hare
That's true, I'm not really encrypting. What I need is a function that maps from N-bits to N-bits. The function must be one-to-one and it should be difficult guess two inputs that will produce outputs that differ by just one or two bits, kind of like the restriction on a hash function.
Eyal
Your statement isn't actually true. Information is sufficiently encrypted if the total loss incurred when it's broken does not outweigh the costs to break it, for now and the future. With this many inputs per second, the costs to decrypt the entire stream will be significant in any case.
MSalters
I completely disagree with Andrew. Encryption is always a trade off between effort(cpu time, key distribution, etc) and strength. Depending on the engineering problem, one might want a very strong algorithm, where cpu time is not a factor, or the very fast, but not very strong algorithm.
e5
+2  A: 

I would recommend my old friend, the Tiny Encryption Algorithm

It's both fast and extremely low on footprint, which you probably also have to consider when implementing in hardware.

sharkin
The code in Wikipedia for TEA indicates that the time for TEA is 32 times:1 32-bit addition4 additions, 2 constant shifts and 2 xorsagain 4 additions, 2 constant shifts and 2 xorsIn hardware much of this can be parallized so in all the delay would be about: 32 times (32-bit adder + 32-bit xor + 32-bit adder + 32-bit xor). Approximately 64 adders and 64 xor in serial.In 3ns there is probably time enough for 3 32-bit adders in todays hardware, about 20 times faster than what TEA can do.
Eyal
I agree with Eyal. TEA is not suitable. If the time is the main concern then one has to look for ciphers have a small number of rounds, but that can do many operations in parallel in one round.
Accipitridae
Is it bad enough that I should remove the answer?
sharkin
A: 

XOR is your way out :)

Roman
ummmm.... elaborate?
Jason S
A: 

What's wrong with using a 64-bit "key" value and xor-ing each byte? You can cycle through the key as many times as needed to xor any plaintext bits after the first 64.

A 64-bit key could be an 8-char password or an 8-byte hash of a passphrase.

Since the key has almost as many bits as the message, it will actually be rather strong and extremely fast.

S.Lott
This is essentially the same form as a Vigenere cipher, which has weaknesses. http://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher Extremely fast, yes. Extremely strong, no.
Jason S
That scheme works great for messages of 64 bits. Messages of 128 bits you are in trouble since you can xor the first 64 bits with the second 64 bits of the ciphertext and get the xor of the two plaintexts.
e5
@ethan: Correct -- but since the message is promised to be no more than 70 bits, that attack doesn't work out very well.
S.Lott
This is a terrible approach. Because there's no mixing, two inputs with the same prefix will have the same output prefix, violating one of his requirements.
Nick Johnson
Violating one of the *revised* requirements. The new problem is very little like the original problem.
S.Lott
"but since the message is promised to be no more than 70 bits, that attack doesn't work out very well." And so you're assuming there's only one message?
Jason S
@Jason: it isn't clear from the message that anyone sees the hashed message and has a chance to reverse engineer the key. It appears that "encryption" here is merely being used to flatten out the hashing in a reliable way, not actually implement secrecy.
S.Lott
+2  A: 

Here are some benchmarks with some algorithms: http://gd.tuwien.ac.at/privacy/crypto/libs/cryptlib/benchmarks.html

Note that these benchmarks are testing implementations of algorithms, so it might not be what you are looking for.

soulmerge
A: 

The following does not satisfy your requirement of a one-to-one function, but perhaps it might be of use if speed is paramount. (If it won't work, then I'd suggest a divide and conquer route: you are working in hardware, so theoretically you should be able to encrypt and decrypt in parallel, unless one input's encryption is dependent on previous inputs' encryptions.)

Just about the fastest hardware algorithm for what I would call "munging" is to treat your inputs conceptually as a bitstream, and XOR them with the bitstream output of a cryptographically secure and reconstructable bit generator -- reconstructable if you wish to decrypt it. One example which is simple and lightning-fast, but by itself is not secure, is a linear feedback shift register (LFSR). Choose a long period (2128 - 1 or 2256 - 1 or something like that). The wikipedia page suggests modifications to enhance security. You could also try occasionally (e.g. once every M bits where M = 4096, 16384, 65536, whatever) XORing the state of the LFSR with the output of a slower but more secure bitstream (either a stream cipher, or a block cipher encrypting a predetermined set of inputs, e.g. an incrementing sequence, or a delayed snapshot of the LFSR state) -- though this falls into the don't-invent-your-own cryptographic technique, the idea being that well-known cryptographic techniques have had large amounts of energy invested on testing whether they have vulnerabilities.

Jason S
+1  A: 

When you say "fast" do you care only about throughput, or is latency itself of the highest importance?

If latency is not quite as important as throughput, is there any reason why you can't use a standard Feistel cipher that is known to be secure, and instead of having the full number of rounds (e.g. like 16 in Blowfish) output from combinational logic, stick a register in between each round, so that you pipeline the encryption algorithm? It would essentially require the same amount of hardware (a little bit more to add some flip-flops for registers) as a known secure encryption algorithm, but the propagation delay would only be that of one round of the Feistel network + the propagation delay of the flip-flops.

Jason S
That's pretty much what I'm doing so far, but blowfish and friends have fixed blocksizes. Also, all the registers are expensive because expanding the pipeline expands other pipelines in the chip waiting for the results. You're right, though, it would make things much faster at the expense of registers.
Eyal
This is what I did in the end. I randomly generated a Feistel cipher with a number of rounds that just fit in my time requirements.
Eyal
A: 

If you are doing this in hardware why don't you just use a standard block cipher on a DSP?

How Well Are High-End DSPs Suited for the AES Algorithms?

Matthew Whited