views:

63

answers:

4

Say I have a scheme that derives a key from N different inputs. Each of the inputs may not be completely secure (f.x. bad passwords) but in combination they are secure. The simple way to do this is to concatenate all of the inputs in order and use a hash as a result.

Now I want to allow key-derivation (or rather key-decryption) given only N-1 out of the N inputs. A simple way to do this is to generate a random key K, generate N temporary keys out of different N subsets of the input, each with one input missing (i.e. Hash(input_{1}, ..., input_{N-1}), Hash(input_{0}, input_{2}, ..., input_{N-1}), Hash(input_{0}, input_{1}, input_{3},..., input_{N-1}), ..., Hash(input_{0}, ..., input_{N-2})) then encrypt K with each of the N keys and store all of the results.

Now I want to a generalized solution, where I can decrypt the key using K out of N inputs. The naive way to expand the scheme above requires storing (N choose N-K) values, which quickly becomes infeasible.

Is there a good algorithm for this, that does not entail this much storage?

I have thought about a way to use something like Shamir's Secret Sharing Scheme, but cannot think of a good way, since the inputs are fixed.

A: 

One approach would be to generate a purely random key (or by hashing all of the inputs, if you want to avoid an RNG for some reason), split it using a k-of-n threshold scheme, and encrypt each share using the individual password inputs (eg send them through PBKDF2 with 100000 iterations and then encrypt/MAC with AES-CTR/HMAC). This would require less storage than storing hash subsets; roughly N * (share size + salt size + MAC size)

Jack Lloyd
I had considered something like this. But consider an extreme case: N=256 with each input only containing one bit. Then the concatenate-and-hash scheme has 256-bits security, and I would expect a 254/256 treshold scheme to have 254-bits security. But I can break this scheme by a divide-and-conquer approach: first try the two possible values for the first input, then the two possible values for the second input etc. and thus break the system using at most 2*254*100000 iterations of PBKDF2.
Rasmus Faber
(But perhaps the system can be salvaged by not including MACs? It would require that the components of the split-key are indistinguishable from random data, but that should be true AFAIK).
Rasmus Faber
This possibility definitely makes the problem more interesting. I assume for each input you can attribute it or in some way order them, and know which ones you have and which ones are missing? This is implicit in your hash solution, but wanted to confirm this is possible in your design.
Jack Lloyd
A: 

Rather than simply allowing a few errors out of a large number of inputs, you should divide the inputs up into groups and allow some number of errors in each group. If you were to allow 4 errors out of 64 inputs then you would have to have 15,249,024 encrypted keys, but if you break that up into two groups of 32, allowing two errors per group then you would only need to have 1984 encrypted keys.

Once you have decrypted the key information from each group then use that as input into decrypting key that you ultimately want.

Also, the keys acquired from each group must not be trivial in comparison to the key that you ultimately want. Do not simply break up a 256 bit key into 8 32bit key pieces. Doing this would allow someone that could decrypt 7 of those key pieces to attempt a bruteforce attack on the last piece. if you want access to a 256 bit key, then you must work with 256 bit keys for the whole procedure.

Lunatic Experimentalist
+1  A: 

Error Correcting Codes are the most direct way to deal with the problem. They are not, however, particularly easy to implement.

The best approach would be using a Reed Solomon Code. When you derive the password for the first time you also calculate the redundancy required by the code and store it. When you want to recalculate the key you use the redundancy to correct the wrong or missing inputs.

Giacomo Verticale
A: 

To encrypt / create:

Take the N inputs. Turn each into a block in a good /secure way. Use Reed Solomon to generate M redundancy blocks from the N block combination. You now have N+M blocks, of which you need only a total of N to generate the original N blocks.

Use the N blocks to encrypt or create a secure key.

If the first, store the encrypted key and the M redundancy blocks. If the second, store only the M redundancy blocks.

To decrypt / retrieve:

Take N - R correct input blocks, where R =< M. Combine them with the redundancy blocks you stored to create the original N blocks. Use the original N blocks to decrypt or create the secure key.

(Thanks to http://stackoverflow.com/users/492020/giacomo-verticale : This is essentially what he/she said, but I think a little more explicit / clearer.)

Slartibartfast