Let us assume we're generating very large (e.g. 128- or 256-bit) numbers to serve as keys for a block cipher.

Let us further assume that we wear tinfoil hats (at least when outside).

Being so paranoid, we want to be sure of our available entropy, but we don't entirely trust any particular source. Maybe the government is rigging our coins. Maybe these dice are ever so subtly weighted. What if the hardware interrupts feeding into /dev/random are just a little too consistent? (Besides being paranoid, we're lazy enough that we don't want to generate it all by hand...)

So, let's mix them all up.

What are the secure methods for doing this? Presumably just concatenating a few bytes from each source isn't entirely secure -- if one of the sources is biased, it might, in theory, lend itself to something like a related-key attack.

Is running SHA-256 over the concatenated bytes sufficient?
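
Concretely, here's a rough sketch (Python) of what I have in mind; the coin/dice helpers are hypothetical stand-ins for manually-entered data, not real entropy sources:

    import hashlib
    import os

    def coin_flip_bytes(n):
        # Stand-in for hand-flipped coins; os.urandom is just a placeholder.
        return os.urandom(n)

    def dice_roll_bytes(n):
        # Stand-in for hand-rolled dice.
        return os.urandom(n)

    def make_key():
        # Concatenate bytes from each source, then hash the lot down
        # to a 256-bit key.
        pool = os.urandom(32) + coin_flip_bytes(32) + dice_roll_bytes(32)
        return hashlib.sha256(pool).digest()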

(And yes, at some point soon I am going to pick up a copy of Cryptography Engineering. :))

+2  A: 

I've done this before, and my approach was just to XOR them, byte-by-byte, against each other.
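
A minimal sketch of that byte-by-byte XOR, assuming each source hands you a byte string (the helper name is mine): if any one source is uniformly random and independent of the rest, so is the result.

    def xor_combine(*sources):
        # XOR all sources together, truncating to the shortest one.
        length = min(len(s) for s in sources)
        out = bytearray(length)
        for s in sources:
            for i in range(length):
                out[i] ^= s[i]
        return bytes(out)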

Running them through some other algorithm, like SHA-256, is terribly inefficient, so it's not practical, and I think it wouldn't really be useful and could possibly be harmful.

If you do happen to be incredibly paranoid, and have a tiny bit of money, it might be fun to buy a "true" (depending on how convinced you are by Quantum Mechanics) Quantum Random Number Generator.

-- Edit:

FWIW, I think the method I describe above (or something similar) is effectively a One-Time Pad from the point of view of either source, assuming one of them is random, and therefore unattackable as long as the sources are independent, even if one of them is out to get you. I'm happy to be corrected on this if someone takes issue with it, and I encourage anyone not taking issue with it to question it anyway, and find out for yourself.

Noon Silk
While I'll buy that it might be the most secure, your approach would require generating _n_ bits from each source for an _n_-bit key, right? And could you clarify how SHA-256 would be "terribly inefficient"? Key generation is a one-time (or at least rare) event. Some information on the sort of harm possible from creating the key via a hash would be nice, too...
Nicholas Knight
Nicholas: Model your threats. Consider this: hashing is a deterministic function, so what are you hoping it will achieve? Hide what, from whom? If it's just for generating keys, then your problem is key storage. Hashing the key doesn't make any difference if someone can still see it, does it? Re the first question: I didn't notice this was for key generation. In that case, yes, performance isn't an issue, as I suppose you're not doing it often, and yes, it would require the same number of bits from each source (but you'd want that anyway, wouldn't you? Otherwise you have a component of the data at higher "risk").
Noon Silk
WRT your edit, yeah, I'm pretty sure that's the equivalent of two one-time pads (which, if biased, are hopefully biased in different ways).
Nicholas Knight
@silky: My goal is really to mask localized bias. If the hashing function is secure, it should at least increase the difficulty of a related-key attack, since any mathematical relationship would be obscured, no? As for wanting the same number of bits, well, like I said, that's probably the most secure, but 128-256 coin flips and dice rolls would take a while. I'm going for tinfoil hat, not full-body faraday cage. :)
Nicholas Knight
@Nicholas: But hashing is completely deterministic. (Well, I suppose I'm assuming you're hashing data of size equal to or less than the hash output size.) If your input size is greater than the hash output, it's of arguable value; I don't know of any analysis on that matter that I could discuss legitimately. I do see value in gathering 1kb of key data and then turning it into a 256-bit key. So if that's your plan, and you propose to do this via a hash, I support it.
Noon Silk
If you are paranoid, please remember that with this approach a clever attacker who controls one of your sources might make it duplicate another of your sources, thus making the combined RNG return all zeroes.
Rasmus Faber
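
To illustrate Rasmus Faber's point with a toy example: if a malicious source can observe and replay an honest one, the XOR cancels out entirely.

    honest = bytes.fromhex("deadbeef")
    malicious = honest                      # attacker mirrors the honest source
    combined = bytes(a ^ b for a, b in zip(honest, malicious))
    assert combined == b"\x00\x00\x00\x00"  # all zeroes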
+3  A: 

Using a hash function is a good approach - just make sure you underestimate the amount of entropy each source contributes, so that if you are right about one or more of them being less than totally random, you haven't weakened your key unduly.

This isn't dissimilar to the approach used in key stretching (though you have no need for multiple iterations here).
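
As a sketch of that advice (the 50% discount factor and the helper name are illustrative, not from any standard): credit each source with only a fraction of its claimed entropy, and refuse to emit a key until the discounted total covers the key size.

    import hashlib

    def derive_key(samples, key_bits=256):
        # samples: iterable of (data_bytes, claimed_entropy_bits) pairs.
        h = hashlib.sha256()
        credited = 0.0
        for data, claimed_bits in samples:
            h.update(data)
            credited += claimed_bits * 0.5  # trust each source for half its claim
        if credited < key_bits:
            raise ValueError("not enough (discounted) entropy collected")
        return h.digest()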

Nick Johnson
+1  A: 

If you have a source of randomness but you're not sure whether it is biased, there are a number of debiasing algorithms to choose from. Depending on how much work you want to do, the amount of entropy you waste from the original source differs.

The easiest algorithm is the (improved) von Neumann algorithm. You can find the details on page 27 of this PDF: http://security1.win.tue.nl/~bskoric/physsec/files/PhysSec_LectureNotes.pdf
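
For reference, a minimal sketch of the basic (unimproved) von Neumann extractor: read the input bits in pairs, emit the first bit of each unequal pair, and discard equal pairs. The output is unbiased provided the input bits are independent with a fixed bias.

    def von_neumann(bits):
        # "01" -> 0, "10" -> 1, "00"/"11" -> discarded.
        out = []
        for a, b in zip(bits[::2], bits[1::2]):
            if a != b:
                out.append(a)
        return out

    # e.g. von_neumann([1,1, 1,0, 0,1, 0,0, 1,0]) == [1, 0, 1]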

I also recommend reading this document if you're interested in how to produce uniform randomness from a given source, how true random number generators work, and so on!

Henri
+4  A: 

Since you mention /dev/random -- on Linux at least, /dev/random is fed by an algorithm that does very much what you're describing. It takes several variously-trusted entropy sources and mixes them into an "entropy pool" using a polynomial mixing function: each new byte of entropy that comes in is XORed into the pool, and then the entire pool is stirred.

When it's desired to get some randomness out of the pool, the entire pool is hashed with SHA-1 to get the output, then the pool is mixed again (and actually there's some more hashing, folding, and mutilating going on to make sure that reversing the process is about as hard as reversing SHA-1).

At the same time, there's a bunch of accounting going on: each time some entropy is added to the pool, an estimate of the number of bits of entropy it's worth is added to the account, and each time some bytes are extracted from the pool, that number is subtracted; the random device will block (waiting on more external entropy) if the account would go below zero. Of course, if you use the "urandom" device, the blocking doesn't happen and the pool simply keeps getting hashed and mixed to produce more bytes, which turns it into a PRNG instead of an RNG.
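
Here is a toy model of that scheme, just to make the moving parts concrete. This is not the actual Linux code: the real mixing function is a twisted-polynomial affair and the real extraction path does extra folding; class and method names are mine.

    import hashlib

    class ToyPool:
        def __init__(self, size=64):
            self.pool = bytearray(size)
            self.entropy_bits = 0

        def add(self, data, est_bits):
            # XOR the new bytes into the pool, then stir the whole pool.
            for i, byte in enumerate(data):
                self.pool[i % len(self.pool)] ^= byte
            digest = hashlib.sha256(bytes(self.pool)).digest()
            for i, byte in enumerate(digest):
                self.pool[i % len(self.pool)] ^= byte
            # Credit the estimated entropy, capped at the pool size.
            self.entropy_bits = min(self.entropy_bits + est_bits,
                                    len(self.pool) * 8)

        def extract(self, nbytes):
            # /dev/random-style accounting: refuse (block) when the
            # entropy estimate would go below zero. nbytes <= 32 here,
            # since we take the output from a single SHA-256 digest.
            if self.entropy_bits < nbytes * 8:
                raise BlockingIOError("would block: not enough entropy")
            out = hashlib.sha256(bytes(self.pool)).digest()[:nbytes]
            self.add(out, 0)  # mix the output back into the pool
            self.entropy_bits -= nbytes * 8
            return out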

Anyway... it's actually pretty interesting and pretty well commented -- you might want to study it. drivers/char/random.c in the linux-2.6 tree.

hobbs