Best practice is to use unique IVs, but what counts as unique? Is it unique for each record, or absolutely unique (unique for each field too)?

If it's per field, that sounds awfully complicated. How do you manage the storage of so many IVs if you have 60 fields in each record?

A: 

Looks like the question had low views, and since I have to move on with what I'm doing, I arrived at some logical solution.

I'm going to use the same IV for all fields in a single record. I reasoned that if a separate IV is needed for each piece of data (i.e. field), storing them will take too much space. I could come up with a custom algorithm to reduce the storage needed, but then managing them becomes too much.

Hope this helps you if you have the same question.

Chris
Sorry, this is a weak solution. Since you intend not to encrypt all the fields at the same time, the IV is already known when you encrypt more fields later. There are some attacks that can exploit this.
Accipitridae
+3  A: 

I started an answer a while ago, but suffered a crash that lost what I'd put in. What I said was along the lines of:

It depends...

The key point is that if you ever reuse an IV, you open yourself up to cryptographic attacks that are easier to execute than those when you use a different IV every time. So, for every sequence where you need to start encrypting again, you need a new, unique IV.

You also need to look up cryptographic modes of operation. Wikipedia has an excellent illustration of why you should not use ECB. CTR mode can be very beneficial.

If you are encrypting each record separately, then you need to create and record one IV for the record. If you are encrypting each field separately, then you need to create and record one IV for each field. Storing the IVs can become a significant overhead, especially if you do field-level encryption.

However, you have to decide whether you need the flexibility of field level encryption. You might - it is unlikely, but there might be advantages to using a single key but different IVs for different fields. OTOH, I strongly suspect that it is overkill, not to mention stressing your IV generator (cryptographic random number generator).

If you can afford to do encryption at a page level instead of the row level (assuming rows are smaller than a page), then you may benefit from using one IV per page.


Erickson wrote:

You could do something clever like generating one random value in each record, and using a hash of the field name and the random value to produce an IV for that field.

However, I think a better approach is to store a structure in the field that collects an algorithm identifier, necessary parameters (like IV) for that algorithm, and the ciphertext. This could be stored as a little binary packet, or encoded into some text like Base-85 or Base-64.

And Chris commented:

I am indeed using CBC mode. I thought about an algorithm to do a 1:many so I can store only 1 IV per record. But now I'm considering your idea of storing the IV with the ciphertext. Can you give me some more advice: I'm using PHP + MySQL, and many of the fields are either varchar or text. I don't have much experience with binary in the database. I thought binary was database-unfriendly, so I always base64_encoded when storing binary (like the IV, for example).

To which I would add:

  • IBM DB2 LUW and Informix Dynamic Server both use a Base-64 encoded scheme for the character output of their ENCRYPT_AES() and related functions, storing the encryption scheme, IV and other information as well as the encrypted data.

  • I think you should look at CTR mode carefully - as I said before. You could create a 64-bit IV from, say, 48-bits of random data plus a 16-bit counter. You could use the counter part as an index into the record (probably in 16 byte chunks - one crypto block for AES).

  • I'm not familiar with how MySQL stores data at the disk level. However, it is perfectly possible to encrypt the entire record including the representation of NULL (absence of) values.

  • If you use a single IV for a record, but use a separate CBC encryption for each field, then each field has to be padded to 16 bytes, and you are definitely indulging in 'IV reuse'. I think this is cryptographically unsound. You would be much better off using a single IV for the entire record and either one unit of padding for the record and CBC mode or no padding and CTR mode (since CTR does not require padding - one of its merits; another is that you only use the encryption mode of the cipher for both encrypting and decrypting the data).
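The counter layout suggested in the CTR bullet above might be sketched as follows. This is only an illustration under stated assumptions: the helper name `counter_values` and the field layout are hypothetical, and how the 64-bit value is extended to a full 128-bit AES counter block is left open, since the answer does not specify it.

```python
import os

BLOCK_SIZE = 16  # AES block size in bytes; one counter value per block


def counter_values(prefix: bytes, start_block: int, n_blocks: int) -> list:
    """Return n_blocks 64-bit values: 48-bit random prefix + 16-bit block index.

    Hypothetical helper, not part of any library.
    """
    if len(prefix) != 6:
        raise ValueError("prefix must be 48 bits (6 bytes)")
    if start_block < 0 or start_block + n_blocks > 1 << 16:
        raise ValueError("block index does not fit in 16 bits")
    return [prefix + (start_block + i).to_bytes(2, "big") for i in range(n_blocks)]


record_prefix = os.urandom(6)  # one random 48-bit value stored per record

# Assumed layout: field A occupies blocks 0-1 of the record, field B blocks 2-4.
field_a = counter_values(record_prefix, start_block=0, n_blocks=2)
field_b = counter_values(record_prefix, start_block=2, n_blocks=3)

# The two fields draw from non-overlapping counter ranges.
assert not set(field_a) & set(field_b)
```

Because each field is tied to a fixed block range inside the record, a field added later can still be encrypted without reusing a counter value, provided its range was reserved up front and the field is never re-encrypted under the same prefix.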

Jonathan Leffler
Thanks for the answer. I have to do field-level because not all fields will be available when the record is being created. Some values/fields will be added later as they become available.
Chris
Also thanks for retyping that answer despite the crash, I appreciate it :)
Chris
+2  A: 

The requirements for IV uniqueness depend on the "mode" in which the cipher is used.

For CBC, the IV should be unpredictable for a given message.

For CTR, the IV has to be unique, period.

For ECB, of course, there is no IV. If a field is a short, random identifier that fits in a single block, you can use ECB securely.

I think a good approach is to store a structure in the field that collects an algorithm identifier, necessary parameters (like IV) for that algorithm, and the ciphertext. This could be stored as a little binary packet, or encoded into some text like Base-85 or Base-64.
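A minimal sketch of such a packet, assuming a made-up layout (1-byte algorithm identifier, 1-byte IV length, IV, ciphertext) encoded as Base-64 so it fits in a varchar or text column. The names `pack_field`, `unpack_field` and `ALG_AES_128_CBC` are hypothetical, and the ciphertext is a placeholder rather than the output of a real cipher.

```python
import base64
import os
import struct
from typing import Tuple

ALG_AES_128_CBC = 1  # hypothetical algorithm identifier


def pack_field(alg_id: int, iv: bytes, ciphertext: bytes) -> str:
    """Serialize (algorithm id, IV, ciphertext) into one Base-64 string."""
    # Header: 1-byte algorithm id, 1-byte IV length, then IV and ciphertext.
    header = struct.pack("BB", alg_id, len(iv))
    return base64.b64encode(header + iv + ciphertext).decode("ascii")


def unpack_field(text: str) -> Tuple[int, bytes, bytes]:
    """Inverse of pack_field: recover the algorithm id, IV and ciphertext."""
    raw = base64.b64decode(text)
    alg_id, iv_len = struct.unpack("BB", raw[:2])
    iv = raw[2:2 + iv_len]
    ciphertext = raw[2 + iv_len:]
    return alg_id, iv, ciphertext


iv = os.urandom(16)        # fresh random IV for this field
ciphertext = b"\x00" * 32  # placeholder; real ciphertext comes from the cipher
stored = pack_field(ALG_AES_128_CBC, iv, ciphertext)
assert unpack_field(stored) == (ALG_AES_128_CBC, iv, ciphertext)
```

Keeping the IV next to its ciphertext means no separate IV columns are needed, at the cost of roughly a third more storage from the Base-64 encoding.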

erickson
Hi erickson, I am indeed using CBC mode. I thought about an algorithm to do a 1:many so I can store only 1 IV per record. But now I'm considering your idea of storing the IVs with their ciphertext. Can you give me some more advice: I'm using PHP + MySQL, and many of the fields are either varchar or text. I don't have much experience with binary in the database. I thought binary was database-unfriendly, so I always base64_encoded when storing binary (like the IV, for example). Thanks.
Chris
For CTR mode the counters should have non-overlapping ranges. Choosing unique IVs is not enough.
abc
You might want to rethink the hash-based proposal. The OP does not intend to encrypt all fields at the same time. Once the random value is known, it is possible to derive all IVs. If someone encrypts some values first and more later, this means that the IVs are predictable.
Accipitridae
It doesn't matter if the IVs are known. Timestamps can be a good IV for CBC mode. The concern is whether the IV is a function of the message. Again, you've penalized me for your own pedantry and misinterpretation.
erickson
Timestamps are predictable, and hence not a good IV for CBC mode. Your proposal leads to predictable IVs and hence is also not a good proposal. It might have been an acceptable solution if the OP encrypted all the fields at the same time, but he clearly states that he does not intend to do so. I have not misread anything. You designed your own scheme, and it turned out to be flawed. For that you certainly deserved a down vote.
Accipitridae
You don't know what you are talking about. Cite an authority that says timestamps are a poor choice for CBC IVs.
erickson
No. You cite a competent authority that accepts timestamps as IVs for CBC. NIST doesn't. Btw, your hashing proposal might be acceptable if you explicitly encrypted the random value that you use to generate the IVs, or if you used the hash results as a nonce and encrypted them to get the IV. Of course, one will still have to ensure that each IV is only used once.
Accipitridae
*Applied Cryptography*, 2nd. ed. p. 194. Furthermore, NIST (http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf) says, "In particular, *for any given plaintext,* it must not be possible to predict the IV that will be associated to the plaintext in advance of the generation of the IV." Predictability refers to predictability given the plaintext. I can predict the IV chosen from a PRNG, given the state of the generator. That isn't what "unpredictable" means in this context.
erickson
That's not a valid argument. Your references only support the first two sentences of your answer, but not the contested claims. In particular, NIST says nothing about timestamps and they say nothing about your self-made hashing scheme. That's not surprising since your construction is weak, exactly because it violates the requirement that you cite. Once some fields in a record are encrypted it is possible to predict further IV for other fields in advance. This is a flaw in your scheme and should be corrected.
Accipitridae
The references do support my position. Quoting Schneier on CBC initialization vectors: "A timestamp often makes a good IV. Otherwise, use some random bits from someplace." The NIST reference explains very clearly the qualification of the term "unpredictable." If they intended IVs to be absolutely unpredictable, without qualification, only a truly random process like nuclear decay could produce a secure IV. Instead, they specify that the IV cannot be predictable *given the plaintext.* This leads to unique ciphertexts, preventing replay and code-book building.
erickson
"Applied crypto" isn't a reliable reference in the first place. There has also been some development since "Applied crypto" was published, making Schneier's claim irrelevant. I've added more explanations to my answer. If there are any further questions, please ask them there.
Accipitridae
+2  A: 

Once again, appendix C of NIST pub 800-38 might be helpful. E.g., according to this you could generate an IV for the CBC mode simply by encrypting a unique nonce with your encryption key. Even simpler: if you use OFB, the IV just needs to be unique.


There is some confusion about what the real requirements are for good IVs in the CBC mode. Therefore, I think it is helpful to look briefly at some of the reasons behind these requirements.

Let's start with reviewing why IVs are even necessary. IVs randomize the ciphertext. If the same message is encrypted twice with the same key (but different IVs), then the ciphertexts are distinct. An attacker who is given two (equally long) ciphertexts should not be able to determine whether the two ciphertexts encrypt the same plaintext or two different plaintexts. This property is usually called ciphertext indistinguishability. Obviously this is an important property for encrypting databases, where many short messages are encrypted.

Next, let's look at what can go wrong if the IVs are predictable. Let's for example take erickson's proposal:

"You could do something clever like generating one random value in each record, and using a hash of the field name and the random value to produce an IV for that field."

This is not secure. For simplicity assume that a user Alice has a record in which there exist only two possible values m1 or m2 for a field F. Let Ra be the random value that was used to encrypt Alice's record. Then the ciphertext for the field F would be

EK(hash(F || Ra) xor m).

The random Ra is also stored in the record, since otherwise it wouldn't be possible to decrypt. An attacker Eve, who would like to learn the value of Alice's record can proceed as follows: First, she finds an existing record where she can add a value chosen by her. Let Re be the random value used for this record and let F' be the field for which Eve can submit her own value v. Since the record already exists, it is possible to predict the IV for the field F', i.e. it is

hash(F' || Re).

Eve can exploit this by selecting her value v as

v = hash(F' || Re) xor hash(F || Ra) xor m1,

let the database encrypt this value, which is

EK(hash(F || Ra) xor m1)

and then compare the result with Alice's record. If the two results match, then she knows that m1 was the value stored in Alice's record; otherwise it will be m2. You can find variants of this attack by searching for "block-wise adaptive chosen plaintext attack" (e.g. this paper). There is even a variant that worked against TLS.
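The attack can be walked through in a few lines of Python. This is a sketch under loud assumptions: `E` is a stand-in keyed function built from HMAC-SHA256 (the standard library has no AES, and the attack only needs E_K to be deterministic), messages are exactly one block long, and all function names are made up for illustration.

```python
import hashlib
import hmac
import os

BLOCK = 16


def E(key: bytes, block: bytes) -> bytes:
    """Stand-in for one block-cipher encryption E_K (NOT a real cipher)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def derive_iv(field_name: bytes, record_random: bytes) -> bytes:
    """IV = hash(field name || per-record random), truncated to one block."""
    return hashlib.sha256(field_name + record_random).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_field(key, field_name, record_random, message):
    """First CBC block of a one-block message: E_K(IV xor m)."""
    return E(key, xor(derive_iv(field_name, record_random), message))


def eve_confirms_guess(db_encrypt, Ra, Re, alice_ct, guess) -> bool:
    """Eve submits v = hash(F'||Re) xor hash(F||Ra) xor guess and compares."""
    v = xor(xor(derive_iv(b"F'", Re), derive_iv(b"F", Ra)), guess)
    return db_encrypt(b"F'", Re, v) == alice_ct


key = os.urandom(16)  # known only to the database, never to Eve
m1 = b"yes".ljust(BLOCK, b"\x00")
m2 = b"no".ljust(BLOCK, b"\x00")

Ra = os.urandom(16)  # stored in the clear in Alice's record
alice_ct = encrypt_field(key, b"F", Ra, m1)

Re = os.urandom(16)  # stored in the clear in the record Eve can write to


def db_encrypt(field, rnd, value):
    """The database encrypting a value Eve submitted to her own record."""
    return encrypt_field(key, field, rnd, value)


assert eve_confirms_guess(db_encrypt, Ra, Re, alice_ct, m1)      # correct guess
assert not eve_confirms_guess(db_encrypt, Ra, Re, alice_ct, m2)  # wrong guess
```

Eve never touches the key. She only needs the two stored random values and the ability to have one chosen value encrypted in a record whose IV she can predict.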

The attack can be prevented, possibly by encrypting the random value before putting it into the record, or by deriving the IV by encrypting the hash result. But again, probably the simplest thing to do is what NIST already proposes: generate a unique nonce for every field that you encrypt (this could simply be a counter), encrypt the nonce with your encryption key, and use the result as the IV.

Also note that the attack above is a chosen plaintext attack. Even more damaging attacks are possible if the attacker can mount chosen ciphertext attacks, i.e. if she can modify your database. Since I don't know how your databases are protected, it is hard to make any claims there.
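The nonce-based derivation could look roughly like this. Again a sketch only: `E` is an HMAC-based stand-in for the block cipher's forward function (a real implementation would call AES with the encryption key), and `IvGenerator` is a hypothetical name. The counter would have to be persisted so that it never repeats.

```python
import hashlib
import hmac

BLOCK = 16


def E(key: bytes, block: bytes) -> bytes:
    """Stand-in for the block cipher's forward function (NOT real AES)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


class IvGenerator:
    """Derives each IV by encrypting a never-repeating counter (the nonce)."""

    def __init__(self, key: bytes, start: int = 0):
        self._key = key
        self._counter = start  # assumption: persisted elsewhere between runs

    def next_iv(self) -> bytes:
        nonce = self._counter.to_bytes(BLOCK, "big")
        self._counter += 1
        # Without the key, the output cannot be predicted from the counter.
        return E(self._key, nonce)


gen = IvGenerator(b"\x01" * 16)
iv_field_1 = gen.next_iv()
iv_field_2 = gen.next_iv()
assert iv_field_1 != iv_field_2
```

The resulting IV is stored with the ciphertext as usual. An attacker who sees earlier IVs, or even knows the counter, still cannot predict the next one without the key.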

Accipitridae
Good exposition. Thanks for taking the time to provide an explanation.
erickson