My problem is the following:

In an existing database I want to encrypt data in a couple of columns. The columns contain strings of different lengths.

I don't want to change the size of the columns, so the encryption needs to produce a text representation of the same length as the input text.

The strength of the encryption algorithm is of secondary interest but of course I want it to be as strong as it can be. Otherwise I wouldn't need to encrypt the data. But the most important thing is the size of the output.

Is this possible? If so how would I do it?

I'm interested in doing it in .NET. No database-level encryption.

A: 
Suvesh Pratapa
"as it includes some knowledge about the password too" - umm... are you sure?
DrJokepu
Oops, sorry, I meant holds more information as a result of the password.
Suvesh Pratapa
A: 

The Vigenère cipher can do that. But it is old (pre-computer) and only secure if your key phrase is longer than the longest string you want to encrypt. Plus, having a database full of strings encrypted with the same key phrase will probably make this a weak encryption, especially if the plain texts can be guessed.

It works more or less like the Caesar shift algorithm (add n to each letter in the plain text), except that n is different for each letter being changed, based on a key phrase.

If your key phrase is ABCDEFG, then n=1 for the first letter of the input, 2 for the second letter, and so on.

With a random key phrase longer than the plain text, the output is just as random (secure). But I believe this will break down if you have many strings encrypted with the same key.
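
To make the shifting rule concrete, here is a minimal Python sketch (the question asks about .NET; Python is used here only for brevity). It assumes uppercase A-Z input and the A=1, B=2 numbering used in this answer; many textbook descriptions use A=0 instead. The function name `vigenere` is made up for illustration.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: only A-Z is shifted


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter by the value of the matching key letter (A=1, B=2, ...)."""
    out = []
    for i, ch in enumerate(text):
        if ch not in ALPHABET:
            out.append(ch)  # anything outside A-Z is passed through unchanged
            continue
        # The key repeats when it is shorter than the text (which weakens it).
        shift = ALPHABET.index(key[i % len(key)]) + 1
        if decrypt:
            shift = -shift
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
    return "".join(out)


plain = "ATTACKATDAWN"
cipher = vigenere(plain, "ABCDEFG")
print(cipher)                                    # BVWEHQHUFDAS
print(vigenere(cipher, "ABCDEFG", decrypt=True))  # ATTACKATDAWN
```

The output always has the same length as the input, which is the property asked for, but the sketch does nothing about the key-reuse weakness described above.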

Daren Thomas
The Vigenère cipher is a block cipher where the block length is one character (usually 1 byte) and each block can only be encrypted to one of 26 possibilities; it also won't allow lowercase and uppercase letters, punctuation, or special characters. I would suggest an XOR stream cipher if you have a secure exchange of a key as long as the message (this is provably secure, and XOR is far more efficient for digital circuitry to handle).
ewanm89
+1  A: 

Any block cipher will do. Essentially, you input a fixed-length block and get an encrypted block of the same size back. The cipher is a permutation from {0,...,2^blocklength - 1} to {0,...,2^blocklength - 1}. (The input length has to be padded to a block length boundary.)

The problem here is that if the columns are text, you cannot necessarily place binary ciphertext in them, and you'll have to encode the data to a text format such as base64 (33% size increase).

AES is a block cipher standard that is widely available.
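
As a rough illustration of how padding and base64 affect the stored size, the following Python sketch does only the length arithmetic; no encryption happens. It assumes PKCS#7-style padding, which always adds at least one byte, and a 16-byte block as in AES. The helper names are hypothetical.

```python
import base64


def padded_length(n: int, block_size: int = 16) -> int:
    """Ciphertext length for n plaintext bytes, assuming PKCS#7-style padding."""
    # PKCS#7 always pads, so an exact multiple of the block size gains a full block.
    return (n // block_size + 1) * block_size


def base64_length(n: int) -> int:
    """Characters needed to base64-encode n bytes (including '=' padding)."""
    return 4 * ((n + 2) // 3)


for n in (5, 16, 20):
    ct = padded_length(n)
    print(n, "->", ct, "->", base64_length(ct))
# 5 -> 16 -> 24
# 16 -> 32 -> 44
# 20 -> 32 -> 44

# Cross-check the formula against the standard library encoder.
assert base64_length(32) == len(base64.b64encode(bytes(32)))
```

So a 5-character value would need 24 characters of column space under these assumptions, which is why a padded block cipher does not fit the same-size requirement by itself.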

laalto
A: 

You could use rot13 =:)

Simon P Stevens
+2  A: 

You should take a minute and think of the real problem you're trying to solve. I've seen very few instances where database encryption was really necessary, since information rarely flows directly from the database to an end user.

If you need to protect content of the database, then you should perhaps look into its standard access control mechanisms instead.

Christoffer
A: 

You might look for a tweakable block cipher. If your rows have a unique identifier (e.g. a primary key), then the unique identifier can be used as a tweak. The advantage of this kind of encryption is that you don't need any IVs to randomize the encryption. Even if a column contains the same value multiple times, this value gets encrypted differently because of the tweak.

A less secure solution is to use a block cipher in counter mode and use the unique identifier to compute the counter. But this mode has a severe disadvantage: You can't securely reencrypt fields unless you also change the unique identifier.

Since neither case randomizes the ciphertext, an attacker can observe whether a certain field has changed. This might leak some valuable information. Also note that neither case gives you any data integrity. Even if an attacker cannot decrypt information, he might still be able to change it to his advantage.
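
The re-encryption problem of the counter-mode variant can be shown with a small Python sketch. Note the substitution: to stay within the standard library, the keystream here is derived with HMAC-SHA256 over the row ID and a block counter, standing in for a block cipher in counter mode. The names `keystream` and `xor_crypt` and the sample values are made up, and this is an illustration, not something to deploy.

```python
import hashlib
import hmac


def keystream(key: bytes, row_id: int, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key, row ID and a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        block_input = row_id.to_bytes(8, "big") + counter.to_bytes(8, "big")
        out += hmac.new(key, block_input, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor_crypt(key: bytes, row_id: int, data: bytes) -> bytes:
    """Encrypt or decrypt (the same operation) by XORing with the row's keystream."""
    ks = keystream(key, row_id, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


key = b"demo key, not for production"      # hypothetical key
old = xor_crypt(key, 42, b"salary=050000")
new = xor_crypt(key, 42, b"salary=090000")  # same row, value updated

assert len(old) == len(b"salary=050000")           # same length in and out
assert xor_crypt(key, 42, old) == b"salary=050000"  # round trip

# Same row ID means same keystream, so the key cancels out of old XOR new:
leak = bytes(a ^ b for a, b in zip(old, new))
print(leak.hex())  # zero everywhere except the one byte that changed
```

Anyone who sees both versions of the field learns the XOR of the two plaintexts without knowing the key, which is why the row ID (or some other input to the counter) has to change on every re-encryption.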

Accipitridae
+1  A: 

Ideally, if the existing columns are larger than a single block in a standard block cipher (16 bytes for AES, 8 bytes for TDES), then you could encrypt in CTS (ciphertext stealing) mode. Unfortunately, .NET does not support CTS in any of its included algorithms. :-(

Normally CTS uses a random IV that would have to be stored along with the ciphertext, but you can just use the row ID or even a constant value if you don't mind identical plaintext values encrypting to identical ciphertext.

Theran
The cipher mode CTS is not implemented in .NET I'm afraid...
Christian80
@Christian80 Turns out you're right. Why Microsoft would add it to the enumeration and not bother ever implementing it is a mystery to me.
Theran
+5  A: 

Within your constraints, I would use AES in CFB mode, which turns it into a stream cipher, so the output length will be the same as the input length. Unless you're storing the strings in blobs, you'll need to hex or base64 encode the output to make it char friendly, which will be a 100% or 33% increase in length.

One .NET implementation is here.
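
To see what the encoding overhead means for an existing column, here is a small Python sketch of the arithmetic. It assumes single-byte characters and uses random bytes as a stand-in for stream-cipher output; `max_plaintext_bytes` is a hypothetical helper, not part of any library.

```python
import base64
import os


def max_plaintext_bytes(column_chars: int, encoding: str) -> int:
    """Largest ciphertext (= plaintext, for a stream cipher) that fits the column."""
    if encoding == "hex":
        return column_chars // 2        # 2 characters per byte
    if encoding == "base64":
        return (column_chars // 4) * 3  # 4 characters per 3 bytes
    raise ValueError("unknown encoding: " + encoding)


ciphertext = os.urandom(30)  # stand-in for 30 bytes of CFB output
hex_text = ciphertext.hex()
b64_text = base64.b64encode(ciphertext).decode("ascii")

print(len(ciphertext), len(hex_text), len(b64_text))  # 30 60 40
print(max_plaintext_bytes(40, "hex"))     # 20
print(max_plaintext_bytes(40, "base64"))  # 30
```

In other words, a 40-character text column can hold at most 30 bytes of encrypted data as base64, or 20 as hex, so values that currently fill the column would no longer fit.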

jkf
Out of all the numerous suggestions here, jkf is the only one who is answering the question properly. Indeed, you need a stream cipher, NOT a block cipher, and the AES stream cipher is very likely the best to choose from. The key requirement by the poster is that the data, with variable length, takes up the same amount of space before and after encryption, and you can do exactly that with a stream cipher. +10 to jkf if I could.
SPWorley
BTW, an important point, make sure that each column uses a different key to seed the encryption, to avoid any information leak by xoring neighbors together. Seeding with <key> <key+1> <key+2> and so on is fine.
SPWorley
Well, all block ciphers in CFB mode act as stream ciphers.
ewanm89
+3  A: 

Secure encryption requires that the ciphertext be larger than the plaintext; otherwise identical plaintext always results in identical ciphertext, and there's no such thing as an invalid ciphertext, which are both weaknesses.

However, if you really can't expand the data you're encrypting, the best you can do is a tweakable block mode. Look up XTS and CMC modes which are used for disk encryption.

Paul Crowley
This is misleading. Ciphertext need be no larger than plaintext with any common algorithm (exception: with block algorithms the plaintext may need to be padded to a multiple of the block length). A given plaintext (and key) must always give the same ciphertext or you wouldn't be able to decrypt. I think what you're getting at is the salting of data to prevent known plaintext attacks on the key ... but that's not how it came out.
dajames
Encryption, though, does typically give *binary* ciphertext and if you need a text representation (e.g. to put in a text column in a database) it will need to be represented in a text format, and *that* will increase the size. That's not a security consideration, though, just a consequence of the fact that some binary values are not valid text characters.
dajames