views:

309

answers:

3

I am developing a new web application using .NET 3.5 and SQL Server 2008 that will need to store some Social Security Numbers. I've been doing some initial reading on database encryption and it's a little bit confusing.

I would be nice to encrypt the SSNs using an asymmetric key, since that way the public facing application wouldn't be able to retrieve any of the data once it has been encrypted. I was figuring that only the admin interface would be able to decrypt and display the data. But it sounds like SQL Server only protects data using a symmetric key?

So, what is the best way to encrypt SSNs in SQL Server 2008? Bonus points if you link to a good tutorial or two.

+1  A: 

If you encrypt data then you must ask yourself who will decrypt it. If you use an asymmetric encryption system (e.g. RSA) then encryption uses the public key and decryption uses the corresponding private key; "asymmetry" comes from the fact that the private key cannot be recomputed from the public key (although both keys are mathematically linked together).

Asymmetric encryption tends to have overhead. A first remark is that such encryption must have some random part in it, because everybody can encrypt (the public key is, yeah, public): if the encryption process is deterministic, then anybody can encrypt all possible SSNs (there are less than a billion of them, which is a really small number for a modern computer) and match the encrypted values. Hence, there must be some random added during the encryption, and the encrypted SSN is larger than the plaintext SSN.

Known asymmetric encryption systems use mathematical structures which have their own cost. Basically, for the RSA encryption system, with a "strong enough" key, an encrypted message will have length at least 128 bytes. Some encryption systems do better; while sticking to the well-trodden paths of academic research, I could do it in 41 bytes or so (with El-Gamal over the NIST K-163 elliptic curve). Smaller seems harder.

So it is no wonder that a given database system would not include such a feature by default.

For your problem, you should first define (and write), as clearly as you can:

  • what is the data which you want to protect
  • who inputs the data
  • who is supposed to read the data back

and then only should you ask yourself whether encryption is a proper tool for that. Encryption is good when the envisioned attacker can get his hands on the raw, stored data. This means that the attacker has bypassed the operating system protections. At that point, whatever the operating system knows, the attacker also knows. If the database is hosted on a machine and there is an interface through which the decrypted SSN can be obtained, then that machine "knows" how to obtain the data, and so does the attacker... On the other hand, if the host machine operating system is considered resilient enough, then encryption does not seem to be needed at all.

Symmetric encryption on the database may address a weaker problem, in which the attacker gets a copy of the harddisk afterwards. The host system knows the symmetric encryption key, but it knows it in RAM only. An attacker stealing the harddisk will not have that key.

Thomas Pornin
To address the second paragraph, salting the SSNs sounds like a good idea. As for the 5th paragraph: The data I want to protect is the SSN. The users will input the data, but not retrieve it. Only administrators using a different application will be able to read the data back. As for your second to last paragraph, there are all sorts of security vulnerabilities that pop up with systems that don't necessarily require the attacker to have physical access to the machine. In either case, the private key would not be stored on the same server.
Shea Daniels
A: 

If you must store SSNs and you want them to be encrypted I recommend using a symmetric key encryption mechanism like 3DES or AES. Have the encryption key be a derivation of some pass phrase that only those authorized to access the data know and that they must input each time they access the data.

Ex: (10+ Character Pass Phrase) -> SHA-1 => KEY.

Don't bother relying on the database itself to do the encryption (although certainly look into features like TDE or whatever host OS you run support for full-disk or file encryption as a secondary over-all security mechanism), rather use the built in crypto libraries of .NET and whatever programming language you are using to read and write to the DB.

This gives you the advantage that you don't have to store a public and private key or generate those keys (which is computationally expensive) and they are relatively large so storage is more expensive (relatively), you also don't have to worry about the key being compromised when an unauthorized user gains access to the machine running your code (excluding MITM attacks occurring when a user is entering the pass phrase). Secondly, it ensures that when accessed the only way they can be decrypted is by a user who is authorized (knows the password). Depending on your budget (time, effort, resources) you can add multi-factor authentication where the encryption key is derived from both a pass phrase but also some token that authorized users would have like a smart card. Thirdly, encrypting and decrypting the data using a symmetric key encryption algorithm will be much much faster.

Harley Green
With a symmetric key don't you need the passphrase in order to encrypt the data? It would make me more comfortable not having anything that could be used to decrypt the data on the public side of things.
Shea Daniels
I assumed you would not have an API or mechanism to decrypt that data except from the controlled side when an authorized user is accessing your system.
Harley Green
It depends on what you mean by authorized user. All of this stuff should already be protected by the normal stored proc access controls, for instance. I want an extra level of security.
Shea Daniels
A: 

You really don't want to use asymmetric encryption because it's very slow. Rather you'll want to protect a symmetric key with an asymmetric key, and then keep the symmetric key handy. To be honest, I would stick with what's in SQL Server rather than designing things yourself. A really good start is here http://dotnetslackers.com/articles/sql/IntroductionToSQLServerEncryptionAndSymmetricKeyEncryptionTutorial.aspx

Michael Howard-MSFT
I think I understand the reason for doing that if you have a large amount of data to encrypt. I'm assuming that each piece of data would have it's own symmetric key that would then be encrypted by the shared public key, right? But in this case wouldn't the SSNs actually be smaller than the symmetric keys?
Shea Daniels