We have a requirement from a customer that all data containing personal information must be encrypted, so that anyone who gains access to the database and runs select queries cannot see anything in clear text. This isn't a problem for Strings, but what about byte arrays, which can potentially be quite large (several hundred MB)?
When you select such a column, you get gibberish anyway. Is it possible for an attacker to read the raw bytes and extract the sensitive information without knowing the structure of the object they are mapped to?
If that is the case, then I guess we should encrypt those bytes too, even though they can be quite large. (I am guessing that adding encryption will make them even bigger.)
views: 42
answers: 2
This seems to be an approach that will give you little additional security for a large amount of effort, not to mention the extra headaches of debugging queries using encrypted data!
If protecting the data in the database is the goal, I recommend encrypting the database as a whole, and using authentication and access control to ensure data is provided only to your program and never to unauthorized parties. If the database falls into the wrong hands under this system, the evil scoundrels will have to figure out the username/password or other credentials used to authenticate a legitimate user in order to gain access. Typically this means either a brute-force search or reverse engineering your code (if the credentials are stored in your program, which is not such a good idea).
If you encrypt all the data in the database on a row-by-row level, so that it comes to your program encrypted, it must still be decrypted by your program. The secret key can be found by reverse engineering your code.
So, I hope you see, encrypting each returned result set will be complex to implement, yet is no more secure than using readily available solutions (database file encryption and authentication/access control).
EDIT: I've written this with a local database in mind, since the OP talks about the attacker getting hold of the database. On the other hand, if you are using a remote database server that is physically secure, a connection with transport layer encryption, e.g. TLS (the layer that HTTPS is built on), will give you what you want. An attacker in the middle will not be able to make any sense of the data going between your program and the database. It's also transparent: your data access code does not need to change at all.
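First of all, encryption won't normally increase size by any meaningful amount. The ciphertext is padded to the next multiple of the encryption algorithm's block size (e.g., a 128-bit boundary), and depending on the mode you may also store a few extra bytes such as an initialization vector or authentication tag. That overhead is fixed, so it is negligible for a blob of several hundred MB.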
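To put numbers on that, here is a rough sketch of the size arithmetic. It assumes AES in CBC mode with PKCS#7 padding and a 16-byte IV stored next to the ciphertext; those choices are mine for illustration and are not something the question specifies. The function names are hypothetical.

```python
BLOCK_SIZE = 16  # AES block size in bytes (128 bits); assumed cipher


def pkcs7_padded_length(plaintext_len: int, block_size: int = BLOCK_SIZE) -> int:
    """Length after PKCS#7 padding, which always adds 1..block_size bytes."""
    return (plaintext_len // block_size + 1) * block_size


def cbc_stored_length(plaintext_len: int, iv_len: int = 16) -> int:
    """Bytes stored if the IV is kept alongside the padded ciphertext."""
    return iv_len + pkcs7_padded_length(plaintext_len)


if __name__ == "__main__":
    size = 100 * 1024 * 1024  # a 100 MiB blob
    stored = cbc_stored_length(size)
    print(f"plaintext: {size} bytes, stored: {stored} bytes, "
          f"overhead: {stored - size} bytes")
```

Under these assumptions the overhead never exceeds 32 bytes per value, regardless of how large the blob is. Other modes (for example an authenticated mode with a nonce and tag) have a different but similarly small constant overhead.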
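Second, yes, if the data is left in the clear, an attacker can probably make sense of at least quite a bit of it fairly quickly. Serialized objects usually store text fields as plain readable bytes, so names, addresses and the like can be picked out without knowing the object's structure.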
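As a rough illustration of how little effort that takes, the sketch below pulls runs of printable ASCII out of an arbitrary byte blob, much like the Unix `strings` tool does. The sample blob is made up: a few binary header bytes followed by two length-prefixed text fields, standing in for whatever serialized object is actually in your column.

```python
import re


def printable_runs(blob: bytes, min_len: int = 4) -> list:
    """Return every run of at least min_len printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return pattern.findall(blob)


# Hypothetical serialized record: binary header, then length-prefixed fields.
sample_blob = (
    b"\xac\xed\x00\x05"
    + b"\x00\x08Jane Doe"
    + b"\x00\x14jane.doe@example.com"
    + b"\x01\x02\x03"
)

if __name__ == "__main__":
    for run in printable_runs(sample_blob):
        print(run.decode("ascii"))
```

The attacker needs no knowledge of the class the bytes map to; the text fields simply fall out. Compressed or otherwise encoded formats would take more work, but recognizable file headers usually reveal which decoder to try.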
Third, the big problem with all of this (as with most cryptography) is key storage and distribution. At some point you have to decrypt the data, and often the easiest form of attack is to find a way to retrieve the key. Your two main choices are to require the user to enter a key, or to keep the key in some sort of protected storage (e.g., a smart card) and use smart card readers on the client computers.
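For the first option (the user enters the key), one common pattern is to derive the encryption key from a passphrase rather than store it anywhere. The following is only a sketch using PBKDF2 from Python's standard library; the iteration count, salt length and function names are my assumptions, not a recommendation tuned for your system.

```python
import hashlib
import os


def new_salt(length: int = 16) -> bytes:
    """Random salt; it is not secret and can be stored with the data."""
    return os.urandom(length)


def derive_key(passphrase: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key from a user-supplied passphrase with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


if __name__ == "__main__":
    salt = new_salt()
    key = derive_key("correct horse battery staple", salt)
    print(f"derived a {len(key)}-byte key; salt = {salt.hex()}")
```

With this approach nothing in the program or the database is enough to decrypt the data on its own: the attacker also needs the passphrase. The trade-off is that the user has to be present to enter it, which does not suit unattended services.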
Depending on the database you're using, it may be able to handle a lot of this for you. A fair number have some sort of row-level or even column-level encryption to help comply with privacy requirements (e.g. HIPAA in the US).