Any decent encryption algorithm has as one of its core features the fact that it's impossible to deduce anything about the plaintext just by looking at the encrypted text. If you were able to tell, just by looking at the encrypted text, that the plaintext contained the string william
, any attackers would be able to get that information just as easily, and you may as well not be encrypting at all.
The only way to perform this kind of operation on the data is to have access to the decrypted data. Using the model you've described - where the database only ever sees the encrypted data - it's not possible for the database to do this work, as the database has no access to the data it needs.
You need to have the data you're wanting to search on decrypted. The only complete way to do this is to have the application pull all the data out of the database, decrypt it, then do the filtering/sorting/whatever in your application. Obviously this is not going to scale well - but that's surely something you took into consideration when you decided to encrypt the data before putting it in the database.
Another option would be to store fragments of the data unencrypted. For example, if you have a first_name
field and you want to be able to retrieve all records where first_name
begins with a
, have a first_name_first_letter
field. Obviously this isn't going to scale well either - if you want to search for all records where first_name
contains ill
, you're going to have to store the complete first_name
unencrypted.
There's a more serious problem with this solution though: by storing unencrypted data, you're leaking information about the encrypted data. The more unencrypted data you store, the more you leak. The more you leak, the more clues you're leaving for an attacker to defeat your encryption - plus, if you've stored the bit they were interested in unencrypted, they've already won.