Hi,
I want to encrypt a string and embed it in a URL, so I want to make sure the encrypted output isn't bigger than the input.
Is AES the way to go?
Hi,
I want to encrypt a string and embed it in a URL, so I want to make sure the encrypted output isn't bigger than the input.
Is AES the way to go?
It's impossible to create any algorithm which will always create a smaller output than the input, but can reverse any output back to the input. If you allow "no bigger than the input" then basically you're just talking isomorphic algorithms where they're always the same size as the input. This is due to the pigeonhole principle.
Added to that, encryption usually has a little bit of padding (e.g. "to the nearest 8 bytes, rounded up" - in AES, that's 16 bytes). Oh, and on top of that you're got the issue of converting between text and binary. Encryption algorithms usually work in binary, but URLs are in text. Even if you assume ASCII, you could end up with an encrypted binary value which isn't ASCII. The simplest way of representing arbitrary binary data in text is to use base64. There are other alternatives which would be highly fiddly, but the general "convert text to binary, encrypt, convert binary to text" pattern is the simplest one.
Simple answer is no.
Any symmetric encryption algorithm ( AES included ) will produce an output of at minimum the same but often slightly larger. As Jon Skeet points out, usually because of padding or alignment.
Of course you could compress your string using zlib and encrypt but you'd need to decompress after decrypting.
Disclaimer - compressing the string with zlib will not guarantee it comes out smaller though
If we are talking about symetric encription to obtain the original encrypted string from a cyphered one it is not possible. I think that unless you use hashes (SHA1, SHA256...) you will never obtain a cyphered string smaller than the original text. The problem with hashes is that they are not the solution for retrieving the original string because they are one way encryption algorithms.
When using AES, the output data will be rounded up to have a specific length (e.g a length divisible trough 16).
If you want to transfer secret data to another website, a HTTP post may do better than embedding the data into the URL.
What matters is not really the cipher that you use, but the encryption mode that you use. For example the CTR mode has no length expansion, but every encryption needs a new distinct starting point for the counter. Other modes like OFB, CFB (or CBC with ciphertext stealing) also don't need to be padded to a multiple of the block length of the cipher, but they need an IV. It is unclear from your question if there is some information available from which an IV could be derived pseudorandomly an if any of these modes would be appropriate. It is also unclear if you need authentication, or if you need semantic security> i.e. is it a problem if you encrypt the same string twice and you get the same ciphertext twice?
Also just another thing to clarify:
Not only is it true that symmetric encryption algorithms produce an output that is at least as large as the input, the same is true of asymmetric encryption.
"Asymmetric encryption" and "cryptographic hashes" are two different things.
Asymmetric encryption (e.g. RSA) means that given the output (i.e. the ciphertext), you can get the input (i.e. the plaintext) back if you have the right key, it's just that decrypting requires a different key than the key used for encrypting. For asymmetric encryption, the same "pigeonhole principle" argument applies.
Cryptographic hashes (e.g. SHA-1) mean that given the output (i.e. the hash) you can't get the input back, and you can't even find a different input that hashes to the same value (assuming the hash is secure). For cryptographic hashes, the hash can be shorter than the input. (In fact the hash is the same size regardless of the length of the input.
And also one more thing: In any secure encryption system the ciphertext will be longer than the plaintext. This is because there are multiple possible ciphertexts that any given plaintext could encrypt to (e.g. using different IVs.) If this were not the case then the cipher would leak information because if two identical plaintexts were encrypted, they would encrypt to identical ciphertexts, and an adversary would then know that the plaintexts were the same.