views:

211

answers:

6

I would like to know the size of data after AES encryption so that I can avoid buffering my post-AES data(on disk or memory) mainly for knowing the size.

I use 128 bit AES and javax.crypto.Cipher and javax.crypto.CipherInputStream for encryption.

A few tests performed with various input sizes show that, the post encryption size calculated as below is correct:

long size = input_Size_In_Bytes; long post_AES_Size = size + (16 - (size % 16));

But I am not sure whether the above formula is applicable for all possible input sizes.

Is there a way to calculate the size of data after applying AES encryption – in advance without having to buffer the encrypted data(on disk or memory) to know its post-encryption size?

Thanks.

+2  A: 

The AES cipher always works on 16-byte (128-bit) blocks. If the number of input bytes is not an exact multiple of 16, it is padded. That's why 16 appears to be the "magic number" in your calculation. What you have should work for all input sizes.

In silico
+8  A: 

AES, as a block cipher, does not change the size. The input size is always the output size.

But AES, being a block cipher, requires the input to be multiple of block size (16 bytes). For this, padding schemes are used like the popular PKCS5. So the answer is that the size of your encrypted data depends on the padding scheme used. But at the same time all known padding schemes will round up to the next module 16 size (size AES has a 16 bytes block size).

Remus Rusanu
+1  A: 

AES works in 128-bit (16 bytes) blocks and converts cleartext blocks into cyphertext blocks of the same length. It pads the last block if it is shorter than 16 bytes, so your formula looks correct.

wRAR
+3  A: 

It depends on the mode in which you use AES. What you have is accurate for most of the block oriented modes, such as ECB and CBC. OTOH, in CFB mode (for one example) you're basically just using AES to produce a stream of bytes, which you XOR with bytes of the input. In this case, the size of the output can remain the size of the input rather than being rounded up to the next block size as you've given above.

Jerry Coffin
+1  A: 

AES has a fixed block size of 16-bytes regardless key size. Assuming you use PKCS 5/7 padding, use this formula,

 cipherLen = (clearLen/16 + 1) * 16;

Please note that if the clear-text is multiple of block size, a whole new block is needed for padding. Say you clear-text is 16 bytes. The cipher-text will take 32 bytes.

You might want to store IV (Initial Vector) with cipher-text. In that case, you need to add 16 more bytes for IV.

ZZ Coder
A: 

There are approaches to storing encrypted information which avoid the need for any padding provided the data size is at least equal to the block size. One slight difficulty is that if the data size is allowed to be smaller than the block size, and if it must be possible to reconstruct the precise size of the data, even for small blocks, the output must be at least one bit larger than the input, [i]regardless[/i] of the data size.

To understand the problem, realize that there are 256^N possible files that are N bytes long, and the number of possible files that are no longer than N bytes long is 256^N plus the number of possible files that are no longer than N-1 bytes long (there is one possible file that's zero bytes long, and 257 possible files that are no longer than one byte long).

If the block size is 16 bytes, there will be 256^16 + 256^14 + 256^13 etc. possible input files that are no more than 16 bytes long, but only 256^16 possible output files that are no more than 16 bytes long (since output files can't be shorter than 16 bytes). So at least some possible 16-byte input files must grow. Suppose they would become 17 bytes. There are 256^17 possible seventeen-byte output files; if any of those are used to handle inputs of 16 bytes or less, there won't be enough available to handle all possible 17-byte input files. No matter how big the input can get, some files of that size or larger must grow.

supercat