Hello, I'm writing an implementation of the XXTEA encryption algorithm that works on "streams", i.e., it can be used like: crypt mykey < myfile > output.

One of the requirements is that it doesn't have access to the whole file (it only reads fixed-size blocks until it finds EOF). The algorithm needs the number of data bytes to be a multiple of 4, so padding has to be added.

For plain text a good solution is to pad with NULLs and simply ignore the NULLs on decryption, but the same strategy cannot be used for binary streams (which can contain embedded NULLs).

I've read about the common solutions, like padding with the number of missing bytes (if 3 bytes are missing, append 3, 3, 3 at the end), but I wonder: is there a more elegant solution?

+3  A: 

Read: http://msdn.microsoft.com/en-us/library/system.security.cryptography.paddingmode.aspx

It has a list of common padding methods, like:

PKCS7 - The PKCS #7 padding string consists of a sequence of bytes, each of which is equal to the total number of padding bytes added.

ANSIX923 - The ANSIX923 padding string consists of a sequence of bytes filled with zeros before the length.

ISO10126 - The ISO10126 padding string consists of random data before the length.

Examples:

Raw data: 01 01 01 01 01

PKCS #7: 01 01 01 01 01 03 03 03

ANSIX923: 01 01 01 01 01 00 00 03

ISO10126: 01 01 01 01 01 CD A9 03
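The three schemes above can be sketched in a few lines of Python. This is only an illustration: the names `pad` and `unpad` are made up for this example, and the default block size of 8 is an assumption taken from the examples above (5 bytes of raw data padded to 8). For the 4-byte case in the question you would pass `block_size=4`.

```python
import os

def pad(data: bytes, block_size: int = 8, mode: str = "pkcs7") -> bytes:
    """Pad data to a multiple of block_size (hypothetical helper)."""
    # n is always in 1..block_size: already-aligned input still gets a
    # full block of padding, so the last byte is always a valid length.
    n = block_size - len(data) % block_size
    if mode == "pkcs7":
        return data + bytes([n]) * n               # n copies of n
    if mode == "ansix923":
        return data + bytes(n - 1) + bytes([n])    # zeros, then n
    if mode == "iso10126":
        return data + os.urandom(n - 1) + bytes([n])  # random, then n
    raise ValueError("unknown padding mode: " + mode)

def unpad(padded: bytes, block_size: int = 8) -> bytes:
    """Strip padding; all three modes store the length in the last byte."""
    if not padded or len(padded) % block_size:
        raise ValueError("input is not a whole number of blocks")
    n = padded[-1]
    if not 1 <= n <= block_size:
        raise ValueError("invalid padding length")
    return padded[:-n]

raw = bytes.fromhex("0101010101")
padded = pad(raw)              # PKCS #7 with 8-byte blocks
restored = unpad(padded)       # back to the 5 raw bytes
```

Because the last byte always carries the padding length, embedded NULLs in a binary stream are not a problem. The cost is that input which is already aligned grows by one full block.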

AlbertEin
+3  A: 

Read up on ciphertext stealing. It's arguably much more elegant than plaintext padding. Also, I'd suggest using a block size larger than 4 bytes -- 64 bits is probably the bare minimum.

Strictly speaking, do-it-yourself cryptography is a dangerous idea; it's hard to beat algorithms that the entire crypto community has tried and failed to break. Have fun, and consider reading this, or at least something from Schneier's "related reading" section.
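To make the ciphertext stealing idea concrete, here is a minimal sketch of the ECB variant. Everything in it is an assumption for illustration: the function names are invented, the block size is 8 bytes, and `toy_encrypt_block` is a deliberately insecure stand-in (XOR plus byte rotation) that only exists so the example runs. A real block cipher would take its place.

```python
BLOCK = 8  # assumed block size in bytes; the key must be BLOCK bytes too

def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    # NOT a real cipher: XOR with the key, then rotate left by 3 bytes.
    mixed = bytes(b ^ k for b, k in zip(block, key))
    return mixed[3:] + mixed[:3]

def toy_decrypt_block(block: bytes, key: bytes) -> bytes:
    mixed = block[-3:] + block[:-3]  # undo the rotation
    return bytes(b ^ k for b, k in zip(mixed, key))

def _split(data: bytes):
    # Returns (leading full blocks, second-to-last block, final piece).
    tail_len = len(data) % BLOCK or BLOCK
    return (data[:-tail_len - BLOCK],
            data[-tail_len - BLOCK:-tail_len],
            data[-tail_len:])

def cts_encrypt(plain: bytes, key: bytes) -> bytes:
    if len(plain) <= BLOCK:
        raise ValueError("ciphertext stealing needs more than one block")
    head, second_last, last = _split(plain)
    out = b"".join(toy_encrypt_block(head[i:i + BLOCK], key)
                   for i in range(0, len(head), BLOCK))
    e = toy_encrypt_block(second_last, key)
    # "Steal" the end of e to fill up the short final plaintext block.
    c_second_last = toy_encrypt_block(last + e[len(last):], key)
    return out + c_second_last + e[:len(last)]

def cts_decrypt(cipher: bytes, key: bytes) -> bytes:
    if len(cipher) <= BLOCK:
        raise ValueError("ciphertext stealing needs more than one block")
    head, c_second_last, c_last = _split(cipher)
    out = b"".join(toy_decrypt_block(head[i:i + BLOCK], key)
                   for i in range(0, len(head), BLOCK))
    d = toy_decrypt_block(c_second_last, key)
    last = d[:len(c_last)]
    second_last = toy_decrypt_block(c_last + d[len(c_last):], key)
    return out + second_last + last

key = bytes(range(1, BLOCK + 1))
message = b"binary\x00data\x00with\x00nulls"
assert cts_decrypt(cts_encrypt(message, key), key) == message
```

The ciphertext is exactly as long as the plaintext, so no padding bytes are ever written. The catch for a streaming tool is that the last two blocks must be held back until EOF, and input of one block or less needs separate handling.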

ojrac
He is using a well-known, peer-reviewed, published algorithm. Using off-the-shelf closed-source libraries would be a bad idea; the algorithms might be great, but you don't know what backdoors they have.
Martin Beckett
He also hasn't said what chaining mode he's using. It's easy to use a recognised, secure cipher and still construct an insecure cryptosystem.
Nick Johnson
A: 

Actually I would expect that a good stream cipher needs no padding at all. RC4, for example, needs no padding and is a very strong stream cipher. However, it can be attacked if the attacker can feed different chosen data to an encryption routine that always uses the same key, and also has access to the encrypted data. Choosing the right input data and analyzing the output data can be used to recover the encryption key without a brute-force attack; but other than that RC4 is very secure.

If it needs padding, it is not a stream cipher IMHO. Whether you pad to a multiple of 4 bytes or a multiple of 16 bytes, what's the huge difference? And if it is padded to a multiple of 16 bytes, you could use pretty much any block cipher. Actually your cipher is a block cipher; it just works with 4-byte blocks. It would be a stream cipher on a system where every "symbol" is 4 bytes (e.g. when encrypting UTF-32 text, in which case the data is always a multiple of 4 bytes, so there is never any padding).

Mecki
TEA is a block cipher; he just wants to use it with a stream program. Padding is the standard solution, with either a magic value that you can remove, or with a header giving the data length.
Martin Beckett
+2  A: 

Reading the question, it looks like the security aspect of this is moot. Simply put, you have an API that expects a multiple of 4 bytes as input, which you don't always have.

Appending up to 3 bytes onto any binary stream is dangerous if you can't guarantee that the binary stream doesn't care. Appending 0's onto the end of an exe file doesn't matter, as exe files have headers specifying the relevant sizes of all remaining bits. Appending 0's onto the end of a pcx file would break it, as pcx files have a header that starts a specific number of bytes from the end of the file.

So really you have no choice: there is no choice of magic padding bytes that is guaranteed never to occur naturally at the end of a binary stream. You must always append at least one additional dword of information describing the padding bytes used.
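One possible way to do that while still reading the input in fixed-size pieces is sketched below. The names `pad_stream` and `strip_padding` and the little-endian trailer layout are assumptions made for this example, not part of XXTEA or of any library. The cipher call itself is left out; each piece that is yielded is 4-byte aligned and could be handed to it.

```python
import struct
from typing import Iterable, Iterator

WORD = 4  # the API in the question wants a multiple of 4 bytes

def pad_stream(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield word-aligned pieces, then one trailer dword with the pad count."""
    carry = b""
    for chunk in chunks:
        buf = carry + chunk
        cut = len(buf) - len(buf) % WORD
        if cut:
            yield buf[:cut]        # aligned part can be processed right away
        carry = buf[cut:]          # 0..3 leftover bytes wait for more input
    pad_len = (WORD - len(carry)) % WORD
    if carry:
        yield carry + bytes(pad_len)   # zero-fill the final partial word
    # Trailer (assumed format): pad count as a little-endian 32-bit value.
    yield struct.pack("<I", pad_len)

def strip_padding(data: bytes) -> bytes:
    """Inverse of pad_stream, applied to the fully decrypted data."""
    if len(data) < WORD or len(data) % WORD:
        raise ValueError("data is not word-aligned or lacks a trailer")
    (pad_len,) = struct.unpack("<I", data[-WORD:])
    if pad_len >= WORD or pad_len > len(data) - WORD:
        raise ValueError("corrupt trailer")
    return data[:len(data) - WORD - pad_len]

original = b"\x00\x01\x02\x00\x00"            # ends in NULLs on purpose
padded = b"".join(pad_stream([original[:2], original[2:]]))
assert strip_padding(padded) == original
```

The trailer costs 4 bytes on every stream, but the original length is recovered exactly regardless of what the data ends with. A streaming decryptor would have to hold back the last 8 bytes until it sees EOF before writing them out.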

Chris Becke