views: 84
answers: 3

Hi, I was browsing and found good articles about encryption. However, none of them described why the key length is important and what exactly the key is used for. My guess is that it could work this way:

             0101001101010101010
Key:         01010010101010010101   //the longer the key, the longer unique sequence
XOR or smth: //result

Is this at least roughly how it works, or am I missing something? Thanks

A: 

It's been a long time (over 25 years) since my undergraduate cryptography course, but I'll try to help.

While not a perfect example of how things work in modern cryptography, it may be easier to understand by first considering a Vigenère cipher, using strictly alphabetical plain text and key. In such a cipher, your key is a sequence of letters - often a word or phrase - and is used to rotate through a series of Caesar ciphers to encrypt the plain text. For example, a plain text of "dog" with a key of "cat" leads to the encrypted text "foz" (assuming I've done that correctly).
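The rotation described above can be sketched in a few lines of Python (a hypothetical helper, assuming lowercase letters only):

```python
def vigenere_encrypt(plain: str, key: str) -> str:
    # Shift each plaintext letter by the corresponding key letter,
    # cycling through the key (a rotating series of Caesar ciphers).
    out = []
    for i, ch in enumerate(plain):
        shift = ord(key[i % len(key)]) - ord('a')
        out.append(chr((ord(ch) - ord('a') + shift) % 26 + ord('a')))
    return ''.join(out)

print(vigenere_encrypt("dog", "cat"))  # -> foz
```

Running it confirms the "dog" + "cat" = "foz" example: d shifted by c gives f, o shifted by a stays o, g shifted by t wraps around to z.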

This simple example is - in theory - unbreakable, since the key and the message text are the same length. However, when the message is long compared to the key, statistical analysis will reveal patterns which can be used to figure out the key; for a two-way cipher such as this, the key directly cracks the encrypted message. The frequency of these patterns coincides with the length of the key and is easy to discover with computers (which is why things like XOR and one-way mathematical functions are now used). Thus, a longer key makes it more difficult to crack the cipher.

GreenMatt
+4  A: 

Cipher systems are generally a lot more sophisticated than the XOR system you have described. In fact, XORing only really works if you have a key with at least as many bits as the plain-text message, and you only ever use the key to encrypt one message. This is called "one-time pad" encryption.
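A one-time pad is simple enough to sketch directly. This is an illustrative snippet, not production crypto; note the two conditions it enforces in comments (key at least as long as the message, key never reused):

```python
import os

def xor_bytes(data: bytes, key: bytes) -> bytes:
    # One-time pad: the key must be at least as long as the message,
    # and must never be reused for another message.
    assert len(key) >= len(data)
    return bytes(d ^ k for d, k in zip(data, key))

message = b"attack at dawn"
key = os.urandom(len(message))   # fresh random key, same length as message
ciphertext = xor_bytes(message, key)

# XOR with the same key undoes the encryption:
assert xor_bytes(ciphertext, key) == message
```

Because XOR is its own inverse, the same function both encrypts and decrypts, which is exactly the symmetry the question's diagram was gesturing at.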

In most crypto systems, the key is not directly combined with the plain text. Rather, it controls the transformation performed by the crypto algorithm. Ideally, the transformation performed by the algorithm is too complicated to reverse without knowing or guessing the key. Thus, in the ideal situation, the number of bits in the key determines the average number of guesses needed to crack it by "brute force".
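A toy example makes the "number of bits = number of guesses" point concrete. Here a deliberately weak cipher with a single-byte (8-bit) key is broken by trying all 2**8 = 256 possibilities; checking against the known plaintext is just to keep the demo short, since a real attacker would instead test each guess for plausible-looking output:

```python
def xor_cipher(data: bytes, key: int) -> bytes:
    # Toy cipher: XOR every byte with one secret byte (an 8-bit "key").
    return bytes(b ^ key for b in data)

secret_key = 0x5A
ciphertext = xor_cipher(b"hello", secret_key)

# Brute force: an 8-bit key means at most 2**8 = 256 guesses.
recovered = next(k for k in range(256)
                 if xor_cipher(ciphertext, k) == b"hello")
assert recovered == secret_key
```

Every extra key bit doubles the size of that loop, which is why brute force becomes hopeless long before keys reach realistic sizes.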

In practice however, a cryptologist can glean clues from the cipher text (i.e. the encrypted message) that can reduce the search space, so that a larger number of bits need to be used. This is particularly true for the public key encryption systems that we normally use. These systems are based on mathematical calculations that are hard to reverse (e.g. multiplication of pairs of large prime numbers) and the number of bits determines the "problem" size of performing the reverse calculation (e.g. factorization of the product of two large primes).

You are interested in how the key is "applied" to the plain text. The answer to that is that it varies from one crypto system to another, and that the process is typically rather complicated. The question is not amenable to a general answer.

Stephen C
A: 

Let's consider symmetric encryption. This is how you convert some data into something unreadable, except for those who have some secret knowledge which enables them to reverse the operation. This is "symmetric" because the knowledge needed to encrypt is the same as the knowledge needed to decrypt.

Historically, the "secret knowledge" consisted of the description of the steps to be performed to scramble the data. At some point, it became useful to split the "encryption method" into two parts: the algorithm (the generic description of the method's steps) and the key (the precise parameters used on that occasion). On a computer, the algorithm becomes the program which the machine runs; the key is a numeric parameter, a sequence of zeros and ones. Computers exacerbate the need for that split: a computer program is very hard to keep confidential. A program is bulky and is written to a hard disk; a key, on the other hand, is short enough to fit in a human brain, to be typed when used, to be stored in a device as puny as a cheap smartcard, and so on.

In a way, the key concentrates the secret. Everything which is secret in the encryption lies in that key.

In your XOR example, the method is: "we XOR the data with the key", and the key is the actual sequence of zeros and ones with which to XOR. Encryption-by-XOR is known to be extremely difficult to use properly (basically, the key has to be as long as the data to encrypt, which is very often highly impractical; but using a shorter key means reusing parts of the key, which opens many deadly weaknesses). Usual symmetric encryption systems are much more complex, and strive to provide "your money's worth" from the key.
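One of those "deadly weaknesses" can be shown in a few lines. If the same short XOR key encrypts two messages, XORing the two ciphertexts together cancels the key entirely and leaks the XOR of the two plaintexts (a sketch, with an absurdly short key for illustration):

```python
def xor_with_key(data: bytes, key: bytes) -> bytes:
    # Repeats the key as needed: exactly the misuse described above.
    return bytes(d ^ key[i % len(key)] for i, d in enumerate(data))

key = b"k"  # one-byte key, reused for both messages
c1 = xor_with_key(b"hello", key)
c2 = xor_with_key(b"world", key)

# The key cancels out: c1 XOR c2 == plaintext1 XOR plaintext2.
leaked = bytes(a ^ b for a, b in zip(c1, c2))
assert leaked == bytes(a ^ b for a, b in zip(b"hello", b"world"))
```

From that leaked combination, known-plaintext and frequency techniques recover both messages without ever learning the key first.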

How much is a key worth? Well, since the key is the only part of the whole thing which is secret, all the attacker has to do is guess the key. The most basic form of guessing is exhaustive search: the attacker simply tries all possible keys. The number of possible keys grows quite fast with the key length. For instance, a 10-bit key has 1024 possible values; a 20-bit key has 1048576. Each additional bit doubles the number of possible keys.
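Those counts are just powers of two, which a few lines of Python can confirm:

```python
# Keyspace size is 2 to the power of the key length in bits.
for bits in (10, 20, 64, 128):
    print(f"{bits}-bit key: {2 ** bits} possible keys")

# Each additional bit doubles the keyspace:
assert 2 ** 11 == 2 * 2 ** 10
assert 2 ** 10 == 1024
assert 2 ** 20 == 1048576
```

The 64-bit count printed here (18446744073709551616) is the "huge number of combinations" mentioned below for the largest key ever broken by exhaustive search.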

An algorithm is deemed "robust" if the best known attack is the exhaustive search. In that sense, a robust algorithm is one which offers no better attack path than the brutal exhaustive search, which is always "doable" but can be prohibitively expensive. If the algorithm is robust and the key used is large enough to thwart exhaustive search, then the encryption is secure. "Large enough", in practical terms, means "a hundred bits, or more". The largest key which has been attacked by exhaustive search is a 64-bit key. The number of combinations was huge (18446744073709551616) but could be achieved with existing technology (it took several years and thousands of computers). The continuous progress in computer speed makes exhaustive search better and better, but not as fast as might be imagined. Basically, exhaustive search gains 1 bit every 12 to 18 months (this is known as "Moore's law"). There are many good reasons why Moore's law cannot be upheld beyond about the year 2020 (that is what Gordon Moore himself predicted, actually), at which point computing power for a given price will somehow stagnate. A 100-bit key should then remain safe (that is way too many combinations for what computers will be able to do in 10 years). Usual encryption algorithms use a 128-bit key because "128" looks good when written in binary.

There is also asymmetric encryption, and digital signatures. The basic principle remains the same: the key is the part which is secret, as opposed to the method (the program), which is known by everybody. Asymmetric encryption and digital signatures try to do complicated things in which, for instance, someone who knows how to encrypt does not have enough information to know how to decrypt. This requires some elaborate mathematical structures. Keys are then mathematical objects, which need more bits when they are encoded. There again, the number of possible keys grows with the key size, but not as fast, because very few sequences of bits will have the required mathematical structure. Also, that very same structure allows for faster attack paths, implying again larger keys to achieve security. This is why we hear about things like "a 1024-bit RSA key". An RSA key consists of a few integers, the most important being one which should be impractical to factor into a product of prime integers. A 1024-bit integer (an integer which, when written in binary, consists of 1024 zeros and ones, a bit more than 300 digits in decimal) seems big enough to thwart factorization (the current record is for a 768-bit integer).
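The factoring "problem size" can be illustrated with deliberately tiny numbers (these primes are illustrative placeholders, nowhere near real RSA sizes):

```python
def factor(n: int):
    # Trial division: easy for tiny n, but the work grows explosively
    # with the size of n; this is hopeless for a 1024-bit modulus.
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return d, n // d
    return None  # n is prime

p, q = 1009, 1013            # toy primes; real RSA uses primes of ~512 bits each
n = p * q                    # the public modulus
assert factor(n) == (p, q)   # trivial here, impractical at 1024 bits
```

Multiplying p and q is cheap in either direction of scale; it is only the reverse operation, recovering p and q from n, whose cost blows up as n grows, and that asymmetry is what the key size of an RSA key is measuring.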

Thomas Pornin