I am writing a program and would like to implement a data verification system. It needs to return a unique string for any value entered. My question boils down to: is it possible for an AES function to return the same value for two different entries? The source values will be coming from data held on a magnetic stripe card.

more details

I posted this through my phone originally, and I am now just getting back to this post.

I've been looking around the web and while reading Wikipedia's article on SHA, I see that SHA-2 (SHA-256/224, SHA-512/384) have no detected collisions (assuming the article is accurate/up-to-date). This is desirable. Any recommendations on what version of SHA-2 I should use?

A: 

AES will not necessarily return the same value for the same input twice, given only the same key.

You should instead use a strong hashing algorithm, like SHA.

However, to answer your question, it is not possible for AES to return the same value for two different inputs, given the same Key and IV.

John Gietzen
"AES will not necessarily return the same value for the same input twice." - this is false. The result is always the same unless you change the key or initialisation vector. But that does not count, because by the same logic hashes would not always return the same value,
Daniel Brückner
because you might change the salt.
Daniel Brückner
+2  A: 

AES can never return the same value for two different inputs as long as you use the same key and initialization vector for all calculations. You would just be encrypting the data. Usually you would use a hash algorithm instead, because all hashes from a given algorithm have the same length regardless of the input, while AES produces output proportional in length to the input.

The reason it is not possible is quite obvious: if AES encrypted different inputs to the same output, you could not decrypt the message again, because there would be multiple possible decrypted messages.

Daniel Brückner
+2  A: 

AES is an encryption scheme, not a hashing scheme, so in its straightforward application, it will return a lump of data about as long as your message, but encrypted. For any unique message, the ciphertext will also be unique (under the same key and IV).

It sounds like what you want is a hash, or 'digest' of your data - look at something like SHA256. This will give you a fixed-length result regardless of the length of your data. This means that there ARE inevitably multiple different inputs which will give the same output, but they're incredibly thinly spread across an incredibly large space.
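
As a minimal sketch of the fixed-length property, assuming Python's standard `hashlib` module (the function name `card_digest` and the input values are made-up placeholders, not real card data):

```python
import hashlib

def card_digest(track_data: bytes) -> str:
    """Return the SHA-256 digest of the raw input as a hex string."""
    return hashlib.sha256(track_data).hexdigest()

# Hypothetical inputs of very different lengths.
short_input = b"A"
long_input = b"B" * 10_000

short_digest = card_digest(short_input)
long_digest = card_digest(long_input)

# SHA-256 always yields 32 bytes, i.e. 64 hex characters,
# no matter how long the input is.
print(len(short_digest), len(long_digest))

# Different inputs are expected (not guaranteed) to give different digests.
print(short_digest != long_digest)
```

Both digests come out at 64 hex characters, which is why a hash is easier to store and compare than an AES ciphertext whose size grows with the input.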

The type of hash you should use depends on whether you're trying to protect against malicious attempts to subvert your scheme, or just against random errors.

Will Dean
Random errors, but in the case of this application, these random errors could be disastrous.
Anders
What kind of disastrous? World war? Lose all your money? Children dead? Are you sure you're qualified here...
Will Dean
I am not the one doing the whole application, I am just doing some research atm. Basically, if two different people swipe their cards and the AES/SHA function returns an identical value, person A could get person B's information among other things.
Anders
Okay, what we want here is a way to identify pieces of data, something like a password. All we care about is equality and inequality, not error correction. Try this for research: "cryptographic hash".
David Thornley
Awesome, thanks!
Anders
Can I really strongly recommend Ross Anderson's book here - it's a great read anyway, but also will instil in you the appropriate amount of fear! There's a load of it free from here http://www.cl.cam.ac.uk/~rja14/book.html . People endlessly repeat basic implementation errors in this area...
Will Dean
Thanks again. I will read these excerpts and possibly pick this book up
Anders
A: 

Bear in mind that there are two sorts of schemes here.

One is for cryptography, and the idea is that the ciphertext or cryptographic hash gives away as little as possible about the original. This is what you want for security purposes.

One is for error detection and correction, and in that case you want as much information about the original as you can get. This is what you want for data integrity purposes.

If you take a cryptographic hash of a file, all you can tell later is whether it's the exact same file or not. If you use some sort of error detection scheme, you can probably tell something about the error.
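
To illustrate the "same or not the same" point, here is a rough sketch using Python's standard `hashlib` and `hmac` modules. The helper names `same_content` and `differing_bits` and the sample records are hypothetical, chosen only for this example:

```python
import hashlib
import hmac

def same_content(a: bytes, b: bytes) -> bool:
    """Compare two inputs by their SHA-256 digests only."""
    digest_a = hashlib.sha256(a).digest()
    digest_b = hashlib.sha256(b).digest()
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(digest_a, digest_b)

def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 256 digest bits differ between two inputs."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")

original = b"hypothetical record 0001"
corrupted = b"hypothetical record 0002"  # a single character changed

print(same_content(original, original))
print(same_content(original, corrupted))

# A tiny change in the input scrambles the digest, so the digest
# says nothing about where or how large the change was.
print(differing_bits(original, corrupted))
```

The digest answers only the equality question; locating or repairing an error would need a separate error detection or correction code.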

David Thornley
+1  A: 

Based on the new question, I would say:

You should use whatever version of SHA you would like, given that you have the storage space to hold it.

I almost always use SHA-512 for everything, because it has the longest digest in the SHA-2 family and so the lowest chance of collisions, and 64 bytes is usually small enough to store.
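
For a sense of the storage versus collision trade-off, the following sketch prints the digest size of each SHA-2 variant via Python's standard `hashlib` and estimates the collision chance with the usual birthday-bound approximation. The function `collision_probability` and the population of one billion cards are assumptions made for illustration, and the estimate treats the hash as an ideal random function:

```python
import hashlib

# Digest sizes in bytes for the SHA-2 variants mentioned in the question.
for name in ("sha224", "sha256", "sha384", "sha512"):
    print(name, hashlib.new(name).digest_size)

def collision_probability(num_items: int, digest_bits: int) -> float:
    """Birthday-bound approximation: p ~= n * (n - 1) / 2^(bits + 1).

    Only meaningful while the result is much smaller than 1.
    """
    return num_items * (num_items - 1) / 2 ** (digest_bits + 1)

# Hypothetical population: one billion distinct cards.
cards = 10**9
print(collision_probability(cards, 256))
print(collision_probability(cards, 512))
```

Under these assumptions even SHA-256 gives a vanishingly small accidental collision chance, so the choice between variants is mostly about storage and convention.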

John Gietzen