The problem seems simple at first: just assign an id and represent that in binary.

The issue arises because the user is able to change any number of 0 bits to 1 bits. To clarify, the hash could go from 0011 to 0111 or 1111 but never to 1010. Each bit has an equal chance of being changed, and each change is independent of the others.
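To make the tampering rule concrete, here is a minimal sketch in Python. The function name `could_be_tampered_from` is made up for illustration and IDs are assumed to be plain integers; it only checks whether an observed value is reachable from an original by turning 0 bits into 1 bits.

```python
def could_be_tampered_from(original: int, observed: int) -> bool:
    """Return True if `observed` can be reached from `original`
    by changing only 0 bits to 1 bits (hypothetical helper)."""
    # Every bit that was set in the original must still be set.
    return (original & observed) == original


print(could_be_tampered_from(0b0011, 0b0111))  # True
print(could_be_tampered_from(0b0011, 0b1111))  # True
print(could_be_tampered_from(0b0011, 0b1010))  # False: bit 0 went from 1 to 0
```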

What would you have to store in order to go from hash -> user assuming a low percentage of bit tampering by the user? I also assume failure in some cases so the correct solution should have an acceptable error rate.

I would estimate that the maximum number of bits tampered with would be about 30% of the total set.

I guess the acceptable error rate would depend on the number of hashes needed and the number of bits being set per hash.

I'm worried that with enough manipulation the id cannot be reconstructed from the hash. I guess the question I am asking is: what safeguards or unique positioning systems can I use to ensure the id can still be recovered?

A: 

So you're trying to assign a "unique id" that will still remain a unique id even if it's changed to something else?

If the only "tampering" is changing 0's to 1's but not vice versa (which seems fairly contrived), then you could get an effective 'ID' by assigning each user a particular bit position, setting that bit to zero in that user's id and to one in every other user's id.

Thus any fiddling by the user will result in corrupting their own id, but not allow impersonation of anyone else.
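A rough sketch of that scheme in Python, assuming a fixed, known number of users and integer IDs. The names `make_id` and `identify` are invented for this example.

```python
from typing import Optional


def make_id(user_index: int, num_users: int) -> int:
    """Build an ID with every bit set except the one at `user_index`."""
    all_ones = (1 << num_users) - 1
    return all_ones & ~(1 << user_index)


def identify(observed: int, num_users: int) -> Optional[int]:
    """Return the user index if exactly one bit is still zero, else None."""
    zero_positions = [i for i in range(num_users) if not (observed >> i) & 1]
    return zero_positions[0] if len(zero_positions) == 1 else None


user_id = make_id(2, 4)
print(bin(user_id))               # 0b1011
print(identify(user_id, 4))       # 2
# The only possible tampering sets the single zero bit, which destroys the ID.
print(identify(user_id | 0b0100, 4))  # None
```

Note that this needs one bit per user, so the ID length grows linearly with the number of users.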

Anon.
I like where this is going but I would not like to have to set a bit for each id being used.
Mark
+2  A: 

Your question isn't entirely clear to me.

Are you saying that you want to validate a user based on a hash of the user ID, but are concerned that the user might change some of the bits in the hash?

If that is the question, then as long as you are using a proven hash algorithm (such as MD5), there is very low risk of a user manipulating the bits of their hash to get another user's ID.

If that's not what you are after, could you clarify your question?

EDIT

After reading your clarification, it looks like you might be after Forward Error Correction, a family of algorithms that allow you to reconstruct altered data.

In its simplest form (a repetition code), you encode each bit as a series of 3 bits and apply the "majority wins" principle when decoding. When encoding, you represent "1" as "111" and "0" as "000". When decoding, if most of the 3 encoded bits are zero, you decode that to mean zero. If most of the 3 encoded bits are 1, you decode that to mean 1.
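As a minimal sketch of the "majority wins" idea in Python, assuming the ID is handled as a string of "0" and "1" characters (the function names `encode` and `decode_majority` are just illustrative):

```python
def encode(bits: str, repeat: int = 3) -> str:
    """Repeat every bit `repeat` times, e.g. "10" -> "111000"."""
    return "".join(b * repeat for b in bits)


def decode_majority(encoded: str, repeat: int = 3) -> str:
    """Decode each group of `repeat` bits by majority vote."""
    groups = [encoded[i:i + repeat] for i in range(0, len(encoded), repeat)]
    return "".join("1" if g.count("1") * 2 > len(g) else "0" for g in groups)


codeword = encode("101")
print(codeword)                      # 111000111
# One bit of the middle group was changed from 0 to 1.
print(decode_majority("111010111"))  # 101
```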

Eric J.
I am not worried about the user manipulating their id to match someone else's. I'm worried that with enough manipulation the id cannot be reconstructed from the hash. I guess the question I am asking is: what safeguards or unique positioning systems can I use to ensure the id can still be recovered?
Mark
Of course "majority wins" isn't exactly the right strategy in this case, since "101" must indicate a tampered-with 0. Also, you may want to use more than 3 bits in a row, since the probability of three bits in a row being flipped isn't negligible.
Jason Orendorff
The Forward Error Correction algorithm assumes that bits can randomly flip in either direction (e.g. due to line noise on a modem or a hardware defect on a drive). If the bits can truly only be changed from 0 to 1 and not from 1 to 0, Jason is correct that you would assume anything other than 111 was originally a 0. I still don't understand the use case where bits can only be flipped in one direction, but that's a different matter.
Eric J.
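The one-directional decoding rule from the comments above could look roughly like this in Python. This is only a sketch under the assumptions stated in the question (bits change from 0 to 1 only, each independently with the same probability); the names `decode_one_way` and `misread_probability` are made up.

```python
def encode(bits: str, repeat: int = 3) -> str:
    """Repeat every bit `repeat` times, e.g. "10" -> "111000"."""
    return "".join(b * repeat for b in bits)


def decode_one_way(encoded: str, repeat: int = 3) -> str:
    """Decode a group as 1 only if all of its bits are 1.

    Since bits can only go from 0 to 1, any group containing a 0
    must have started out as all zeros.
    """
    groups = [encoded[i:i + repeat] for i in range(0, len(encoded), repeat)]
    return "".join("1" if g == "1" * repeat else "0" for g in groups)


def misread_probability(flip_probability: float, repeat: int) -> float:
    """Chance that an encoded 0 has all `repeat` bits flipped to 1."""
    return flip_probability ** repeat


# Two bits of the middle group were tampered with; majority voting would
# read this group as 1, but the one-way rule still recovers the 0.
print(decode_one_way("111101111"))  # 101
```

With the 30% tampering rate mentioned in the question, `misread_probability(0.3, 3)` is roughly 0.027 per encoded 0, and it shrinks as `repeat` grows, at the cost of a longer ID.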
A: 

Assign each user an ID with the same number of bits set.

This way you can detect immediately if any tampering has occurred, because tampering can only increase the number of set bits. If you additionally make the Hamming distance between any two IDs at least 2n, then you'll be able to reconstruct the original ID in cases where fewer than n bits have been set.
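Here is one way this could be sketched in Python. The greedy ID generator is just an assumption made for the example (any set of equal-weight IDs with pairwise distance of at least 2n would do), and the names `build_ids`, `is_tampered` and `recover` are hypothetical.

```python
from itertools import combinations
from typing import List, Optional


def build_ids(num_bits: int, weight: int, n: int) -> List[int]:
    """Greedily collect IDs with `weight` bits set and pairwise
    Hamming distance of at least 2 * n."""
    ids: List[int] = []
    for positions in combinations(range(num_bits), weight):
        candidate = sum(1 << p for p in positions)
        if all(bin(candidate ^ other).count("1") >= 2 * n for other in ids):
            ids.append(candidate)
    return ids


def is_tampered(observed: int, weight: int) -> bool:
    """Any extra set bit means the ID was tampered with."""
    return bin(observed).count("1") != weight


def recover(observed: int, ids: List[int]) -> Optional[int]:
    """Return the only ID whose set bits are all present in `observed`."""
    matches = [i for i in ids if (i & observed) == i]
    return matches[0] if len(matches) == 1 else None


ids = build_ids(num_bits=8, weight=4, n=2)
original = ids[0]
tampered = original | (1 << 7)      # one extra bit set, fewer than n = 2
print(is_tampered(tampered, 4))     # True
print(recover(tampered, ids) == original)  # True
```

Recovery works because two IDs of equal weight at distance 2n each have at least n set bits that the other lacks, so a user would need to set at least n bits before their value also covers someone else's ID.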

Jason Orendorff