views:

226

answers:

2

Suppose I have a series of index numbers that consists of a check digit. If I have a fair enough sample (Say 250 sample index numbers), do I have a way to extract the algorithm that has been used to generate the check digit?

I think there should be a programmatic approach atleast to find a set of possible algorithms.

UPDATE: The length of a index number is 8 Digits including the check digit.

+4  A: 

No, not in the general case, since the number of possible algorithms is far more than what you may think. A sample space of 250 may not be enough to do proper numerical analysis.

For an extreme example, let's say your samples are all 15 digits long. You would not be able to reliably detect the algorithm if it changed the behaviour for those greater than 15 characters.

If you wanted to be sure, you should reverse engineer the code that checks the numbers for validity (if available).

If you know that the algorithm is drawn from a smaller subset than "every possible algorithm", then it might be possible. But algorithms may be only half the story - there's also the case where multipliers, exponentiation and wrap-around points change even using the same algorithm.

paxdiablo
A: 

paxdiablo is correct, and you can't guess the algorithm without making any other assumption (or just having the whole sample space - then you can define the algorithm by a look up table).

However, if the check digit is calculated using some linear formula dependent on the "data digits" (which is a very common case, as you can see in the wikipedia article), given enough samples you can use Euler elimination.

r0u1i