ansaurus

Question

Answer 1

+2 A:

Hi again, I answered the other question too :P

The best way is to generate one alphanumeric character at a time, randomly, until you have 8 of them. This will then be your voucher.

Ideally the best way would be to choose a sequence long enough so that you can safely assume if there will be any duplicates. Do note that, perhaps counter-intuitively, this happens more often than you think because of the Birthday problem.

For example, with 8 characters you have 1785793904896 possible combinations, but if you generate only 1,573,415 vouchers you will have a 50% chance to have a duplicate.

So, it all depends on how many you want to generate, and the maximum length of the code you're comfortable with. If you are generating many and you want to keep it short, you should save the ones you previously generated, and check against the database for duplicates.

Andreas Bonini 2010-01-18 15:24:49

+1 n^k is quite a lot (here: 26^10)

Thomas Jung 2010-01-18 15:30:50

Hello again Andreas! I agree that for a sufficiently-sized keyspace the odds start to look pretty good. I guess I just have my computer scientist hat on right now and want a general purpose approach that is guaranteed to work - not just something that is likely to work given the current inputs.

Barry Fandango 2010-01-18 15:33:46

That's not completely true about GUIDs at least for v1 algorithm, unless you generate more than (I think) 4 per ms and assuming the machine that's generating it has a device with a valid MAC address you're guaranteed uniqueness due to the timestamp component of the GUID as well as the MAC address component (and then a few random things thrown in). Other versions may not have a strict guarantee and may indeed be close enough to guaranteed by the sheer size of the GUID pool.

Davy8 2010-01-18 15:36:47

If there are 1785793904896 possible combinations, then after generating about 1,573,415 at random, without testing for duplicates, the probability of collision reaches 0.5. See http://en.wikipedia.org/wiki/Birthday_problem .

Jason Orendorff 2010-01-18 15:36:50

-1: The birthday paradox means that the probability of a collision increases dramatically with the number of used codes and starts approaching 50% somewhere around 1.3 million codes. GUIDs get around this by being just ridiculously long (too long to be practical for this purpose)

Michael Borgwardt 2010-01-18 15:37:58

@Davy8: Well, still, take a look here: http://en.wikipedia.org/wiki/Universally_Unique_Identifier#Random_UUID_probability_of_duplicates

Andreas Bonini 2010-01-18 15:39:56

Jason/Micheal: you're right! I'll edit my question to reflect this

Andreas Bonini 2010-01-18 15:40:54

-1 rescinded :)

Michael Borgwardt 2010-01-18 16:04:15

Answer 2

+5 A:

Some random number generators have an interesting property: Used right they do not generate duplicate numbers in a long time. They produce something called a full cycle. Use one of the algorithms described there, seed it, and you will have many unique numbers,

Add a smart way to map digits to characters and you got your codes.

ebo 2010-01-18 15:27:00

Be aware that this - in some specific circumstances - can pose a serious security risk. If an attacker manages to get his hands on many vouchers that he knows have been generated subsequentially he can obtain the codes for all the vouchers by "reverse engineering" (probably not the technical term for this) the pseudo-RNG.

Andreas Bonini 2010-01-18 15:33:28

@Andreas - And if you use a salt value: voucher(previousVoucher + salt)?

Thomas Jung 2010-01-18 15:40:18

t. jung - could you salt the value without losing the "full cycle" property of the algorithm?

Barry Fandango 2010-01-18 15:42:14

@Barry - It would not mess with the RNG if: voucher(previousVoucher, salt) = rng(previousVoucher) + salt. The question is if this makes the approach safer. If you use only the RNG and it and one voucher is known it is broken.

Thomas Jung 2010-01-18 15:59:12

@Thomas is that any better than just adding some random junk to the end of the code and then stripping those digits off before checking it?

Martin Beckett 2010-01-18 16:15:17

Martin: Yes. Checking the random junk is the important part because that's the part that an attacker, however clever, can't predict.

Jason Orendorff 2010-01-18 16:21:37

@Martin - I would say yes. Adding and removing junk can be seen as part of the RNG algorithm. The approach has to work even when the RNG and vouchers are known. So you have to add a secret that is not known and will not be communicated. I would like to hear from someone if the salt approach is dumb idea.

Thomas Jung 2010-01-18 16:23:06

Just use a cryptographically secure PRNG (http://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator); don't listen to this salt nonsense. An example of a good generator would be SHA1-HMAC(LongSecretPassphrase, counter). Even our best super-computers have yet to find two SHA1 values that hash to the same thing, so you are unlikely to do it by accident generating a few hundred-thousand gift card codes.

BlueRaja - Danny Pflughoeft 2010-01-18 17:34:05

See http://stackoverflow.com/questions/1650185/php-short-unique-id-generation-using-autoincrement/1650945#1650945 for in implementation of the RNG idea I did a while back.

Wim 2010-01-18 22:35:31

@BlueRaja that avoids the collision by having a hash space much larger than the gift-card number space. Which is what we are trying to avoid.Yes it's difficult to calculate what to add to a known file to generate a SHA1 collision, but if you simply start hashing consecutive numbers - you are back to the birthday paradox

Martin Beckett 2010-01-19 14:40:11

@Martin: As stated by Michael above, you **must** have a keyspace much larger than your gift-card space. As for the birthday paradox, that is a problem with or without a "salt": The birthday paradox states that you need to hash 2^80 SHA1 values before you have a 50% probability of having a collision. That's 1 hash every nanosecond for 33 million years. That's exactly the reason no one has even *found* two values that hash to the same thing.

BlueRaja - Danny Pflughoeft 2010-01-19 15:23:44

But Sha1 means you need a 160bit result to code for a 3-4 digit gift card number - the OP wanted a way of having a series of much smaller (easier to type) codes without a collision. Thats why the RNG is a good solution.

Martin Beckett 2010-01-19 15:40:46

Answer 3

+5 A:

The likelihood of two randomly generated code colliding is basically the same as a user guessing a valid code - and you cannot prevent users from guessing. So you must have a key space so much larger than the number of actually used codes that random collisions are extremely unlikely as well (though, thanks to the birthday paradox, probably not unlikely enough to ignore them completely, at least if you want your codes to be reasonably short), and checking against existing codes and re-generating in case of a collision is a perfectly viable strategy.

Michael Borgwardt 2010-01-18 15:33:06

+1 for storing and regenerating on collision. To get to a 50% chance of regenerating an individual key with just 10 uppercase letters, you'd have to have (26^10)/2 keys stored which is around 70.5 trillion, so the likelihood of having to regenerate so many times that it becomes infeasible is negligible. 90% of the time (made up number) you're not going to have to regenerate.

Davy8 2010-01-18 15:53:13

If you had to regenerate even 10% of the time, it would mean that a random guess has a 10% chance of hitting a valid code - you want that likelihood to be much lower. I'd say that making the code space 1000 times larger than the number of codes you could theoretically expect to need should be OK.

Michael Borgwardt 2010-01-18 16:02:48

That will be fun code to test. Once executed in a lifetime and then probably broken.

Thomas Jung 2010-01-18 16:04:00

There definitely should be an automated test for that kind of thing...

Michael Borgwardt 2010-01-18 16:10:06

Answer 4

+1 A:

I think the best way to go is that suggested by Andreas. But my answer is about an interesting related discussion.

You want to generate a sequence of numbers that together form a permutation of S = {1, ..., MAX}. One way to do this is to take the elements of a cyclic group over S. For example, the numbers R = {x modulo p, x^2 modulo p, x^3 modulo p, ..., x^(p-1) modulo p} form a cyclic group over {1, ..., p-1}, provided p is a prime and x is coprime to p. So if you choose MAX as a prime number you do use this sequence.

You want a "tough-to-crack" sequence. A generator for the sufficiently-tough-to-crack sequence is called a pseudorandom generator (ofcourse you probably don't need that tough-to-crack). An example is the last digit of elements in R above, provided p is kept secret (am I correct?). But the answer by Andreas already uses a source of (pseudo-) random numbers, so cannot be called a pseudorandom generator.

If you are interested in pseudorandom generators, they are discussed in detail in volume 2 of Knuth's well-known book.

Amit Kumar 2010-01-18 15:38:57

Answer 5

A:

Here is a though:

ID = each voucher has an unique (auto-incremented?) ID
CHECKSUM = apply N iterations of the Verhoeff or Luhn algorithm on the ID
VOUCHER = base convert the generated CHECKSUM from base 10 to base 36

See also this related SO question: Ideas to create a small (<10 digits), not (very) secure “hash”.

One simple way to make this method more secure is to use a non auto-incremented ID value, one option might be to use ID as the last 6 or 7 digits of a UNIX timestamp and calculate the checksum.

Alix Axel 2010-01-18 15:39:57

I'm not familiar with those algorithms but i'll go read about them, thanks.

Barry Fandango 2010-01-18 15:44:02

So basically you depend on people trying to guess codes to not know N and which algorithm you're using? This is called "security through obscurity", and not good.

Michael Borgwardt 2010-01-18 15:53:48

@Michael: Indeed, but he stated he wanted "low guessability" not "no guessability at all".

Alix Axel 2010-01-18 15:57:00

@Michael: See my edit in a couple of seconds.

Alix Axel 2010-01-18 16:10:25

If you can figure out how to generate valid codes at will, it's not even guessing anymore - the security is simply broken. And all it takes is one person to figure it out (or know it, e.g. a disgruntled employee), and a few days later 10,000 people have downloaded a voucher generator...

Michael Borgwardt 2010-01-18 16:13:39

Using timestamps is even worse - no less predictable, but now you're generating duplicates all the time.

Michael Borgwardt 2010-01-18 16:17:40

Answer 6

+3 A:

I would say to use a "perfect hash" - http://en.wikipedia.org/wiki/Perfect_hash_function combined with a 4-digit random number...

So just increment your voucher code each time, then hash it, add a 4 digit random number and I would also add a check digit to the end (as Alix Axel suggested).

This would be very secure with no clashes - for example if someone worked out your hashing algorithm, they would also have to guess the 4-digit code at the end...

Mark 2010-01-18 15:51:44

Answer 7

+2 A:

Programming Pearls has several examples of algorithms to generate sets of random numbers, you should read it if you're interested in this kind of problem.

The book shows that if you generate m random numbers with value less than n, the simple approach of generating numbers and throwing out duplicates will generate no more than 2m random numbers if m < n / 2. Here it is, in C++:

void gensets(int m, int n)
{
    set<int> S;
    set<int>::iterator i;
    while (S.size() < m) {
        int t = bigrand() % n;
        S.insert(t);
    }
    for (i = S.begin(); i != S.end(); ++i)
        cout << *i << "\n";
}

Obviously, if you're worried about people guessing values, you will want m to be much less than n / 2.

There's even a set-based algorithm to generate m random numbers less than n with each value being equally likely, no duplicates, and a guarantee not to generate more than m random numbers:

void genfloyd(int m, int n)
{
    set<int> S;
    set<int>::iterator i;
    for (int j = n-m; j < n; j++) {
        int t = bigrand() % (j+1);
        if (S.find(t) == S.end())
            S.insert(t); // t not in S
        else
            S.insert(j); // t in S
    }
    for (i = S.begin(); i != S.end(); ++i)
        cout << *i << "\n";
}

The order of the numbers isn't random, though, so this is probably not a good choice for you.

RossFabricant 2010-01-18 15:57:49

Answer 8

+6 A:

Use an N-bit serial number R, combined with an M-bit hash H of the concatenated pair (R, S) where S is some secret "salt" S which you do NOT publish. Then encode the pair (R,H) alphanumerically in any reversible way you like. If you like algorithms like MD5* or SHA, but the bit count is too high, then just take the M least significant bits of a standard hash algorithm.

You can verify easily: decode the alphanumeric encoding so you can see R and H. Then compute H' = hash(R+S) and verify that H = H'.

edit: R can be an incrementing serial number or random or whatever, just make sure you use each value not more than once.

*before someone says "MD5 is broken", let me remind you that the known weaknesses for MD5 are collision attacks, and not preimage attacks. Also, by using an unpublished, secret salt value, you deny an attacker the ability to test your security mechanism, unless he/she can guess the salt value. If you feel paranoid, pick two salt values Sprefix and Ssuffix, and calculate the hash of the concatenated triple (Sprefix,R,Ssuffix).

Jason S 2010-01-18 17:19:40

But what advantage does a cryptographic hash have over random numbers?

Jason Orendorff 2010-01-18 22:35:33

simplicity! pseudorandom numbers used for this purpose may be unguesssable, but they would still need to be unique and verifiable.

Jason S 2010-01-18 22:42:28

Answer 9

+1 A:

This is a summary of the best bits of all the other answers. :)

You need to generate gift card numbers that are:

unique
unguessable

Random numbers are unguessable but not necessarily unique. The numbers produced by various algorithms are unique but guessable (the algorithm can be reverse-engineered). I don't know of a single algorithm that gives both properties, and because of the need to defy reverse engineering, it falls in the domain of cryptography. Non-experts, of course, shouldn't try to design cryptosystems.

Fortunately you don't have to get both properties from the same algorithm. Your gift card codes can consist of two parts: a part that is unique (generated using a linear congruential generator, perhaps, or modulo arithmetic, or even just an integer that you increment each time) and a part that is unguessable (just random numbers).

Jason Orendorff 2010-01-18 22:23:26

You've noted "unique" and "unguessable" but you're missing the property of verifiability. Your solution would work, although it would require a database storage to map each unique ID to a single valid random number.

Jason S 2010-01-18 22:45:19

Jason S: You have to store the codes in a database anyway, to record the amount of the gift and the amount already spent.

Jason Orendorff 2010-01-18 22:48:30

Answer 10

A:

Norman Ramsey 2010-01-22 20:02:28

ansaurus

tags:

views:

answers:

Algorithm for Gift Card Codes

related questions