ansaurus

Question

Generating large numbers of unique random codes

Answer 1

+3 A:

It really depends on the specific problem requirements. Do the codes have to be merely unique or also unpredictable? If they just have to be unique, then you can use a linear congruential random number generator to create your codes.

Wikipedia Page on Linear Congruential Generators

Here's some sample code:

class CodeGenerator
{
    public long Seed
    {
        get { return _value; }
        set { _value = value; }
    }

    private char[] alphabet =
        {
            '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
            'k', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u',
            'v', 'w'
        };

    public String GetCode()
    {
        // Generate the next value in the psuedo-random sequence.
        _value = (362881L * _value + 76552897L) & 0xFFFFFFFFFFFL;

        // Create the code.  Add 2^44 to avoid small codes.
        long code = _value + (1L << 44);

        StringBuilder builder = new StringBuilder("123456789");

        // The codes are all less than 2^45, so we have 45 bits of
        // information and need 9 digits.
        for (int i = 8; i >= 0; i--)
        {
            builder[i] = alphabet[code & 0x1F];
            code = code >> 5;
        }

        return builder.ToString();
    }

    private long _value = 0;
}

The class will generate a sequence of 2^44 codes before repeating (over 17 trillion codes). To resume the sequence, simply record the current Seed value and restore it when you need more codes.

Peter Ruderman 2009-11-24 00:48:50

Hey, good answer but unfortunately I wasn't clear enough - they do need to be non-predictable unfortunately.

Tim Schneider 2009-11-24 02:51:48

Doh! Oh well, you could always encrypt the output. :)

Peter Ruderman 2009-11-24 03:13:51

Then I lose the nice human-readable aspect, unless I shrink down the hash I create from it but at that stage I'll re-introduce the chances of collisions ;)

Tim Schneider 2009-11-24 04:31:51

Answer 2

+3 A:

Have you considered using GUIDs (uniqueidentifiers in SQL Server)? They are unique and mostly random. You can generate them on either the client-side or on the server.

You might also think about using a CLR function on the SQL side, to help minimize the number of DB round-trips.

To ensure uniqueness, one approach is to append a unique, non-random number (such as the value of an identity column) to your random numbers. The result isn't random at the bit-by-bit level, but it is random when taken as a whole.

Generating millions of unique random numbers won't take long. Inserting them into the DB will be the slow part....

RickNZ 2009-11-24 01:03:28

GUID's aren't an option as they will have to conform to the requirements on the contents of the string (Including specific lengths and excluded characters). The reason being they're intended to be typed in by users which means a GUID might be a little cruel.

Tim Schneider 2009-11-24 02:53:11

In that case, the rest of my answer still holds: to create unique, non-guessable random numbers, just create a "regular" random number, and then embed a unique number (such as a counter) in it somewhere, and convert the binary form to a string using whatever character encoding you're comfortable with. The individual random numbers can be generated by any number of means, such as the Random.Next() or RNGCryptoServiceProvider.GetBytes().

RickNZ 2009-11-24 03:41:48

Interesting idea... Would probably have to do a bit of bitwise operations to obfuscate the counter a little, but that's not so bad. Would have to ensure the counter was a large enough value to allow for future expansion (Given we don't necessarily know how many codes we'll need at the start) which could have a bit of an impact on the randomness of shorter phrases... might be a matter of coming up with a limit formula that maintains randomness. I'll have to have a bit of a think about this one... but definitely given me food for thought

Tim Schneider 2009-11-24 05:36:44

Answer 3

A:

Generate all of them? In your first case you have a total of 35 characters per a position. Total storage is (base^positions) - 1 so your total number of combinations on the low end is 36^9 - 1 or 101,559,956,668,415 which is nearly a TB if the codes are one byte long...which they are not. And that's at the low end.

The best system is to pre-generate batches of valid numbers and feed these into the insertions. If the method of generation is semi-random then You can do this easily by dividing the random spaces using segments of the bit array. But you don't mention how random is random.

Of course if you have complete control over the randomness then you could just use UUIDs, which is what we do.

nick 2009-11-24 01:10:50

When I say "all" I'm referring to "all one million codes" (Or whatever size the batch is), not "all possible codes". I'm more concerned with the implications of the "calculate in ram" approach when dealing with adding codes to a set that already has a few million (Obviously if I was trying to add 1 code it'd be faster to generate random codes and just check if they're in SQL for each code than to pull the list down in advance, but I'm not sure how the comparison would go if I wanted to generate another million)

Tim Schneider 2009-11-24 02:59:30

Answer 4

A:

For generating highly unpredictable random values, may I suggest using the System.Security.Cryptography.RNGCryptoServiceProvider class.

Sample code for generating abitrary length strings of random characters from a pre-defined set shown below. This has been used in a password generator.

private string GetRandomAlphanumericCharacters(int length)
{
    // Note: i, o, l, 0, and 1 have been removed to reduce 
    // chances of user typos and mis-communication of passwords.
    char[] allowedCharacters = { 'a', 'A', 'b', 'B', 'c', 'C', 'd', 'D', 'e', 'E', 'f', 'F', 'g', 'G', 'h', 'H', /*'i', 'I',*/ 'j', 'J', 'k', 'K', /*'l', 'L',*/ 'm', 'M', 'n', 'N', /*'o', 'O',*/ 'p', 'P', 'q', 'Q', 'r', 'R', 's', 'S', 't', 'T', 'u', 'U', 'v', 'V', 'w', 'W', 'x', 'X', 'y', 'Y', 'z', 'Z', /*'0', '1',*/ '2', '3', '4', '5', '6', '7', '8', '9' };

    // Create a byte array to hold the random bytes.
    byte[] randomNumber = new byte[length];

    // Create a new instance of the RNGCryptoServiceProvider.
    RNGCryptoServiceProvider Gen = new RNGCryptoServiceProvider();

    // Fill the array with a random value.
    Gen.GetBytes(randomNumber);

    string result = "";

    foreach (byte b in randomNumber)
    {
        // Convert the byte to an integer value to make the modulus operation easier.
        int rand = Convert.ToInt32(b);

        // Return the random number mod'ed.
        // This yeilds a possible value for each character in the allowable range.
        int value = rand % allowedCharacters.Length;

        char thisChar = allowedCharacters[value];

        result += thisChar;
    }

    return result;
}

saille 2009-12-08 02:15:14

Thanks, but the key part was Unique.

Tim Schneider 2009-12-08 03:07:41

Here's an alternate solution: generate some random salt bytes, append/prepend your unique value to the byte array, then encrypt. The result will be a random, unique byte array.

saille 2009-12-08 03:57:36

ansaurus

tags:

views:

answers:

Generating large numbers of unique random codes

Clarifications

related questions