ansaurus

Question

How can I create a unique 7-digit code for an entity?

Answer 1

+2 A:

i would suggest using a guid instead of a 7 digit code as it will be more unique and you don't have to worry about generateing them as .NET will do this for you.

harryovers 2010-02-11 15:05:42

`System.Guid.NewGuid();`

harryovers 2010-02-11 15:06:55

He said people have to share this number with other people. GUIDs are not ... shall we say ... *ideal* ... for that purpose.

T.J. Crowder 2010-02-11 15:07:09

Yes, exactly. For a bit more background, this is essentially a "group" that people can join. It's difficult to tell people to join group a53df3d0-171f-11df-8a39-0800200c9a66.

Damovisa 2010-02-11 15:11:06

@Damovisa: And you wouldn't want to join that group even if someone invited you to, would you? ;-) *ducks and runs*

T.J. Crowder 2010-02-11 15:27:34

Answer 2

+4 A:

Honestly, if you want to generate only a couple of thousand 7-digit codes, while 10 million different codes will be available, I think just generating a random one and checking for a collision is good enough.

The chance of a collision on the first hit will be, in the worst case scenario, about 1 in a thousand, and the computational effort to just generate a new 7-digit code and check for a collision again will be much smaller than keeping a dictionary, or similar solutions.

Using a GUID instead of a 7-digit code as harryovers suggested will also certainly work, but of course a GUID will be slightly harder to remember for your users.

Aistina 2010-02-11 15:08:44

Whilst in reality it is unlikely to happen, never forget that the actual worst case scenario for generating in this way is infinite.

Robin Day 2010-02-11 15:22:11

@Robin, luckily the chance for that is also infinitely small :)

Aistina 2010-02-11 20:44:02

Answer 3

+2 A:

All solutions for a "unique" ID must have a database somewhere: Either one which contains the used IDs or one with the free IDs. As you noticed, the database with free IDs will be pretty big so most often, people use a "used IDs" database and check for collisions.

That said, some databases offer a "random ID" generator/sequence which already returns IDs in a range in random order.

This works by using a random number generator which can create all numbers in a range without repeating itself plus the feature that you can save it's state somewhere. So what you do is run the generator once, use the ID and save the new state. For the next run, you load the state and reset the generator to the last state to get the next random ID.

Aaron Digulla 2010-02-11 15:09:13

Thanks, you're right of course - I'll have to store that "unique" Id somewhere. Do you have any more information on database random sequence generators? I'm currently using SQL Server Express 2008.

Damovisa 2010-02-11 15:13:15

@Damovisa: See SideShowCoder's answer about LFSRs.

Aaron Digulla 2010-02-11 17:08:38

Answer 4

+5 A:

You could use an incrementing ID and then XOR it on some fixed key.

const int XORCode = 12345;

private int Encode(int id)
{
    return id^XORCode;
}

private int Decode(int code)
{
    return code^XORCode;
}

Robin Day 2010-02-11 15:10:51

I'll have to investigate this further, but that sounds like it could work... You're still likely to get clusters of codes though aren't you?

Damovisa 2010-02-11 15:20:31

It depends on your ids and your choice of key. As you only expect ~10000 items then you should be able to play with the data and see what outcome you get.

Robin Day 2010-02-11 15:29:04

Answer 5

A:

With only thousands of items in the database, your original idea seems sound. Checking the existance of a value in a sorted (indexed) list of a few tens of thousands of items would only require a few data fetches and comparisons.

Pre-generating the list doesn't sound like a good idea, because you will either store way more numbers than are necessary, or you will have to deal with running out of them.

Jeffrey L Whitledge 2010-02-11 15:11:52

Well, he has only seven digits -- he'll run out of them in the same time whether he stores them in a database or not. :-)

T.J. Crowder 2010-02-11 15:14:03

Based on the question, it is unlikely that he will use up all ten million possible numbers. Therefore, it's wasteful to store all ten million. So how many does he need to store? A hundred thousand is also wasteful. Ten thousand is probably not enough. Finding the proper balance is a trade-off, and it doesn't sound like a good plan to me. I prefer solutions that are guaranteed to operate well under all potential scenerios.

Jeffrey L Whitledge 2010-02-11 15:22:31

Answer 6

+2 A:

I assume you'll have a table of the generated ones. In that case, I don't see a problem with picking random numbers and checking them against the database, but I wouldn't do it individually. Generating them is cheap, doing the DB query is expensive relative to that. I'd generate 100 or 1,000 at a time and then ask the DB which of those exists. Bet you won't have to do it twice most of the time.

T.J. Crowder 2010-02-11 15:13:25

Not a bad idea...

Damovisa 2010-02-11 15:14:48

Answer 7

+11 A:

Pick a 7-digit prime number A, and a big prime number B, and

int nth_unique_7_digit_code(int n) {
    return (n * B) % A;
}

The count of all unique codes generated by this will be A.

If you want to be more "secure", do pow(some_prime_number, n) % A, i.e.

static int current_code = B;
int get_next_unique_code() {
   current_code = (B * current_code) % A;
   return current_code;
}

KennyTM 2010-02-11 15:15:51

Do you have any background info on why this works?

spoulson 2010-02-11 15:32:53

I'm not entirely sure what this is doing, but it looks vaguely similar to the important part of the RSA algorithm. I think.

rmeador 2010-02-11 15:35:30

There is no point in picking a B that is larger than A.

GregS 2010-02-11 15:36:46

It is elementary number theory, and it works because A is prime and the GCD(A,B) = 1. It guarantees no repeats, and the result "looks" random.

GregS 2010-02-11 15:39:27

@GregS: Picking B >> A makes the pattern not so clear. For the `pow` method it doesn't matter.

KennyTM 2010-02-11 15:41:31

Provided performance on this isn't terrible (I can't see it being too bad), I think I'll end up using this one. Thanks!

Damovisa 2010-02-11 23:26:43

@KennyTM: your B and (B % A) are equivalent, so you might as well pick the smaller (B % A).

GregS 2010-02-13 00:56:17

Note that given a few consecutive numbers you can easily calculate A and B, so this is in no way cryptographically secure. But if the goal is just to *look* random, then it should work as advertised.

Rasmus Faber 2010-02-17 16:56:41

Thanks! I needed something like that. I used [node id] + [timestamp] + [the unique code]. All this nicely base58 encoded into 10 chars for my DB key. For the number of keys a node can generate I'm covered. Spanks alot!

Derick 2010-04-11 21:11:18

Answer 8

A:

Probability of having hits is very low.
For instance - you have 10^4 users and 10^7 possible IDs.
Probability that you pick used ID 10 times in row is now 10^-30.
This chance is lower than once in a lifetime of any person.

ralu 2010-02-11 15:19:05

Ahem, birthday paradox.

KennyTM 2010-02-11 15:20:38

It does not apply here. He should add the code for checking hits but there is low probability that he will have to try more than few times.

ralu 2010-02-11 15:23:37

Answer 9

+2 A:

You have <10.000 items, so you need only 4 digits to store a unique number for all items. Since you have 7 digits, you have 3 digits extra.

If you combine a unique sequence number of 4 digits with a random number of 3 digits, you will be unique and random. You increment the sequence number with every new ID you generate.

You can just append them in any order, or mix them.

seq = abcd, rnd = ABC

You can create the following ID's:

abcdABC
ABCabcd
aAbBcCd

If you use only one mixing algorithm, you will have unique numbers, that look random.

GvS 2010-02-11 15:48:22

Answer 10

+1 A:

I would try to use an LFSR (Linear feedback shift register) the code is really simple you can find examples everywhere ie Wikipedia and even though it's not cryptographically secure it looks very random. Also the implementation will be very fast since it's using mainly shift operations.

SideShowCoder 2010-02-11 15:50:53

Answer 11

A:

Well, you could ask the user to pick their own 7-digit number and validate it against the population of existing numbers (which you would have stored as they were used up), but I suspect you would be filtering a lot of 1234567, 7654321, 9999999, 7777777 type responses and might need a few RegExs to achieve the filtering, plus you'd have to warn the user against such sequences in order not to have a bad, repetitive, user input experience.

Gordon Mackie JoanMiro 2010-02-11 16:02:24

ansaurus

tags:

views:

answers:

How can I create a unique 7-digit code for an entity?

related questions