views:

316

answers:

6

Before I start, I want to point out that I'm pretty sure this actually happened. All my logs suggest that it did.

I'd like to know whether I'm wrong and this is impossible, whether it's just incredibly unlikely (which I suspect), or if it's not that unlikely and I'm doing something fundamentally wrong.

I have 4 instances of the same code running as Windows Services on the same server. This server has a multicore (4) processor.

Here's a summary of the code:

public class MyProcess
{
    private System.Timers.Timer timer;

    // execution starts here
    public void EntryPoint()
    {
        timer = new System.Timers.Timer(15000);  // 15 seconds
        timer.Elapsed += new System.Timers.ElapsedEventHandler(Timer_Elapsed);
        timer.AutoReset = false;

        Timer_Elapsed(this, null);
    }

    private void Timer_Elapsed(object sender, System.Timers.ElapsedEventArgs e)
    {
        string uid = GetUID();

        // this bit of code sends a message to an external process.
        //  It uses the uid as an identifier - these shouldn't clash!
        CommunicationClass.SendMessage(uid);

        timer.Start();
    }

    // returns an 18 digit number as a string
    private string GetUID()
    {
        string rndString = "";
        Random rnd = new Random((int)DateTime.Now.Ticks);
        for (int i = 0; i < 18; i++)
        {
            rndString += rnd.Next(0, 10);
        }
        return rndString;
    }

The external process that receives these messages got confused - I think because the same uid came from two separate processes. Based on that, it appears that the GetUID() method returned the same "random" 18 digit string for two separate processes.

I've seeded the Random class using DateTime.Now.Ticks which I thought would provide protection between threads - one tick is 100 nanoseconds, surely two threads couldn't get the same seed value.

What I didn't account for obviously is that we're not talking about threads, we're talking about processes on a multicore processor. That means that this code can literally run twice at the same time. I think that's what's caused the clash.

Two processes running the same code at approximate 15 second intervals managed to hit the same code inside 100 nanoseconds. Is this possible? Am I on the right track here?

I'd be grateful for your thoughts or suggestions.


To clarify, I can't really use a GUID - the external process I'm communicating with needs an 18 digit number. It's old, and I can't change it unfortunately.

+7  A: 

Unless there is some reason you can't, you should look at using a GUID for this purpose. You will eliminate collisions that way.

Per comment: You could use a GUID and a 64-bit FNV hash and use XOR-folding to get your result to within the 59-bits that you have. Not as collision proof as a GUID, but better than what you have.

JP Alioto
The difficulty is that the legacy process I'm talking to needs an 18 digit number. I would use a GUID, but I can't.
Damovisa
Thanks for the update. I think the moral of the story is don't use random when you mean unique...
Damovisa
I think the moral is "don't feed Random with a time-based seed in a multi-threaded or multi-process scenario".
MusiGenesis
+1  A: 

Yip it can happen, and therefore it did.

You should initialise Random only once, at start-up. If you have many threads starting at the same time, get a copy of DateTime.Now.Ticks and pass it to each thread with a known offset to prevent then initialising on the same time.

Simeon Pilgrim
They're not threads though. Each of these is running in a completely separate process.
Damovisa
+5  A: 

You don't want random numbers for this purpose, you want unique numbers. I'm with @JP. I think you should look at using GUIDs for your message ids.

EDIT: if you can't use a GUID, then think of a way to get a 64-bit number that is unique and use successive 3-bit chunks of it as an index into a 8-character alphabet (tossing the unused upper bits). One way to do this would be to have a database in which you create an entry for each new message with an auto-incremented 64-bit integer as the key. Use the key and translate it into your 18-character message id.

If you don't want to rely on a database you can get something that works under certain conditions. For example, if the messages only need to be unique during the life of the processes, then you could use the process id as 32 bits of the value and get the remaining 22 required bits from a random number generator. Since no two processes running at the same time can have the same id, they should be guaranteed to have unique message ids.

There are undoubtedly many other ways that you could do this if your situation doesn't fit into one of the above scenarios.

tvanfosson
I concur. You should use a GUID.
Babak Naffas
See comment on JP's answer - can't use a GUID.
Damovisa
@Damovisa -- I added some detail to my answer on how to do this if you can't use GUIDs.
tvanfosson
Thanks tvanfosson, some good suggestions in there. I've already changed the code to use the process id in the seed but I might embed it in the random number itself to make sure. Thanks.
Damovisa
+3  A: 

Try using this function for the seed, in place of DateTime.Now.Ticks:

public static int GetSeed()
{
    byte[] raw = Guid.NewGuid().ToByteArray();
    int i1 = BitConverter.ToInt32(raw, 0);
    int i2 = BitConverter.ToInt32(raw, 4);
    int i3 = BitConverter.ToInt32(raw, 8);
    int i4 = BitConverter.ToInt32(raw, 12);
    long val = i1 + i2 + i3 + i4;
    while (val > int.MaxValue)
    {
        val -= int.MaxValue;
    }
    return (int)val;
}

This basically turns a Guid into an int. You could theoretically get duplicates, but it's cosmically unlikely.

Edit: or even just use:

Guid.NewGuid().GetHashCode();

Using DateTime.Now.Ticks, on the other hand, almost guarantees a collision at some point. It's very common in Windows programming to have a timer's resolution specified in units that are far beyond the timer's actual accuracy (I first ran into this with Visual Basic 3.0's timer control, which was set in milliseconds but only actually went off 18 times a second). I don't know this for sure, but I bet if you just ran a loop and printed out DateTime.Now.Ticks, you would see the values quantizing around 15ms intervals or so. So with 4 processes going it's actually really likely that two of them would end up using the exact same seed for the Random function.

Since the Guid-based GetSeed function has a super-tiny chance of producing duplicates, ideally you'd want to create some sort of bank of pre-calculated unique numbers. However, since you're talking about separate processes here, you'd have to come up with some way of caching the values where all the processes could read them, which is a bother.

If you want to worry about cosmically unlikely events, buy lottery tickets.

MusiGenesis
Interestingly, my investigations (post event) showed that the resolution of DateTime.Now in practice is about 15ms... Makes you wonder why they offer Ticks...
Damovisa
Why did Visual Basic use twips instead of pixels? Because Bill Gates hates us personally.
MusiGenesis
+3  A: 

Another way to accomplish this is to not use the Random class as it is fraught with issues like this. You can accomplish the same functionality (a random 18 digit number) using the cryptographic quality random number generator available in System.Security.Cryptography.

I have modified your code to use the RNGCryptoServiceProvider class to generate the id.

// returns an 18 digit number as a string
private string GetUID()
{
    string rndString = "";
    var rnd = new RNGCryptoServiceProvider();
    var data = new byte[18];
    rnd.GetBytes(data); 
    foreach(byte item in data)
    {
        rndString += Convert.ToString((int)item % 10);
    }
    return rndString;
}
Joe Kuemerle
I hadn't heard of that class before - thanks, it could definitely be useful!
Damovisa
A: 

I also concur with the GUID idea.

As for your original problem, since Ticks is a long, this statement:

(int)DateTime.Now.Ticks

will cause an overflow. Not sure what kinda nastiness will happen then...

Ray
It looks casting a long to int just wraps around. It doesn't throw an exception, so it would be OK as a seed.
MusiGenesis
No exception. But the wrapping increases the chances of collision, especially if the timer is quantized.
Ray
Yep, no exception and it just wraps. The value space is that of an int - 32 bits.
Damovisa
I suspect the value space is less than 32 bits because of the timer quantization...
Ray