You can allocate state in thread-local memory. Thread-local memory is thread safe (as long as you don't pass pointers around).
You can use two integers to generate the unique number, and only one of them needs to be synchronized.
Integer 1: an incrementing integer that identifies the thread. A new number is generated each time you initialize a thread (which should be a rare event).
Integer 2: on thread initialization, this integer starts at 0.
You use both integers -- which are stored in thread-local memory -- together as the unique number, and integer 2 is incremented normally (unlocked).
This way the generation of unique integers is absolutely thread safe -- i.e., you don't have to use an atomic CPU instruction such as Interlocked.Increment (which does incur hardware-level performance penalties).
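A minimal sketch of this scheme (in Java, with `AtomicInteger` playing the role of Interlocked.Increment; all names here are my own, not from the original):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Two-integer scheme: a synchronized counter hands out a thread id once
// per thread (rare), and a plain per-thread counter is incremented
// without any locking on the hot path.
public final class ThreadLocalUidGenerator {
    // Integer 1: incremented only the first time a thread asks for an id,
    // so the atomic cost is paid once per thread.
    private static final AtomicInteger nextThreadId = new AtomicInteger();

    // Per-thread state: { threadId, Integer 2 (the unlocked counter) }.
    private static final ThreadLocal<long[]> state = ThreadLocal.withInitial(
            () -> new long[] { nextThreadId.getAndIncrement(), 0 });

    // Pack (threadId, localCounter) into one 64-bit unique id.
    // No lock and no interlocked instruction here.
    public static long nextId() {
        long[] s = state.get();
        return (s[0] << 32) | (s[1]++ & 0xFFFFFFFFL);
    }
}
```

Each thread draws from its own sequence, so two threads can never produce the same (threadId, counter) pair even though the counter itself is unsynchronized.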
-- edit: cache coherence --
from:
> **Cache Coherence**
>
> To decrease the time required for memory access, different caches are used: recently accessed memory is duplicated in the CPU cache, which is significantly faster than common memory. Future accesses to the same address will use the data saved in the cache, decreasing fetch time. The problems appear in SMP (symmetric multiprocessing) systems, where several processors have their own cache memory: when one processor changes a variable in a memory region used by several processors simultaneously, it actually changes its own copy of the variable, located in its cache, while the shared variable still has the original value. This problem cannot be solved by using the `volatile` keyword on a shared variable, since that only guarantees that a write-to-memory instruction will be present in the resulting program, while the operations related to the cache remain unspecified. Of course, it is possible to disable the CPU cache by mapping memory as no-cache (the `PAGE_NOCACHE` protection flag in the `VirtualAlloc()` Win32 API function), but along with a significant slowdown this imposes some limitations: for example, interlocked instructions may raise a hardware exception on no-cache memory.
>
> For the correct operation of SMP systems, data stored in the cache of more than one processor should be the same in all caches. This means that CPU caches must be synchronized (kept coherent) at the hardware level. **But it is important to note that cache synchronization (the flow of cache-coherency traffic) happens asynchronously with program execution:** when one CPU changes the value of a shared variable, another CPU temporarily observes the old value. The CPU continues execution without waiting for the cache-coherency operation to complete. Furthermore, if two variables (a, then b) were changed by the first CPU, another CPU could observe that b changed earlier than a.
>
> Interlocked instructions differ considerably on this point. An interlocked instruction is a command to operate directly on physical memory under a locked bus. Because of this, cache inconsistency does not affect programs in which shared variables are accessed only with interlocked instructions (note that both sides, the one that reads the variable and the one that writes to it, must use interlocked instructions).
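The quoted requirement that both the reader and the writer go through interlocked operations can be sketched like this (a Java sketch, with `AtomicLong` standing in for the `Interlocked.*` APIs; the class name is my own):

```java
import java.util.concurrent.atomic.AtomicLong;

// Both sides of a shared counter go through the atomic (interlocked)
// API. A plain, non-atomic read of an ordinary field could observe a
// stale value while coherency traffic is still in flight.
public class SharedCounter {
    private final AtomicLong value = new AtomicLong();

    // Writer: atomic read-modify-write (the "locked bus" operation).
    public void increment() {
        value.incrementAndGet();
    }

    // Reader: atomic load, so it cannot observe a stale cached value.
    public long read() {
        return value.get();
    }
}
```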
-- edit: further clarification --
Under your current design, Interlocked.Increment is indeed your best bet, but it is far from ideal. Your CPU has a very fast on-chip cache (often clocked at the same speed as the CPU). If your state is local to your thread, it will be pulled into the cache of the CPU your thread is running on, which means your CPU won't have to go to main memory and can run at full speed.
If you use interlocked increment, your CPU will have to:
- lock the bus,
- increment the 32-bit word,
- release the bus.
You can do without this. I might appear to be pedantic, as the overhead might only amount to roughly doubling the cost of each increment. However, in an industrial server app with 4 physical CPUs and 16 cores, with this UID generator firing on every request... trust me, your bus will be hammered. Micro-optimization is an important area of programming, especially now that we are scaling horizontally.
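To make the contrast concrete, here is a rough sketch (again Java, names mine) comparing the two hot paths: an atomic increment, which triggers the lock/increment/release sequence in hardware, versus a plain increment of thread-local state. The timings vary wildly by hardware and JIT behavior, so treat this as the shape of an experiment rather than a benchmark:

```java
import java.util.concurrent.atomic.AtomicLong;

// Rough sketch, not a rigorous benchmark: compares n atomic (interlocked)
// increments against n plain increments of local state.
public class IncrementCost {
    public static long[] run(int n) {
        AtomicLong shared = new AtomicLong(); // each increment is a locked RMW
        long plain = 0;                       // thread-local: an ordinary add

        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) shared.incrementAndGet();
        long atomicNs = System.nanoTime() - t0;

        long t1 = System.nanoTime();
        for (int i = 0; i < n; i++) plain++;
        long plainNs = System.nanoTime() - t1;

        // Both paths produce the same count; only the cost differs.
        return new long[] { shared.get(), plain, atomicNs, plainNs };
    }
}
```

Note this is single-threaded, so it only captures the uncontended cost of the locked instruction; under real contention across cores, the cache-line ping-ponging makes the gap much worse.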