I have a large array of structures, like this:

typedef struct
{
    int a;
    int b;
    int c;
    etc...
}
data_type;

data_type data[100000];

I have a bunch of separate threads, each of which will want to make alterations to elements within data[]. I need to make sure that no two threads attempt to access the same data element at the same time. To be precise: one thread performing data[475].a = 3; and another thread performing data[475].b = 7; at the same time is not allowed, but one thread performing data[475].a = 3; while another thread performs data[476].a = 7; is allowed. The program is highly speed-critical. My plan is to make a separate critical section for each data element, like so:

typedef struct
{
    CRITICAL_SECTION critsec;
    int a;
    int b;
    int c;
    etc...
}
data_type;

In one way I guess it should all work and I should have no real questions, but not having had much experience in multithreaded programming I am just feeling a little uneasy about having so many critical sections. I'm wondering if the sheer number of them could be creating some sort of inefficiency. I'm also wondering if perhaps some other multithreading technique could be faster? Should I just relax and go ahead with plan A?

A: 

You could also consider a mutex. It is a nice method: each thread reserves the resource for itself with the mutex (mutual exclusion).

This is the more common approach, and some libraries support it together with threads. Read about boost::thread and its mutexes.

With your approach:

data_type data[100000];

I'd be afraid of stack overflow, unless you're allocating it on the heap.

EDIT:

boost::mutex uses Win32 critical sections on Windows.

bua
I thought mutexes were supposed to be much slower?
Mick
Don't worry. It's on the heap, but thanks for the heads up.
Mick
A critical section is implemented as a spinlock with a fixed spin count, after which it grabs a mutex.
Christopher
I've created large scalable tool with boost mutexes (high performance was the requirement), and there were no problems.
bua
on msdn it says: "critical section objects provide a slightly faster, more efficient mechanism for mutual-exclusion synchronization" - see http://msdn.microsoft.com/en-us/library/ms682530%28VS.85%29.aspx
Mick
In the book "Multithreading Applications in Win32" it says: "It takes almost 100 times longer to lock an unowned mutex than it does to lock an unowned critical section because the critical section can be done in user mode without involving the kernel"
Mick
-1, indeed a bad idea.
MSalters
I assume this is a good idea, since boost::mutex uses critical sections on Windows.
bua
Confusion of terms. In Microsoft Windows context (which we can infer from the tags), mutexes are cross-process kernel-level synchronization objects whereas Critical Sections are in-process and often require no kernel-mode switch. boost::mutex is closer to the CompSci concept of MUTual EXclusion, which doesn't deal with processes at all.
MSalters
Yes - it's quite a pain to try to talk to someone in Computer Science terms when all they understand is what they've read on MSDN.
Tom
Or vice versa :)
MSalters
+3  A: 

With this many objects, most of their critical sections will be unlocked, and there will be almost no contention. As you already know (other comment), critical sections don't require a kernel-mode transition if they're unowned. That makes critical sections efficient for this situation.

The only other consideration is whether you want the critical sections inside your objects or in a separate array. Locality of reference is a good reason to put the critical sections inside the objects: when you enter the critical section, an entire cache line (typically 64 bytes on x86) is pulled into cache. With a bit of padding, you can make sure each object starts on a cache line boundary, so the object will already be (partially) in cache once its critical section is entered.

MSalters
A: 

Your plan is worth trying, but I think you will find that Windows is unhappy creating that many Critical Sections. Each CS contains some kernel handle(s) and you are using up precious kernel space. I think, depending on your version of Windows, you will run out of handle memory and InitializeCriticalSection() or some other function will start to fail.

What you might want to do is have a pool of CSs available for use, and store a pointer to the 'in use' CS inside your struct. But this gets tricky quite quickly, and you will need atomic operations to set/clear the CS pointer (to atomically flag the array entry as 'in use'). You might also need some reference counting, etc.

Gets complicated.

So try your way first, and see what happens. We had a similar situation once, and we had to go with a pool, but maybe things have changed since then.

tony
"Each CS contains some kernel handle(s)" are you sure? The other answers imply that they don't. Are you sure you're not getting mixed up with mutexes?
Mick
I could. But I think the idea is that CSes, *IIRC*, avoid the kernel when there is no contention (i.e. just a single atomic op to gain ownership of an unowned CS), but if there *is* contention, then you need to go to the kernel. And it has been a while since I read SysInternals or similar, but I think there was lots of extra stuff in there as well: bookkeeping, almost debug-level stuff.
tony
+1  A: 

Depending on the data member types in your data_type structure (and also depending on the operations you want to perform on those members), you might be able to forgo using a separate synchronization object, using the Interlocked functions instead.

In your sample code, all the data members are integers, and all the operations are assignments (and presumably reads), so you could use InterlockedExchange() to set the values atomically and InterlockedCompareExchange() to read the values atomically.

If you need to use non-integer data member types, or if you need to perform more complex operations, or if you need to coordinate atomic access to more than one operation at a time (e.g., read data[1].a and then write data[1].b), then you will have to use a synchronization object, such as a CRITICAL_SECTION.

If you must use a synchronization object, I recommend that you consider partitioning your data set into subsets and use a single synchronization object per subset. For example, you might consider using one CRITICAL_SECTION for each span of 1000 elements in the data array.

Paul Place
A: 

As others have pointed out, yes, there is an issue, and it is called too-fine-grained locking. It is wasteful of resources: even though the chances are small, you will end up creating a lot of backing primitives and bookkeeping data whenever elements do see the occasional longer-than-usual contention. You are also spending memory on something that is no longer a trivial data structure, as in VM implementations, for example.

If I recall correctly, you will have a higher chance of an SEH exception from that point onwards on Win32, or just higher memory usage. Partitioning the locks and pooling them is probably the way to go, but it is a more complex implementation. Partitioning on something else (the action, say) and accepting some short-lived contention is another way to deal with it.

In any case, it is a problem of resource management with what you have right now.

rama-jka toti