views:

157

answers:

5

I need to provide synchronization to some members of a structure.
If the structure is something like this

struct SharedStruct {
    int Value1;
    int Value2;
};

and I have a global variable

SharedStruct obj;

I want a write from one processor, such as

 obj.Value1 = 5; // Processor B

to be immediately visible to the other processors, so that when I test the value

 if(obj.Value1 == 5) { DoSmth(); } // Processor A
 else DoSmthElse();

I get the new value, not some old value from the cache.
First I thought that using volatile when writing/reading the values would be enough, but I read that volatile can't solve this kind of issue.
The members are guaranteed to be properly aligned on 2/4/8 byte boundaries, and writes should be atomic in this case, but I'm not sure how the cache could interfere with this.
Would memory barriers (mfence, sfence, etc.) be enough? Or are some interlocked operations required?
Or maybe something like

lock mov addr, REGISTER

?
The easiest would obviously be some locking mechanism, but speed is critical and I can't afford locks :(

Edit
Maybe I should clarify a bit. The value is set only once (it behaves like a flag), and all the other threads just need to read it. That's why I think there may be a way to force the read of this new value without using locks.
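
Roughly, this is the pattern I'm after (just a sketch; I'm writing it with C++11-style std::atomic only to make the intent explicit, and DoSmth/DoSmthElse are placeholders):

#include <atomic>

void DoSmth()     {}                      // placeholder
void DoSmthElse() {}                      // placeholder

std::atomic<int> Value1(0);               // written once, then only read

void SetFlag()                            // Processor B
{
    Value1.store(5, std::memory_order_release);
}

void TestFlag()                           // Processor A
{
    if (Value1.load(std::memory_order_acquire) == 5) { DoSmth(); }
    else DoSmthElse();
}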

Thanks in advance!

+7  A: 

There Ain't No Such Thing As A Free Lunch. If your data is being accessed from multiple threads, and it is necessary that updates are immediately visible to those other threads, then you have to protect the shared struct with a mutex, a readers/writers lock, or some similar mechanism.
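
A minimal sketch of the mutex approach, shown with std::mutex purely for illustration (a CRITICAL_SECTION or pthread mutex plays the same role); DoSmth/DoSmthElse are the placeholders from the question:

#include <mutex>

void DoSmth()     {}                      // placeholder
void DoSmthElse() {}                      // placeholder

struct SharedStruct {
    int Value1;
    int Value2;
};

SharedStruct obj;
std::mutex objLock;                       // protects every access to obj

void Writer()                             // Processor B
{
    std::lock_guard<std::mutex> guard(objLock);
    obj.Value1 = 5;
}

void Reader()                             // Processor A
{
    std::lock_guard<std::mutex> guard(objLock);
    if (obj.Value1 == 5) { DoSmth(); }
    else DoSmthElse();
}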

Performance is a valid concern when synchronizing code, but it is trumped by correctness. Generally speaking, aim for correctness first and then profile your code. Worrying about performance when you haven't yet nailed down correctness is premature optimization.

peterb
+2  A: 

I second peterb's answer to aim for correctness first. Yes, you can use memory barriers here, but they will not do what you want.

You said immediately. However immediate this update may be, you could (and will) end up with the if() clause being evaluated first, then the flag being set, and then DoSmthElse() being executed afterwards. This is called a race condition...

You want to synchronize something, it seems, but it is not this flag.

gimpf
Moral: threads are a lot like frames of reference in special relativity: what's "immediate" in one is not guaranteed to be "immediate" in another ;-)
Steve Jessop
A: 

Making the field volatile should make the change "immediately" visible to other threads, but there is no guarantee that the instant at which thread A executes the update doesn't fall after thread B tests the value but before thread B executes the body of the if/else statement.

It sounds like what you really want to do is make that if/else statement atomic, and that will require either a lock, or an algorithm that is tolerant of this sort of situation.

Tyler McHenry
The statement is somewhat atomic... a spinlock I wrote myself protects the region of code, including the if/else statement (I can't afford to use as much memory as a CRITICAL_SECTION does; this is part of a memory allocator I'm writing for Uni). But I'm not sure that when I read the value through volatile I get the new value, rather than something stale from the cache.
lgratian
Trying to do concurrent access to data without using locks is going to fail. It will not work. At all. Every functional malloc implementation in the entire world that has ever been written since the beginning of time uses critical sections. There's a reason for this.
peterb
volatile will not solve memory synchronization issues. The problem volatile solves is completely orthogonal to synchronization (except in Java and C#).
caspin
@caspin ...which is what I said. My point was that the OP's question about "immediate visibility" of the change was the wrong question, since he can get that but it doesn't solve his problem.
Tyler McHenry
@peterb There certainly exist such things as lock-free and wait-free algorithms that can work without critical sections in the presence of appropriate hardware support. The question doesn't give enough information about the problem to determine if such a thing could be used in the OP's case.
Tyler McHenry
What I am trying to achieve is not a lock-free algorithm. The region of code is locked by my spinlock. I am just not sure whether the change will be visible to other threads (after they acquire the same lock). I need to know what, for example, a CRITICAL_SECTION does to ensure that the change is visible, and implement that in my memory-constrained case.
lgratian
Then volatile will be what you want. That causes the compiler to modify the variable directly in memory, and read directly from memory, without any form of local caching. A volatile variable will always have its "most recent" value available to all threads, although what "most recent" means in a local context is dependent on synchronization. If you spinlock the variable each time you access it, then marking it volatile should be all you need to do to maintain agreement on what its value is.
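E.g., a minimal sketch of that spinlock pattern, written with C++11 atomics purely for illustration (the acquire on lock and the release on unlock are the same visibility guarantees a CRITICAL_SECTION gives you):

#include <atomic>

class SpinLock {
    std::atomic_flag locked = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // acquire: once test_and_set succeeds, everything written
        // before the previous unlock() is visible to this thread
        while (locked.test_and_set(std::memory_order_acquire)) {
            /* spin */
        }
    }
    void unlock() {
        // release: publishes the writes made inside the critical section
        locked.clear(std::memory_order_release);
    }
};

SpinLock flagLock;
int Value1 = 0;                           // only ever accessed under flagLock

void SetFlag()                            // writer
{
    flagLock.lock();
    Value1 = 5;
    flagLock.unlock();
}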
Tyler McHenry
I can't understand why this answer was downvoted; yes, it's a dupe, but this seems modern. However, lgratian, it is usually helpful for others to also update the question, and not only give information in comments, because now the correct answer is the last comment of the last answer, which cannot be the intention.
gimpf
+2  A: 

Use explicit atomic instructions; I believe most compilers offer these as intrinsics. Compare-and-exchange is a particularly useful one.

If you intend to write a lockless algorithm, you need to write it so that your changes only take effect when conditions are as expected.

For example, if you intend to insert a linked list object, use the compare/exchange stuff so that it only inserts if the pointer still points at the same location when you actually do the update.
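
A rough sketch of that idea for pushing a node at the head of a singly linked list, spelled with C++11 compare_exchange (the compiler intrinsics mentioned above do the same job):

#include <atomic>

struct Node {
    int   value;
    Node* next;
};

std::atomic<Node*> head(nullptr);

void Push(Node* n)
{
    Node* old = head.load(std::memory_order_relaxed);
    do {
        n->next = old;                    // link against the snapshot we saw
        // publish only if head is still 'old'; otherwise 'old' is
        // refreshed with the current head and we retry
    } while (!head.compare_exchange_weak(old, n,
                                         std::memory_order_release,
                                         std::memory_order_relaxed));
}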

Or if you are going to decrement a reference count and free the memory at count 0, you will want to pre-free it by making it unavailable somehow, check that the count is still 0 and then really free it. Or something like that.
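
For the reference-count part, the basic decrement-and-free-at-zero step (just the simple form, without the "pre-free" refinement described above) looks roughly like this:

#include <atomic>

struct Object {
    std::atomic<int> refCount;
    // ... payload ...
};

void Release(Object* obj)
{
    // fetch_sub returns the value *before* the decrement, so a result
    // of 1 means this call dropped the last reference
    if (obj->refCount.fetch_sub(1, std::memory_order_acq_rel) == 1) {
        delete obj;
    }
}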

Using a lock, operate, unlock design is generally a lot easier. The lock-free algorithms are really difficult to get right.

Zan Lynx
+3  A: 

All the other answers here seem to hand wave about the complexities of updating shared variables using mutexes, etc. It is true that you want the update to be atomic. And you could use various OS primitives to ensure that, and that would be good programming style.

However, on most modern processors (certainly the x86), writes of small, aligned scalar values are atomic and immediately visible to other processors due to cache coherency. So in this special case, you don't need all the synchronizing junk; the hardware does the atomic operation for you. Certainly this is safe with 4-byte values (e.g., "int" in 32-bit C compilers).

So you could just initialize Value1 with an uninteresting value (say 0) before you start the parallel threads, and simply write other values there. If the goal is exiting a loop on a fixed value (e.g., if Value1 == 5), this will be perfectly safe.
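
A rough sketch of that pattern (I've added volatile on the field only so the compiler doesn't cache the read in a register while polling; the cross-processor visibility is the hardware's doing, as described above):

void DoSmth() {}                          // placeholder

struct SharedStruct {
    volatile int Value1;                  // written once by one thread, polled by the others
    int          Value2;
};

SharedStruct obj = { 0, 0 };              // uninteresting initial value, set before threads start

void Writer()                             // one processor
{
    obj.Value1 = 5;                       // single aligned 4-byte store
}

void Reader()                             // the other processors
{
    while (obj.Value1 != 5) {             // exit the loop on the fixed value
        /* spin or do other work */
    }
    DoSmth();
}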

If you insist on capturing the first value written, this won't work. But if you have a parallel set of threads, and any value written other than the uninteresting one will do, this is also fine.

Ira Baxter
Thanks for the answer. Something like this is what I was hoping to hear :)
lgratian