views: 193 · answers: 6

When you have a situation where Thread A reads some global variable and Thread B writes to the same variable, then, assuming the read/write is atomic on a single core, you might get away without synchronizing. But what happens when running on a multi-core machine?

+1  A: 

For a non-atomic operation on a multi-core machine, you need to use a system-provided mutex to synchronize the accesses.

For C++, the boost mutex library provides several mutex types that provide a consistent interface for OS-supplied mutex types.

If you choose to look at boost as your syncing / multithreading library, you should read up on the Synchronization concepts.

John Gietzen
so OS does not sync between CPU cores?
Tony
No, synchronization is NOT automatic.
John Gietzen
+6  A: 

It will have the same pitfalls as on a single core, but with additional latency due to the cache-coherence traffic that must take place between cores.

Note - "you can do it without synchronizing" is not always a true statement.

Amardeep
+1 for pointing out that this is unsafe even with a single-core machine.
A: 

Depending on your situation, the following may be relevant. It won't make your program run incorrectly, but it can make a big difference in speed: even if the threads aren't accessing the same memory location, you may take a performance hit from cache effects if two cores are thrashing over the same cache line (though not the same location, because you carefully synchronized your data structures).

There is a good overview of "false sharing" here: http://www.drdobbs.com/go-parallel/article/showArticle.jhtml;jsessionid=LIHTU4QIPKADTQE1GHRSKH4ATMY32JVN?articleID=217500206

Paul Rubel
+7  A: 

Even on a single core, you cannot assume that an operation will be atomic. That may be the case where you're coding in assembler but, if you are coding in C++ as per your question, you do not know what it will compile down to.

You should rely on the synchronisation primitives at the level of abstraction that you're coding to. In your case, that's the threading calls for C++, whether they be pthreads, Windows threads or something else entirely.

It's the same reasoning that I gave in another answer to do with whether i++ was thread-safe. The bottom line is, you don't know since you're not coding to that level (if you're doing inline assembler and/or you understand and can control what's going on under the covers, you're no longer coding at the C++ level and you can ignore my advice).

The operating system and/or OS-level libraries know a great deal about the environment they're running in, far more so than the C++ compiler would. Use of proper synchronisation primitives will save you a great deal of angst.

paxdiablo
+1, really good point. Note that very strange things can happen when you code at a different level of abstraction. The compiler might even decide that you did not want to write that variable to memory in the first place, i.e. it can cache it in a register and never generate instructions to write the variable back to main memory where it can be seen by other threads.
David Rodríguez - dribeas
Of course, even if you're doing inline asm, you still can't be sure of synchronization, due to the CPU reordering instructions.
jalf
@jalf, sure you can, just make every second instruction a serialising CPUID and hang the performance impacts :-)
paxdiablo
@paxdiablo: touché ;)
jalf
+5  A: 

Even on a single-core machine, there is absolutely no guarantee that this will work without explicit synchronization.

There are several reasons for this:

  • the OS may interrupt a thread at any time (between any two instructions), and then run the other thread, and
  • if there is no explicit synchronization, the compiler may reorder instructions very liberally, breaking any guarantees you thought you had, and
  • even the CPU may do the same, reordering instructions on the fly.

If you want correct communication between two threads, you need some kind of synchronization. Always, with no exception.

That synchronization may be a mutex provided by the OS or the threading API, or it may be CPU-specific atomic instructions, or just a plain memory barrier.

jalf
A: 

As far as the (new) C++ standard is concerned, if a program contains a data race, the behavior of the program is undefined. A program has a data race if there is an interleaving of threads such that it contains two neighboring conflicting memory accesses from different threads (which is just a very formal way of saying "a program has a data race if two conflicting accesses can occur concurrently").

Note that it doesn't matter how many cores you're running on: the behavior of your program is undefined (notably, the optimizer can reorder instructions as it sees fit).

avakar