It is easy to set memory barriers on the kernel side: the macros mb, wmb, rmb, etc. are always in place thanks to the Linux kernel headers.

How can the same be accomplished in user space?

+2  A: 

You are looking for the full memory barrier atomic builtins of gcc.

Please note that the reference I gave says:

The [following] builtins are intended to be compatible with those described in the Intel Itanium Processor-specific Application Binary Interface, section 7.4. As such, they depart from the normal GCC practice of using the “__builtin_” prefix, and further that they are overloaded such that they work on multiple types.
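For illustration, here is a minimal sketch of how the full-barrier builtin `__sync_synchronize()` could stand in for the kernel macros in user code. The `sync_mb`/`sync_rmb`/`sync_wmb` names and the `publish`/`try_consume` helpers are made up for this example and are not part of GCC. All three macros expand to the same full barrier, which is stronger than a pure read or write barrier needs to be.

```c
/* Hypothetical user-space stand-ins for the kernel's mb/rmb/wmb.
 * Each one is a full barrier via the GCC builtin. */
#define sync_mb()  __sync_synchronize()
#define sync_rmb() __sync_synchronize()
#define sync_wmb() __sync_synchronize()

/* Shared state for the example. The volatile flag is the traditional
 * pre-C11 idiom and is an assumption of this sketch. */
static int payload;
static volatile int ready;

/* Meant for the writer thread: store the data, then publish the flag. */
void publish(int value)
{
    payload = value;
    sync_wmb();          /* data store is ordered before the flag store */
    ready = 1;
}

/* Meant for the reader thread: returns 1 and fills *out once the data
 * has been published, otherwise returns 0. */
int try_consume(int *out)
{
    if (!ready)
        return 0;
    sync_rmb();          /* flag load is ordered before the data load */
    *out = payload;
    return 1;
}
```

The reader would typically call `try_consume` in a loop until it returns 1.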

nik
I'm not too familiar with this topic. Is this processor-specific functionality? (Since your example is an Itanium...)
Jeremy Powell
In general, users should not take advantage of platform- and compiler-specific functionality when there are standard, cross-platform mechanisms of achieving the same effect. What emg-2 really needs is to use the POSIX threads (pthreads) library.
Michael Aaron Safyan
@Michael, I completely agree with your opinion. That is the reason for highlighting the platform-specific notes.
nik
@Michael: the POSIX library does not provide mfence/sfence/lfence operations AFAIK. @nik: it is worth noting that __sync_synchronize is broken prior to GCC 4.4: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36793
+4  A: 

POSIX defines a number of functions as acting as memory barriers. Memory locations must not be accessed concurrently; to prevent this, use synchronization, and that synchronization will also work as a barrier.
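As a rough sketch of what that looks like in practice: every access to the shared data goes through the same mutex, and the lock/unlock calls supply the memory synchronization. The variable and function names (`value_lock`, `put_value`, `get_value`) are invented for this example.

```c
#include <pthread.h>

/* Hypothetical shared state, protected by value_lock. */
static pthread_mutex_t value_lock = PTHREAD_MUTEX_INITIALIZER;
static int shared_value;
static int has_value;

/* Writer side: both stores happen inside the critical section. */
void put_value(int v)
{
    pthread_mutex_lock(&value_lock);
    shared_value = v;
    has_value = 1;
    pthread_mutex_unlock(&value_lock);  /* stores are visible to the next thread that locks */
}

/* Reader side: returns 1 and fills *out if a value has been stored,
 * otherwise returns 0. */
int get_value(int *out)
{
    int ok;

    pthread_mutex_lock(&value_lock);
    ok = has_value;
    if (ok)
        *out = shared_value;
    pthread_mutex_unlock(&value_lock);
    return ok;
}
```

No explicit barrier appears anywhere; the cost is that both sides can block on the mutex.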

Martin v. Löwis
That synchronization is not needed when all that's required is a lock-free and wait-free single-writer/single-reader queue. The POSIX libraries do not provide mfence/lfence/sfence operations AFAIK.
You didn't ask for lock free operation; you asked for memory barriers in user space. POSIX has them; they are called "pthread_mutex_lock", "pthread_mutex_unlock", etc. You may not like the model behind them, but that *is* an official answer to your question.
Martin v. Löwis
A: 

Linux x64 means you can use the Intel memory barrier instructions. You might wrap them in macros similar to those in the Linux headers, if those macros aren't appropriate or accessible to your code.
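A sketch of such wrappers, assuming GCC or Clang inline-assembly syntax. The `arch_*` macro names and the small ring buffer are hypothetical and only show where the barriers would go in a one-writer/one-reader queue. The non-x86 fallback to `__sync_synchronize()` is there only so the file still builds on other architectures.

```c
#if defined(__x86_64__)
/* The "memory" clobber also stops the compiler from reordering
 * memory accesses across the instruction. */
#define arch_mb()  __asm__ __volatile__("mfence" ::: "memory")
#define arch_rmb() __asm__ __volatile__("lfence" ::: "memory")
#define arch_wmb() __asm__ __volatile__("sfence" ::: "memory")
#else
/* Fallback for other architectures: full barrier everywhere. */
#define arch_mb()  __sync_synchronize()
#define arch_rmb() __sync_synchronize()
#define arch_wmb() __sync_synchronize()
#endif

#define RING_SIZE 8u  /* hypothetical capacity, must be a power of two */

struct ring {
    int slots[RING_SIZE];
    volatile unsigned head;  /* written only by the producer */
    volatile unsigned tail;  /* written only by the consumer */
};

/* Producer side: returns 0 if the ring is full, 1 on success. */
int ring_push(struct ring *r, int v)
{
    unsigned head = r->head;

    if (head - r->tail == RING_SIZE)
        return 0;
    r->slots[head % RING_SIZE] = v;
    arch_wmb();              /* slot store before the head store */
    r->head = head + 1;
    return 1;
}

/* Consumer side: returns 0 if the ring is empty, 1 on success. */
int ring_pop(struct ring *r, int *out)
{
    unsigned tail = r->tail;

    if (r->head == tail)
        return 0;
    arch_rmb();              /* head load before the slot load */
    *out = r->slots[tail % RING_SIZE];
    arch_mb();               /* slot load before the tail store */
    r->tail = tail + 1;
    return 1;
}
```

On x86-64 with ordinary write-back memory the hardware already keeps loads ordered with loads and stores ordered with stores, so in this particular queue the compiler-barrier effect of the macros does most of the work; the explicit fences matter more on weaker memory models or with non-temporal stores.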

Ira Baxter
I think this is the best option actually. The main flaw is the required maintenance of distinct compilers and past/future/non-Intel processors.
So what do you want? You don't like the portable solution, and you don't like the processor specific one.
Martin v. Löwis
A: 

The include/arch/qatomic_*.h headers of a recent Qt distribution include (LGPL) code for a lot of architectures and all kinds of memory barriers (acquire, release, both).

A: 

__sync_synchronize() in GCC 4.4+

See also the Intel Memory Ordering White Paper, a section of Volume 3A of the Intel 64 and IA-32 manual: http://developer.intel.com/Assets/PDF/manual/253668.pdf

osgx