views:

376

answers:

4

When using shared memory, each process may mmap the shared region into a different area of its respective address space. This means that when storing pointers within the shared region, you need to store them as offsets of the start of the shared region. Unfortunately, this complicates use of atomic instructions (e.g. if you're trying to write a lock free algorithm). For example, say you have a bunch of reference counted nodes in shared memory, created by a single writer. The writer periodically atomically updates a pointer 'p' to point to a valid node with positive reference count. Readers want to atomically write to 'p' because it points to the beginning of a node (a struct) whose first element is a reference count. Since p always points to a valid node, incrementing the ref count is safe, and makes it safe to dereference 'p' and access other members. However, this all only works when everything is in the same address space. If the nodes and the 'p' pointer are stored in shared memory, then clients suffer a race condition:

  1. x = read p
  2. y = x + offset
  3. Increment refcount at y

During step 2, p may change and x may no longer point to a valid node. The only workaround I can think of is somehow forcing all processes to agree on where to map the shared memory, so that real pointers rather than offsets can be stored in the mmap'd region. Is there any way to do that? I see MAP_FIXED in the mmap documentation, but I don't know how I could pick an address that would be safe.

Edit: Using inline assembly and the 'lock' prefix on x86 maybe it's possible to build a "increment ptr X with offset Y by value Z"? Equivalent options on other architectures? Haven't written a lot of assembly, don't know if the needed instructions exist.

+2  A: 

We have code that's similar to your problem description. We use a memory-mapped file, offsets, and file locking. We haven't found an alternative.

Steve Emmerson
+1  A: 

On low level the x86 atomic inctruction can do all this tree steps at once:

  1. x = read p
  2. y = x + offset Increment
  3. refcount at y
//
      mov  edi, Destination
      mov  edx, DataOffset
      mov  ecx, NewData
 @Repeat:
      mov  eax, [edi + edx]    //load OldData
//Here you can also increment eax and save to [edi + edx]          
      lock cmpxchg dword ptr [edi + edx], ecx
      jnz  @Repeat
//
GJ
If cmpxchg already does an atomic read and atomic write, is the 'lock' necessary? Or does that ensure that the edi + edx is done atomically? I've only ever really used MIPS assembly.
Joseph Garvin
Lock guarantee atomic access to the memory bus so lock instruction is necessary. You can probably also use API InterlockedCompareExchange (check MSDN for explanation). At first load the memory 32bit pointer as OldValue and than increment it to get NewValue, after that try to do InterlockedCompareExchange. The InterlockedCompareExchange(Destination + Offset, NewValue, OldValue) will return the compared value if not the same as OldValue than some other thread is exchange it, so no exchange was made and you must repeat the procedure.
GJ
+1  A: 

This is trivial on a UNIX system; just use the shared memory functions:

shgmet, shmat, shmctl, shmdt

void *shmat(int shmid, const void *shmaddr, int shmflg);

shmat() attaches the shared memory segment identified by shmid to the address space of the calling process. The attaching address is specified by shmaddr with one of the following criteria:

If shmaddr is NULL, the system chooses a suitable (unused) address at which to attach the segment.

Just specify your own address here; e.g. 0x20000000000

If you shmget() using the same key and size in every process, you will get the same shared memory segment. If you shmat() at the same address, the virtual addresses will be the same in all processes. The kernel doesn't care what address range you use, as long as it doesn't conflict with wherever it normally assigns things. (If you leave out the address, you can see the general region that it likes to put things; also, check addresses on the stack and returned from malloc() / new[] .)

On Linux, make sure root sets SHMMAX in /proc/sys/kernel/shmmax to a large enough number to accommodate your shared memory segments (default is 32MB).

As for atomic operations, you can get them all from the Linux kernel source, e.g.

include/asm-x86/atomic_64.h

/*
 * Make sure gcc doesn't try to be clever and move things around
 * on us. We need to use _exactly_ the address the user gave us,
 * not some alias that contains the same information.
 */
typedef struct {
        int counter;
} atomic_t;

/**
 * atomic_read - read atomic variable
 * @v: pointer of type atomic_t
 *
 * Atomically reads the value of @v.
 */
#define atomic_read(v)          ((v)->counter)

/**
 * atomic_set - set atomic variable
 * @v: pointer of type atomic_t
 * @i: required value
 *
 * Atomically sets the value of @v to @i.
 */
#define atomic_set(v, i)                (((v)->counter) = (i))


/**
 * atomic_add - add integer to atomic variable
 * @i: integer value to add
 * @v: pointer of type atomic_t
 *
 * Atomically adds @i to @v.
 */
static inline void atomic_add(int i, atomic_t *v)
{
        asm volatile(LOCK_PREFIX "addl %1,%0"
                     : "=m" (v->counter)
                     : "ir" (i), "m" (v->counter));
}

64-bit version:

typedef struct {
        long counter;
} atomic64_t;

/**
 * atomic64_add - add integer to atomic64 variable
 * @i: integer value to add
 * @v: pointer to type atomic64_t
 *
 * Atomically adds @i to @v.
 */
static inline void atomic64_add(long i, atomic64_t *v)
{
        asm volatile(LOCK_PREFIX "addq %1,%0"
                     : "=m" (v->counter)
                     : "er" (i), "m" (v->counter));
}
shm skywalker
+1  A: 

You shouldn't be afraid to make up an address at random, because the kernel will just reject addresses it doesn't like (ones that conflict). See my shmat() answer above, using 0x20000000000

With mmap:

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

If addr is not NULL, then the kernel takes it as a hint about where to place the mapping; on Linux, the mapping will be created at the next higher page boundary. The address of the new mapping is returned as the result of the call.

The flags argument determines whether updates to the mapping are visible to other processes mapping the same region, and whether updates are carried through to the underlying file. This behavior is determined by including exactly one of the following values in flags:

MAP_SHARED Share this mapping. Updates to the mapping are visible to other processes that map this file, and are carried through to the underlying file. The file may not actually be updated until msync(2) or munmap() is called.

ERRORS

EINVAL We don’t like addr, length, or offset (e.g., they are too large, or not aligned on a page boundary).

shm skywalker
Interesting, I assumed you could only use it in when writing device drivers or other lower level hackery. It's a potentially attractive solution, but it would require having all the processes try mmap'ing different regions until they find one they can all agree on, and if a new process was started that didn't like the existing mapping all the old ones would potentially have to copy their data to a new location. Cool idea nonetheless, upvoted.
Joseph Garvin