views:

196

answers:

5

Assume you have a reference counted object in shared memory. The reference count represents the number of processes using the object, and processes are responsible for incrementing and decrementing the count via atomic instructions, so the reference count itself is in shared memory as well (it could be a field of the object, or the object could contain a pointer to the count, I'm open to suggestions if they assist with solving this problem). Occasionally, a process will have a bug that prevents it from decrementing the count. How do you make it as easy as possible to figure out which process is not decrementing the count?

One solution I've thought of is giving each process a UID (maybe their PID). Then when processes decrement, they push their UID onto a linked list stored alongside the reference count (I chose a linked list because you can atomically append to head with CAS). When you want to debug, you have a special process that looks at the linked lists of the objects still alive in shared memory, and whichever apps' UIDs are not in the list are the ones that have yet to decrement the count.

The disadvantage to this solution is that it has O(N) memory usage where N is the number of processes. If the number of processes using the shared memory area is large, and you have a large number of objects, this quickly becomes very expensive. I suspect there might be a halfway solution where with partial fixed size information you could assist debugging by somehow being able to narrow down the list of possible processes even if you couldn't pinpoint a single one. Or if you could just detect which process hasn't decremented when only a single process hasn't (i.e. unable to handle detection of 2 or more processes failing to decrement the count) that would probably still be a big help.

(There are more 'human' solutions to this problem, like making sure all applications use the same library to access the shared memory region, but if the shared area is treated as a binary interface and not all processes are going to be applications written by you that's out of your control. Also, even if all apps use the same library, one app might have a bug outside the library corrupting memory in such a way that it's prevented from decrementing the count. Yes I'm using an unsafe language like C/C++ ;)

Edit: In single process situations, you will have control, so you can use RAII (in C++).

+1  A: 

Fundementally, shared memory shared state is not a robust solution and I don't know of a way of making it robust.

Ultimately, if a process exits all its non-shared resources are cleaned up by the operating system. This is incidentally the big win from using processes (fork()) instead of threads.

However, shared resources are not. File handles that others have open are obviously not closed, and ... shared memory. Shared resources are only closed after the last process sharing them exits.

Imagine you have a list of PIDs in the shared memory. A process could scan this list looking for zombies, but then PIDs can get reused, or the app might have hung rather than crashed, or...

My recommendation is that you use pipes or other message passing primitives between each process (sometimes there is a natural master-slave relationship, other times all need to talk to all). Then you take advantage of the operating system closing these connections when a process dies, and so your peers get signalled in that event. Additionally you can use ping/pong timeout messages to determine if a peer has hung.

If, after profiling, it is too inefficient to send the actual data in these messages, you could use shared memory for the payload as long as you keep the control channel over some kind of stream that the operating system clears up.

Will
+1 for the warning against premature optimisation (the answer to a lot of questions on SO!)
Daniel Earwicker
+1  A: 

The most efficient tracing systems for resource ownership don't even use reference counts, let alone lists of reference-holders. They just have static information about the layouts of every data type that might exist in memory, also the shape of the stack frame for every function, and every object has a type indicator. So a debugging tool can scan the stack of every thread, and follow references to objects recursively until it has a map of all the objects in memory and how they refer to each other. But of course systems that have this capability also have automatic garbage collection anyway. They need help from the compiler to gain all that information about the layout of objects and stack frames, and such information cannot actually be reliably obtained from C/C++ in all cases (because object references can be stored in unions, etc.) On the plus side, they perform way better than reference counting at runtime.

Per your question, in the "degenerate" case, all (or almost all) of your process's state would be held in shared memory - apart from local variables on the stack. And at that point you would have the exact equivalent of a multi-threaded program in a single process. Or to put it another way, processes that share enough memory start to become indistinguishable from threads.

This implies that you needn't specify the "multiple processes, shared memory" part of your question. You face the same problem anyone faces when they try to use reference counting. Those who use threads (or make unrestrained use of shared memory; same thing) face another set of problems. Put the two together and you have a world of pain.

In general terms, it's good advice not to share mutable objects between threads, where possible. An object with a reference count is mutable, because the count can be modified. In other words, you are sharing mutable objects between (effective) threads.

I'd say that if your use of shared memory is complex enough to need something akin to GC, then you've almost got the worst of both worlds: the expensive of process creation without the advantages of process isolation. You've written (in effect) a multi-threaded application in which you are sharing mutable objects between threads.

Local sockets are a very cross-platform and very fast API for interprocess communication; the only one that works basically identically on all Unices and Windows. So consider using that as a minimal communication channel.

By the way, are you using consistently using smart pointers in the processes that hold references? That's your only hope of getting reference counting even half right.

Daniel Earwicker
+6  A: 

You could do this using only a single extra integer per object.

Initialise the integer to zero. When a process increments the reference count for the object, it XORs its PID into the integer:

object.tracker ^= self.pid;

When a process decrements the reference count, it does the same.

If the reference count is ever left at 1, then the tracker integer will be equal to the PID of the process that incremented it but didn't decrement it.


This works because XOR is commutative ( (A ^ B) ^ C == A ^ (B ^ C) ), so if a process XORs the tracker with its own PID an even number of times, it's the same as XORing it with PID ^ PID - that's zero, which leaves the tracker value unaffected.

You could alternatively use an unsigned value (which is defined to wrap rather than overflow) - adding the PID when incrementing the usage count and subtracting it when decrementing it.

caf
Hah, I had the same thought then convinced myself I was wrong with an incorrect mental math counterexample. If you update to explain why this works (xor is obviously commutative and X ^ X undoes itself) I'll mark as accepted :)
Joseph Garvin
I'm usually skeptical of bit manipulation hacks, but this one is nifty. Nice!
Kena
A: 

Use following

int pids[MAX_PROCS]
int counter;

Increment

do
   find i such pid[i]=0  // optimistic
while(cas[pids[i],0,mypid)==false)
my_pos = i;
atomic_inc(counter)

Decrement

pids[my_pos]=0
atomic_dec(counter);

So you know all processes using this object

You MAX_PROCS big enough and search free place randomly so if number of processes significanly lower then MAX_PROCS the search would be very fast.

Artyom
A: 

Next to doing things yourself: you can also use some tool like AQTime which has a reference counted memchecker.

Ritsaert Hornstra