I've been looking at the problem of writing a concurrent Multimap, and I have an implementation backed by the Google Guava AbstractSetMultimap and a MapMaker computing map that creates on demand the values-collections as a set view over a ConcurrentHashMap. With some care over the view collections and various wrappers, I think this gets pretty close.
The big problem, which has already been discussed by others who have tried this, appears to be that of removing the values-collections from the underlying map when they become empty, without introducing race conditions.
A couple of options appear to exist.
- leave the empty collections there. This will leak some CHMs, but I believe it to be at least correct.
- try optimistically to remove the collection when empty and compensate if anything else appears in it. This is full of races and seems intrinsically impossible to fix.
- synchronise everything on the values-collection, which would at least allow this removal, but at the cost of any concurrency after the initial lookup by key.
- for a smaller penalty (perhaps, depending on usage patterns?), perhaps synchronise on values-collection creation and removal, need to check whether that covers everything.
Questions:
- Does anyone know of any better implementation than this? Can we do better composing bits of MapMaker, or does this need a specialised ConcurrentHashMultimap written from scratch?
- If it's difficult to improve much on this, is this leak likely to be much of an issue in practice? Notable collections such as java.util.HashMap, juc.ConcurrentHashMap and ArrayDeque do not resize the backing store downwards, and ArrayList does not do so automatically. As long as we clear out the objects, I wonder whether this will matter too much.
Thanks
Edit: see also the discussion here on the guava mailing list.