First, I'd check that C++'s multimap actually does give O(1) for removing the min element. The most common structures for sotred maps/priority queues are tree structures in which querying the min element has O(1) complexity, but removing it is O(log n).
That said, I think that a skip list (as implemented by Java's ConcurrentSkipListMap) might give you O(1) for removal of the minimum element, but I'm not absolutely sure. One of the problems when assessing performance of ConcurrentSkipListMap is that traversal operations "tidy up" markers left by previous deletions. So the complexity of an operation can actually depend on what the previous operations were. (On the other hand, at some level, that's true of pretty much any algorithm: whether some data is in the CPU cache can depend on whether a previous operation put it there...)
P.S. Forgot to say: look at ConcurrentSkipListMap.pollFirstEntry().