I am trying to come up with the best data structure for use in a high throughput C++ server. The data structure will be used to store anything from a few to several million objects, and no sorting is required (though a unique sort key can be provided very cheaply).
The requirements are that it can support efficient insert, ideally O(1), moderately efficient removal, and efficient traversal. It doesn't need to support a find operation (other than might be needed for removal).
The twist is that it must be thread safe with respect to modifications while other threads are enumerating the data structure. This means that a simple red-black tree doesn't work, since one thread can't insert an element (and perform the necessary tree rotations) without messing up any cursors held by other threads.
It is not acceptable to use a read/write lock and defer the write operations until all readers have finished, since read operations may be long lived. It doesn't matter if inserts that happen while there is a reader are visible to that reader or not.
Memory footprint is also very important, and small is obviously better!
What suggestions are there?
Response to comments:
Thanks for the answers.
No, inserts cannot invalidate existing iterators. Iterators may or may not see the new insert, but they must see everything that they would have seen if the insert hadn't occurred.
Deletion is required, however due to higher level rules I can guarantee that a iterator will never be stopped on an item that is available for deletion.
Locking per node for a cursor would have too great an impact on performance. There may be a number of threads reading at once, and any kind of memory hot spot that multiple threads are using in a lock kills memory bandwidth (as we discovered the hard way!). Even a simple count of readers with multiple threads calling InterlockedIncrement fails to scale cleanly.
I agree a linked list is likely the best approach. Deletes are rare, so paying the memory penalty for the back pointers to support O(1) delete is costly and we may compute those separately on demand and since deletes tend to be batch operations.
Fortunately insertion into a linked list doesn't require any locking for readers, as long as the pointers are updated in the inserted node before the head pointer is changed.
The lock-copy-unlock idea is interesting. The amount of data involved is too large for this to work as the default for readers, but it could be used for writers when they collide with readers. A read/write lock would protect the whole structure, and the write would clone the data structure if it collides with a reader. Writes are much rarer than reads.