views:

639

answers:

8

Duplicate: http://stackoverflow.com/questions/769097/choosing-a-stl-container

I'm looking for a data structure that acts like a set in that it doesn't allow duplicates to be inserted, but also knows the order in which the items were inserted. It would basically be a combination of a set and list/vector.

I would just use a list/vector and check for duplicates myself, but we need that duplicate verification to be fast as the size of the structure can get quite large.

+1  A: 

Writing your own class that wraps a vector and a set would seem the obvious solution - there is no C++ standard library container that does what you want.

anon
A: 

I'd just use two data structures, one for order and one for identity. (One could point into the other if you store values, depending on which operation you want the fastest)

Marcus Lindblom
A: 

Sounds like a job for an OrderedDictionary.

chaos
A: 

Duplicate verification that's fast seems to be the critical part here. I'd use some type of a map/dictionary maybe, and keep track of the insertion order yourself as the actual data. So the key is the "data" you're shoving in (which is then hashed, and you don't allow duplicate keys), and put in the current size of the map as the "data". Of course this only works if you don't have any deletions. If you need that, just have an external variable you increment on every insertion, and the relative order will tell you when things were inserted.

Not necessarily pretty, but not that hard to implement either.

Kevin
+4  A: 

Take a look at Boost.MultiIndex. You may have to write a wrapper over this.

dirkgently
Greg's solution in the duplicate question doesn't seem to require the wrapper. Thanks for pointing me in the most pain free direction, though.
Nick McCowin
+2  A: 

A Boost.Bimap with the insertion order as an index should work (e.g. boost::bimap < size_t, Foo > ). If you are removing objects from the data structure, you will need to track the next insertion order value separately.

equackenbush
A: 

Assuming that you're talking ANSI C++ here, I'd either write my own or use composition and delegation to wrap a map for data storage and a vector of the keys for order of insertion. Depending on the characteristics of the data, you might be able to use the insertion index as your map key and avoid using the vector.

A: 

Java has this in the form of an ordered set. I don't thing C++ has this, but it is not that difficult to implement yourself. What the Sun guys did with the Java class was to extend the hash table such that each item was simultaneously inserted into a hash table and kept in a double linked list. There is very little overhead in this, especially if you preallocate the items that are used to construct the linked list from.

If I where you, I would write a class that either used a private vector to store the items in or implement a hashtable in the class yourself. When any item is to be inserted into the set, check to see if it is in the hash table and optionally replace the item in there if such an item is in it. Then find the old item in the hash table, update the list to point to the new element and you are done.

To insert a new element you do the same, except you have to use a new element in the list - you can't reuse the old ones.

To delete an item, you reorder the list to point around it, and free the list element.

Note that it should be possible for you to get the part of the linked list where the element you are interested in is directly from the element so that you don't have to walk the chain each time you have to move or change an element.

If you anticipate having a lot of these items changed during the program run, you might want to keep a list of the list items, such that you can merely take the head of this list, rather than allocating memory each time you have to add a new element.

You might want to look at the dancing links algorithm.

tomjen