views:

71

answers:

2

I need a data-structure, which supports the following operations both memory and time-efficient, it can be assumed, that the value has an ordering.

  • Add a value to the structure
  • Find out, whether a value is in the structure

Plus, the structure has to be immutable, because I want to use Haskell.

If I would not assume immutability, probably a bloom filter is my choice.

I'm coding on my optimization problem and because I can't be shure, whether an entry was already processed, I have to lookup.

+4  A: 

The data structure usually used in cases where you need to check membership often is Data.Set, which is a tree-based set that offers lookup and insert operations in O(log n) time.

However since you mentioned bloom filters: There are Bloom Filter implementations for Haskell. So in a situation where you would choose bloom filters in other languages, you can still do so in Haskell.

sepp2k
Note that the bloom filter library lets you construct immutable bloom filters in one go, but it needs to be in ST to allow you to incrementally add to them.
sclv
+4  A: 

Data.Set is indeed the most straightforward choice, but if you can project your datastructure to an Int, then you can use an IntSet to get more efficiency than Data.Set. If your projection is lossy (which is to say that it is really a hash), then a hashtable using an underlying IntSet (i.e. a HashSet) would often be more efficient. Precisely such a package exists on Hackage, and has been benchmarked as pretty darn good: http://hackage.haskell.org/package/hashmap.

Finally, if you need a membership check, but not extraction, and you really care about using minimal space, then you could project your datastructure to an Integer (assuming that yields space savings, which really depends...) and then use a HashSet of those.

sclv