I want to implement a Set in C. Is it OK to use a linked list when creating the Set, or should I use another approach?

How do you usually implement your own set (if needed)?

NOTE: If I use the linked-list approach, I will probably have the following complexities for my Set operations:

  • init : O(1);
  • destroy: O(n);
  • insert: O(n);
  • remove: O(n);
  • union: O(n*m);
  • intersection: O(n*m);
  • difference: O(n*m);
  • ismember: O(n);
  • issubset: O(n*m);
  • setisequal: O(n*m);

O(n*m) may be a little too big, especially for huge data sets... Is there a way to implement my Set more efficiently?
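For reference, here is a minimal sketch of the linked-list approach the question describes, showing where the O(n) bounds for insert and ismember come from (the names `ll_set`, `ll_insert`, etc. are my own, not from any particular library):

```c
#include <stdlib.h>

/* A minimal linked-list set of ints. insert must scan for duplicates,
 * which is what makes it O(n) rather than O(1). */
typedef struct node {
    int value;
    struct node *next;
} node;

typedef struct {
    node *head;
    size_t size;
} ll_set;

void ll_init(ll_set *s) {                  /* init: O(1) */
    s->head = NULL;
    s->size = 0;
}

int ll_ismember(const ll_set *s, int v) {  /* ismember: O(n) linear scan */
    for (const node *p = s->head; p; p = p->next)
        if (p->value == v) return 1;
    return 0;
}

int ll_insert(ll_set *s, int v) {          /* insert: O(n), dominated by the
                                              duplicate check */
    if (ll_ismember(s, v)) return 0;
    node *n = malloc(sizeof *n);
    if (!n) return -1;
    n->value = v;
    n->next = s->head;
    s->head = n;
    s->size++;
    return 1;
}

void ll_destroy(ll_set *s) {               /* destroy: O(n) */
    node *p = s->head;
    while (p) {
        node *next = p->next;
        free(p);
        p = next;
    }
    s->head = NULL;
    s->size = 0;
}
```

union, intersection, and difference then become nested loops over two such lists, which is where the O(n*m) bounds come from.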

+4  A: 

std::set is often implemented as a red-black tree: http://en.wikipedia.org/wiki/Red-black_tree

This approach will give you much better complexity on all the listed operations.

Andreas Brinck
+2  A: 

There are a lot of ways to implement a set. Here are some of them. Besides, MSDN has a very good article on it.

malay
Thanks for mentioning the MSDN article, a very interesting read.
Andrei Ciobanu
+1  A: 

I have used Red-Black trees in the past to build sets.

edit: A little slow on the response.

Here are the time complexities from the Wikipedia article.

  • Space: O(n)
  • Search: O(log n)
  • Insert: O(log n)
  • Delete: O(log n)

Seth M.
Can you give me some hints regarding the O(n) complexity?
Andrei Ciobanu
posted in the edit
Seth M.
+1  A: 

Since you already have a linked list implemented, the easiest is a skip list. If you want to use balanced trees, the easiest in my opinion is a treap. These are randomized data structures, but generally they are just as efficient as their deterministic counterparts, if not more (and a skip list can be made deterministic).
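To make the treap suggestion concrete, here is a sketch of a treap-based int set in C: an ordinary BST ordered by key, kept balanced in expectation by also maintaining a max-heap property on random priorities. All names here are illustrative, not from any standard library:

```c
#include <stdlib.h>

/* Treap node: BST on key, max-heap on a random priority.
 * Expected O(log n) insert and search. */
typedef struct tnode {
    int key;
    int prio;
    struct tnode *left, *right;
} tnode;

static tnode *rotate_right(tnode *y) {
    tnode *x = y->left;
    y->left = x->right;
    x->right = y;
    return x;
}

static tnode *rotate_left(tnode *x) {
    tnode *y = x->right;
    x->right = y->left;
    y->left = x;
    return y;
}

/* Insert like a plain BST, then rotate up while the heap property
 * is violated. Duplicates are ignored (set semantics). */
tnode *treap_insert(tnode *root, int key) {
    if (!root) {
        tnode *n = malloc(sizeof *n);
        n->key = key;
        n->prio = rand();
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->key) {
        root->left = treap_insert(root->left, key);
        if (root->left->prio > root->prio)
            root = rotate_right(root);
    } else if (key > root->key) {
        root->right = treap_insert(root->right, key);
        if (root->right->prio > root->prio)
            root = rotate_left(root);
    }
    return root;
}

int treap_contains(const tnode *root, int key) {  /* plain BST search */
    while (root) {
        if (key == root->key) return 1;
        root = key < root->key ? root->left : root->right;
    }
    return 0;
}

void treap_free(tnode *root) {
    if (!root) return;
    treap_free(root->left);
    treap_free(root->right);
    free(root);
}
```

Compared to a red-black tree, the balancing logic fits in two rotation helpers and two `if`s, which is why treaps are often recommended when you have to roll your own balanced tree in C.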

IVlad
Thanks for mentioning the skip list (didn't know about it). I will probably use it in another context. (Thank you!)
Andrei Ciobanu
+5  A: 

Sets are typically implemented either as red-black trees (which requires the elements to have a total order), or as an automatically-resizing hashtable (which requires a hash function).

The latter is typically implemented by having the hashtable double in size and reinsert all elements when a certain load threshold (75% works well) is exceeded. This means that individual insert operations can be O(n), but amortized over many operations insertion is O(1).
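A minimal sketch of that resizing scheme in C, using open addressing with linear probing (the names are my own; note this toy version uses -1 as an "empty" sentinel, so it cannot store -1):

```c
#include <stdlib.h>

#define HSET_EMPTY (-1)

/* Open-addressing hash set of non-negative ints, doubling at 75% load. */
typedef struct {
    int *slots;
    size_t cap;
    size_t count;
} hset;

void hset_init(hset *h) {
    h->cap = 8;
    h->count = 0;
    h->slots = malloc(h->cap * sizeof *h->slots);
    for (size_t i = 0; i < h->cap; i++) h->slots[i] = HSET_EMPTY;
}

static size_t hset_hash(int v, size_t cap) {
    /* Knuth-style multiplicative hash; cap is always a power of two. */
    return ((size_t)(unsigned)v * 2654435761u) % cap;
}

int hset_contains(const hset *h, int v) {
    size_t i = hset_hash(v, h->cap);
    while (h->slots[i] != HSET_EMPTY) {     /* linear probing */
        if (h->slots[i] == v) return 1;
        i = (i + 1) % h->cap;
    }
    return 0;
}

static void hset_grow(hset *h);

int hset_insert(hset *h, int v) {
    if (hset_contains(h, v)) return 0;
    if (4 * (h->count + 1) > 3 * h->cap)    /* load would exceed 75% */
        hset_grow(h);                       /* double and rehash: O(n) */
    size_t i = hset_hash(v, h->cap);
    while (h->slots[i] != HSET_EMPTY)
        i = (i + 1) % h->cap;
    h->slots[i] = v;
    h->count++;
    return 1;
}

static void hset_grow(hset *h) {
    int *old = h->slots;
    size_t oldcap = h->cap;
    h->cap *= 2;
    h->count = 0;
    h->slots = malloc(h->cap * sizeof *h->slots);
    for (size_t i = 0; i < h->cap; i++) h->slots[i] = HSET_EMPTY;
    for (size_t i = 0; i < oldcap; i++)     /* reinsert every element */
        if (old[i] != HSET_EMPTY) hset_insert(h, old[i]);
    free(old);
}

void hset_destroy(hset *h) {
    free(h->slots);
    h->slots = NULL;
    h->cap = h->count = 0;
}
```

The occasional O(n) rehash in `hset_grow` is what gets amortized away: each doubling pays for the cheap O(1) inserts that preceded it.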

Michael Borgwardt