I'm working on something related to rough sets right now. The project uses a lot of set operations and manipulation. I've been using string operations as a stopgap measure for set operations. It worked fine until we needed to process an ungodly amount of data (500,000 records with 40+ columns each) through the algorithm.

I know that there is no set data structure in .NET 2.0 (2.0 was the latest version when I started the project). I want to know whether there is any library that offers fast set operations in .NET/C#, or whether 3.5 has added a native set data structure.

Thanks.

A: 

You can use LINQ to Objects in C# 3.0.

Mike Thompson
+4  A: 

LINQ supports some set operations. See the LINQ 101 page for examples.
There is also a HashSet class (.NET 3.5).

Here are Microsoft's guidelines for set operations in .NET:

HashSet and LINQ Set Operations

List of set operations supported by the HashSet class:

HashSet Collection Type
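
As a rough illustration of the HashSet operations referred to here, this is a minimal sketch assuming .NET 3.5 and C# 3.0. The class name HashSetDemo and the helper Intersection are made up for this example; UnionWith, IntersectWith, ExceptWith, SetEquals and IsSubsetOf are members of HashSet<T>. Note that the *With methods modify the set they are called on, so the sketch copies first.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; only the HashSet<T> members come from the framework.
static class HashSetDemo
{
    // Returns a new set so the inputs stay untouched;
    // IntersectWith itself mutates the set it is called on.
    public static HashSet<int> Intersection(IEnumerable<int> a, IEnumerable<int> b)
    {
        var result = new HashSet<int>(a);
        result.IntersectWith(b);
        return result;
    }

    static void Main()
    {
        var a = new HashSet<int> { 1, 2, 3, 4 };
        var b = new HashSet<int> { 3, 4, 5 };

        var union = new HashSet<int>(a);
        union.UnionWith(b);                 // elements in a or b

        var difference = new HashSet<int>(a);
        difference.ExceptWith(b);           // elements in a but not in b

        var common = Intersection(a, b);    // elements in both

        Console.WriteLine(union.Count);                       // 5
        Console.WriteLine(difference.Count);                  // 2
        Console.WriteLine(common.SetEquals(new[] { 3, 4 }));  // True
        Console.WriteLine(common.IsSubsetOf(a));              // True
    }
}
```

Membership tests and inserts on a HashSet are hash lookups, which is why it tends to scale much better than string matching on large inputs.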

aku
A: 

Have you ever thought about using F#? This seems like a job for a functional programming language.

Charles Graham
+2  A: 

Update: This is for .NET 2.0. For .NET 3.5, refer to the posts by aku and Jon.

This is a good reference for efficiently representing sets in .NET.

Gulzar
A: 

Try HashSet in .NET 3.5.

This page from a member of the .NET BCL team has some good information on the intent of HashSet.

Ash
+10  A: 

.NET 3.5 already has a native set data type: HashSet. You might also want to look at HashSet and LINQ set operators for the operations.
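
For the LINQ set operators mentioned here, a small sketch may help; it assumes .NET 3.5 with System.Linq. The class LinqSetDemo and the helper Common are illustrative names only; Union, Intersect, Except and Distinct are the standard Enumerable extension methods. They work on any IEnumerable<T>, return new lazy sequences, and leave the inputs unmodified.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the operators themselves are part of System.Linq.
static class LinqSetDemo
{
    public static int[] Common(IEnumerable<int> a, IEnumerable<int> b)
    {
        // Intersect is evaluated lazily; ToArray forces the result.
        return a.Intersect(b).ToArray();
    }

    static void Main()
    {
        int[] a = { 1, 2, 2, 3, 4 };
        int[] b = { 3, 4, 5 };

        int[] union = a.Union(b).ToArray();        // distinct elements of a and b
        int[] difference = a.Except(b).ToArray();  // distinct elements of a not in b
        int[] distinct = a.Distinct().ToArray();   // a without duplicates
        int[] common = Common(a, b);               // elements in both

        Console.WriteLine(union.Length);       // 5
        Console.WriteLine(difference.Length);  // 2
        Console.WriteLine(distinct.Length);    // 4
        Console.WriteLine(common.Length);      // 2
    }
}
```

If the same set is queried many times, keeping a HashSet around is probably cheaper than re-running these operators, since each LINQ call builds its own internal lookup.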

For .NET 1.0, there was a third-party Set data type, Iesi.Collections, which was extended with .NET 2.0 generics as Iesi.Collections.Generic.

You might want to try and look at all of them to see which one would benefit you the most. :)

Jon Limjap
+2  A: 

It may be worth taking a look at C5. It's a generic collection library for .NET that includes sets.

Note that I haven't looked into it much, but it seems to be a pretty fantastic collection library.

Nathan Tomkins
A: 

You should take a look at the C5 Generic Collection Library. This library takes a systematic approach to fixing holes in the .NET class library by providing missing structures, as well as replacing existing ones with a set of well-designed interfaces and generic classes.

Among others, there is HashSet<T>, a generic set class based on linear hashing.

Dejan Milicic
+1  A: 

I have been abusing the Dictionary class in .NET 2.0 as a set:

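// Placeholder value; only the keys matter.
private object dummy = "ok";

// Assumed declaration; the original snippet did not show it.
// Requires: using System.Collections.Generic;
private Dictionary<object, object> dict = new Dictionary<object, object>();

public void Add(object el) {
  // Indexer assignment overwrites, so adding a duplicate is harmless.
  dict[el] = dummy;
}

public bool Contains(object el) {
  return dict.ContainsKey(el);
}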
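
To show how the same Dictionary trick could extend to an actual set operation on .NET 2.0, here is a hedged sketch using only C# 2.0 syntax. The names DictionarySetDemo, ToSet and Intersect are hypothetical and not from any library; the bool values are ignored, and only the keys carry meaning.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; Dictionary<TKey, TValue> is the only framework type used.
static class DictionarySetDemo
{
    // Builds a "set" whose keys are the distinct items.
    public static Dictionary<T, bool> ToSet<T>(IEnumerable<T> items)
    {
        Dictionary<T, bool> set = new Dictionary<T, bool>();
        foreach (T item in items)
        {
            set[item] = true;  // overwriting makes duplicates harmless
        }
        return set;
    }

    // Returns the keys present in both a and b.
    public static Dictionary<T, bool> Intersect<T>(Dictionary<T, bool> a, Dictionary<T, bool> b)
    {
        Dictionary<T, bool> result = new Dictionary<T, bool>();
        foreach (T key in a.Keys)
        {
            if (b.ContainsKey(key))
            {
                result[key] = true;
            }
        }
        return result;
    }

    static void Main()
    {
        Dictionary<string, bool> a = ToSet(new string[] { "x", "y", "z", "x" });
        Dictionary<string, bool> b = ToSet(new string[] { "y", "z", "w" });
        Dictionary<string, bool> common = Intersect(a, b);

        Console.WriteLine(common.Count);             // 2
        Console.WriteLine(common.ContainsKey("y"));  // True
        Console.WriteLine(common.ContainsKey("x"));  // False
    }
}
```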
Martijn