I have a list of ~9000 products, some of which may be duplicates.

I wanted to make a HashTable of these products, with each product's serial number as its key, so I can find duplicates easily.

How would one go about using a HashTable in C#/.NET? Would a HashSet be more appropriate?

Eventually I would like a list like:

Key-Serial: 11110 - Contains: Product1
Key-Serial: 11111 - Contains: Product3, Product6, Product7
Key-Serial: 11112 - Contains: Product4
Key-Serial: 11113 - Contains: Product8, Product9

So I would end up with a list of all products, grouped by serial number, so that products sharing a serial number appear together. What is the "correct" way to do this?

+1  A: 

First you need to define your 'Primary Key' as it were, a set of fields that are unique to each object. I guess Key-Serial would be part of that set, but there must be others. Once you define that 'Primary Key' you can define a struct that represents a Key Value and use that as the key to a dictionary containing your products.

Example:

using System.Collections.Generic;

// Composite key: two keys are equal only when both fields match.
struct ProductPrimaryKey
{
    public string KeySerial;
    public string OtherDiscriminator;

    public ProductPrimaryKey(string keySerial, string otherDiscriminator)
    {
        KeySerial = keySerial;
        OtherDiscriminator = otherDiscriminator;
    }
}

class Product
{
    public string KeySerial { get; set; }
    public string OtherDiscriminator { get; set; }
    public int MoreData { get; set; }
}

class DataLayer
{
    public Dictionary<ProductPrimaryKey, Product> DataSet 
        = new Dictionary<ProductPrimaryKey, Product>();

    // Throws KeyNotFoundException if no product has this key.
    public Product GetProduct(string keySerial, string otherDiscriminator)
    {
        return DataSet[new ProductPrimaryKey(keySerial, otherDiscriminator)];
    }
}
Aviad P.
+1  A: 

I think Dictionary is the recommended class for stuff like this.

It would be something like this in your case:

Dictionary<string, List<Product>>

(using serial string as key)
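
To make that concrete, here is a minimal sketch of filling such a dictionary and printing it in the format from the question. The Product class with Serial and Name properties, the GroupBySerial helper and the sample data are all assumptions made up for illustration, not part of the original question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical product type; only Serial and Name are needed for this sketch.
class Product
{
    public string Serial { get; set; }
    public string Name { get; set; }
}

static class Program
{
    // Groups products by serial number; each key maps to every product sharing it.
    static Dictionary<string, List<Product>> GroupBySerial(IEnumerable<Product> products)
    {
        var bySerial = new Dictionary<string, List<Product>>();
        foreach (var product in products)
        {
            List<Product> group;
            if (!bySerial.TryGetValue(product.Serial, out group))
            {
                // First product seen with this serial: start a new group.
                group = new List<Product>();
                bySerial.Add(product.Serial, group);
            }
            group.Add(product);
        }
        return bySerial;
    }

    static void Main()
    {
        // Made-up sample data.
        var products = new List<Product>
        {
            new Product { Serial = "11110", Name = "Product1" },
            new Product { Serial = "11111", Name = "Product3" },
            new Product { Serial = "11111", Name = "Product6" },
            new Product { Serial = "11112", Name = "Product4" },
        };

        foreach (var entry in GroupBySerial(products))
        {
            var names = new List<string>();
            foreach (var p in entry.Value) names.Add(p.Name);
            Console.WriteLine("Key-Serial: {0} - Contains: {1}",
                entry.Key, string.Join(", ", names.ToArray()));
        }
    }
}
```

Any entry whose list holds more than one product is a duplicated serial. Note that a Dictionary does not guarantee enumeration order, so sort the keys if the output order matters.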

peter p
That is a kludge, how could you choose the right product from the list? There's no substitute for a unique key.
Aviad P.
Why is this a kludge? The question was about grouping products by serial. This is a straightforward, simple and readable answer which meets the requirements, no?
peter p
+3  A: 

A generic Dictionary would suit this best, I think. Code might look something like this:

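// keyProductPairs is assumed to be a sequence of items exposing Key (int) and Product (string).
var keyedProducts = new Dictionary<int, List<string>>();

foreach (var keyProductPair in keyProductPairs)
{
  // Dictionary<TKey, TValue> exposes ContainsKey, not Contains.
  if (keyedProducts.ContainsKey(keyProductPair.Key))
    keyedProducts[keyProductPair.Key].Add(keyProductPair.Product);
  else
    keyedProducts.Add(keyProductPair.Key, new List<string> { keyProductPair.Product });
}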
James Kolpack
+1  A: 

A hashtable is a kind of dictionary, and a hashset is a kind of set. Neither dictionaries nor sets directly solve your problem - you need a data structure which holds multiple objects for one key.

Such data structures are often called multimaps. You can create one by simply using a hashtable whose keys are integers and whose values are sets of some kind (for example, hashsets...).
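
As a rough illustration of the multimap idea, one built-in option is LINQ's ToLookup, which returns an ILookup<TKey, TElement>: a read-only one-to-many map. This swaps in a ready-made type for the hand-rolled hashtable-of-sets described above. The sketch below assumes .NET 3.5 or later, and the (serial, name) pairs are invented sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{
    static void Main()
    {
        // Hypothetical (serial, product name) pairs standing in for the real product list.
        var pairs = new[]
        {
            new KeyValuePair<int, string>(11110, "Product1"),
            new KeyValuePair<int, string>(11111, "Product3"),
            new KeyValuePair<int, string>(11111, "Product6"),
            new KeyValuePair<int, string>(11113, "Product8"),
            new KeyValuePair<int, string>(11113, "Product9"),
        };

        // ToLookup builds a one-to-many map: serial -> all product names with that serial.
        ILookup<int, string> bySerial = pairs.ToLookup(p => p.Key, p => p.Value);

        foreach (IGrouping<int, string> group in bySerial)
        {
            Console.WriteLine("Key-Serial: {0} - Contains: {1}",
                group.Key, string.Join(", ", group.ToArray()));
        }

        // Serials that map to more than one product are the duplicates.
        foreach (var group in bySerial.Where(g => g.Count() > 1))
        {
            Console.WriteLine("Duplicate serial: {0}", group.Key);
        }
    }
}
```

A lookup cannot be modified after it is built, and indexing it with a serial that is not present returns an empty sequence rather than throwing. If you need to add products later, a dictionary whose values are sets or lists is the better fit.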

Alternatively, you can look at existing multimap solutions, such as here: http://stackoverflow.com/questions/380595/multimap-in-c-3-0.

For information on using hashtables, see MSDN: http://msdn.microsoft.com/en-us/library/system.collections.hashtable.aspx. There are plenty of other tutorials too; search for either "HashTable" or "Dictionary".

Oak