tags:

views:

295

answers:

3

I'm using linq on a DataTable (in c#), and was wondering how to group by multiple fields. I discovered that it can be done with an anonymous class, e.g.

var a = dt.AsEnumerable().GroupBy(e => new { name = e["Name"] })

problem is, my grouping key is dynamically determined at runtime. so I instead tried grouping by a Dictionary instead:

var a = dt.AsEnumerable().GroupBy(e => GetKey(e))

where GetKey(e) returns a Dictionary<string, object>. the idea is that the dictionary values sort of take the place of the anonymous class keys and values. my problem is that the linq query no longer works as desired - it doesn't seem to do any grouping at all. my hunch is that it's because internally it has to compare the grouping key for each DataTable row, and the dictionary keys aren't considered equal just because they have the same keys and values, so each row has a different grouping key and are thus not aggregated.

if I'm right on that, what's the proper way to address this? I tried wrapping the dictionary in a class, and overriding the Equals() method, but it was never called.

+1  A: 

Why not just have GetKey() return the key as a string?

var a = dt.AsEnumerable().GroupBy(e => new { name = e[GetKey(e)] });

You can create the key from the values in specified columns, and make it into one string to group on:

var keyDictionary = new Dictionary<string, IEnumerable<string>>();
keyDictionary.Add("Table1", new List<string> {"Group", "Position"});

var dt = new DataTable("Table1");
dt.Columns.AddRange(new [] { new DataColumn("Id", typeof(int)), new DataColumn("Group", typeof(string)), new DataColumn("Position", typeof(string)), new DataColumn("Name", typeof(string))});
var rowItemArrays = new [] { new object[] { 1, "Alpha", "Left", "Bob" }, new object[] { 2, "Alpha", "Right", "Mary"}, new object[] { 3, "Beta", "Right", "Bill"}, new object[] { 4, "Alpha", "Right", "Larry"}};
rowItemArrays.ToList().ForEach(i => dt.Rows.Add(i));

Func<DataRow, string> GetKeys = (dataRow) => string.Join("", keyDictionary[dataRow.Table.TableName].Select(key => dataRow[key].ToString()).ToArray());

var a = dt.AsEnumerable().GroupBy(GetKeys);

You'd have to watch out for null values etc....

Richard Hein
thought of that, I didn't go with it because I'm making use of the dictionary data when I iterate through the linq query results. I suppose I could still just append all the values into a single string and parse it back to key/value pairs.
toasteroven
Ok, I've added something that works for me.
Richard Hein
+1  A: 

This is cribbed from the help files and something I haven't implemented, but should work. The problem is that you need a single class for it to compare and it uses both ToString and GetHashCode in the comparison (which is why your dictionary idea didn't work, it isn't comparing the elements of the dictionary, it's comparing the ToString and GetHashCode of it). Have GetKey return the following class and populate the keyBag of the class with your Dictionary from above:

class PortableKey
{
    public Dictionary<string, object> keyBag { get; set; }

    public PortableKey(Dictionary<string, object> Keys)
    {
        this.keyBag = Keys;
    }

    public override bool Equals(object obj)
    {
        PortableKey other = (PortableKey)obj;
        foreach (KeyValuePair<string, object> key in keyBag)
        {
            if (other.keyBag[key.Key] != key.Value) return false;
        }
        return true;
    }

    public override int GetHashCode()
    {
        // hashCodes is an array of integers represented as strings. { "1", "4", etc. }
        string[] hashCodes = keyBag.Select(k => k.Value.GetHashCode().ToString()).ToArray();
        // hash is the Hash Codes all joined in a single string. "1,4,etc."
        string hash = string.Join(",", hashCodes);
        // returns a single hash code for the combined hash. 
        // Note, this is not guaranteed unique, nor is it intended to be so.
        return hash.GetHashCode();
    }
    public override string ToString()
    {
        string[] values = keyBag.Select(k => k.Value.ToString()).ToArray();
        return string.Join(",", values);
    }
}
Jacob Proffitt
A: 
var keyDictionary = new Dictionary<string, IEnumerable<string>>();
keyDictionary.Add("Table1", new List<string> {"Group", "Position"});

var dt = new DataTable("Table1");
dt.Columns.AddRange(new [] { new DataColumn("Id", typeof(int)), new DataColumn("Group", typeof(string)), new DataColumn("Position", typeof(string)), new DataColumn("Name", typeof(string))});
var rowItemArrays = new [] { new object[] { 1, "Alpha", "Left", "Bob" }, new object[] { 2, "Alpha", "Right", "Mary"}, new object[] { 3, "Beta", "Right", "Bill"}, new object[] { 4, "Alpha", "Right", "Larry"}};
rowItemArrays.ToList().ForEach(i => dt.Rows.Add(i));

Func<DataRow, string> GetKeys = (dataRow) => string.Join("", keyDictionary[dataRow.Table.TableName].Select(key => dataRow[key].ToString()).ToArray());

var a = dt.AsEnumerable().GroupBy(GetKeys);

This is best logic you can try my friend, we have lot of studies regarding this, so what answer i have written is the logic given by my professor