ansaurus

Question

Answer 1

+1 A:

As a plain loop this code is simpler, though it isn't LINQ:

public static Dictionary<string, string> Test()
{
    int initcount = _myDict.Sum(keyValuePair => keyValuePair.Value.Count);

    var usedValues = new Dictionary<string, string>(initcount); //reverse val/key
    var result = new Dictionary<string, string>(initcount);
    foreach (KeyValuePair<string, Dictionary<string, string>> internalDicts in _myDict)
    {
        foreach (KeyValuePair<string, string> valuePair in internalDicts.Value)
        {
            bool add = false;
            if (KeyInList(_myList, valuePair.Key))
            {
                string removeKey;
                if (usedValues.TryGetValue(valuePair.Value, out removeKey))
                {
                    if (KeyInList(_myList, removeKey)) continue;
                    result.Remove(removeKey);
                }
                usedValues.Remove(valuePair.Value);
                add = true;
            }
            if (!add && usedValues.ContainsKey(valuePair.Value)) continue;
            result[valuePair.Key] = valuePair.Value;
            usedValues[valuePair.Value] = valuePair.Key;
        }
    }
    return result;
}

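private static bool KeyInList(List<string> myList, string subKey)
{
    // Strip the trailing "_suffix"; a key with no underscore is compared whole
    // (Substring(0, -1) would otherwise throw ArgumentOutOfRangeException).
    int i = subKey.LastIndexOf('_');
    string key = (i < 0) ? subKey : subKey.Substring(0, i);
    return myList.Contains(key);
}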

Mikael Svenson 2010-09-20 08:23:48

@Mikael Svenson: This method works, but it takes much longer to execute. I have a large volume of data to process: it takes around 12.45 seconds, while the other takes only 0.016 seconds.

Pramodh 2010-09-20 08:36:25

I'll fiddle with it some more and see if I can squeeze something else from my brain.

Mikael Svenson 2010-09-20 08:43:09

Just benchmarked on the small test data, and my version was about 5x faster than yours. Try initializing the dictionaries with the expected number of items.

Mikael Svenson 2010-09-20 08:52:55

Edited my code to work with larger sets. I replaced `result.ContainsValue(...)` with `usedValues.ContainsKey(...)`. That is an obvious time saver, as `ContainsValue` is an O(n) scan whereas `ContainsKey` is O(1).

Mikael Svenson 2010-09-20 10:04:03
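
To illustrate the point about `ContainsValue`, here is a minimal sketch of the reverse-index idea on its own, stripped of the list-priority logic in the answer. The class and method names (`ReverseIndexSketch`, `FirstKeyPerValue`) are made up for this example, and it assumes the incoming keys are distinct.

```csharp
using System.Collections.Generic;

public static class ReverseIndexSketch
{
    // Hypothetical helper: keeps the first key seen for each distinct value.
    public static Dictionary<string, string> FirstKeyPerValue(
        IEnumerable<KeyValuePair<string, string>> pairs)
    {
        var result = new Dictionary<string, string>();     // key -> value
        var usedValues = new Dictionary<string, string>(); // value -> key (reverse index)

        foreach (KeyValuePair<string, string> pair in pairs)
        {
            // Hashed lookup on the reverse index, instead of
            // result.ContainsValue(pair.Value), which scans every entry.
            if (usedValues.ContainsKey(pair.Value)) continue;

            result[pair.Key] = pair.Value;
            usedValues[pair.Value] = pair.Key;
        }
        return result;
    }
}
```

The second dictionary costs extra memory, but it turns the duplicate-value check into a hash lookup for every item processed.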

If `_myList` could potentially contain many values then you *might* squeeze an extra bit of performance by populating a `HashSet<>` from it before you start looping: `HashSet<>.Contains` is O(1) whereas `List<>.Contains` is O(n).

LukeH 2010-09-20 10:49:42
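
A rough sketch of that suggestion, assuming the set is built once before the loops start. `KeyLookupSketch` and `KeyInSet` are hypothetical names, and treating a key without an underscore as its own prefix is an assumption of this example.

```csharp
using System.Collections.Generic;

public static class KeyLookupSketch
{
    // Hypothetical variant of KeyInList that takes a pre-built HashSet.
    public static bool KeyInSet(HashSet<string> keys, string subKey)
    {
        // Strip the trailing "_suffix"; keys without an underscore are used whole.
        int i = subKey.LastIndexOf('_');
        string key = (i < 0) ? subKey : subKey.Substring(0, i);

        // Hashed lookup rather than a linear scan of the list.
        return keys.Contains(key);
    }
}
```

The caller would create the set once, for example `var keys = new HashSet<string>(_myList);`, and pass it to every call. Whether this is measurably faster depends on how many entries the list holds.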

Answer 2

+2 A:

Here's one way you could do it with LINQ and lambdas, as requested:

var keysFromList = new HashSet<string>(MyList);
var results =
    MyDict.Values
          .SelectMany(x => x)
          // Entries whose key prefix is in the list sort first (OrderBy is stable).
          .OrderBy(x =>
          {
              int i = x.Key.LastIndexOf('_');
              string k = (i < 0) ? x.Key.Trim() : x.Key.Substring(0, i);
              return keysFromList.Contains(k) ? 0 : 1;
          })
          // Keep the first entry seen for each key and for each value.
          .Aggregate(
              new
              {
                  Results = new Dictionary<string, string>(),
                  Values = new HashSet<string>()
              },
              (a, x) =>
              {
                  if (!a.Results.ContainsKey(x.Key) && !a.Values.Contains(x.Value))
                  {
                      a.Results.Add(x.Key, x.Value);
                      a.Values.Add(x.Value);
                  }
                  return a;
              },
              a => a.Results);

LukeH 2010-09-20 11:40:29

Nice solution :) Double the speed of my foreach, but most likely it won't matter since both are way faster than the original.

Mikael Svenson 2010-09-20 12:02:40

@Mikael: Really? I didn't benchmark but was expecting that your version would be faster, if anything, especially for larger sets of data. Yours looks like O(n) to me whereas using `OrderBy` would make mine roughly O(n log n). (I originally wrote something almost exactly the same as yours, then noticed that you'd already posted it so did the LINQ version instead!)

LukeH 2010-09-20 12:13:54
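
Since the sort key only ever takes two values, the O(n log n) sort could in principle be swapped for two filtered passes. The sketch below is not from either answer and has not been benchmarked; `PartitionSketch` and `ListedFirst` are hypothetical names. It yields the same order as a stable `OrderBy` on a 0/1 key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartitionSketch
{
    // Hypothetical helper: entries whose key prefix is in keysFromList come
    // first, the rest follow, each group in its original order.
    public static IEnumerable<KeyValuePair<string, string>> ListedFirst(
        IEnumerable<KeyValuePair<string, string>> pairs,
        HashSet<string> keysFromList)
    {
        Func<KeyValuePair<string, string>, bool> isListed = x =>
        {
            int i = x.Key.LastIndexOf('_');
            string k = (i < 0) ? x.Key.Trim() : x.Key.Substring(0, i);
            return keysFromList.Contains(k);
        };

        // Materialize once so the source is not enumerated twice.
        List<KeyValuePair<string, string>> all = pairs.ToList();

        // Two linear passes instead of a sort.
        return all.Where(isListed).Concat(all.Where(x => !isListed(x)));
    }
}
```

The result could be fed to the same `Aggregate` call in place of the `OrderBy` step. The trade-off is an extra buffered copy of the flattened entries.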

I meant, mine was quicker :) I can see now that I used the wrong words in that sentence.

Mikael Svenson 2010-09-20 12:51:26

C# : Merging Dictionary and List
