I have a list of strings like

List<String> MyList=new List<String>{"A","B"};

and a

Dictionary<String, Dictionary<String,String>> MyDict=new Dictionary<String,Dictionary<String,String>>();

which contains

    Key      Value
             Key     Value

    "ONE"    "A_1"   "1"
             "A_2"   "2"
             "X_1"   "3"
             "X_2"   "4"
             "B_1"   "5"

    "TWO"    "Y_1"   "1"
             "B_9"   "2"
             "A_4"   "3"
             "B_2"   "6"
             "X_3"   "7"

I need to merge the list and the dictionary into a new dictionary

 Dictionary<String,String> ResultDict = new Dictionary<String,String>();

The resulting dictionary contains

Key Value

"A_1"   "1"
"A_2"   "2"
"B_1"   "5"
"A_4"   "3"
"B_2"   "6"
"X_2"   "4"
"X_3"   "7"

Merge rule

  1. First add the items whose key prefix (the part before the last '_') equals any item in the list.
  2. Then merge in the remaining items of "MyDict", so that the result contains neither duplicate keys nor duplicate values.

Here is my source code.

        Dictionary<String, String> ResultDict = new Dictionary<string, string>();
        List<String> TempList = new List<string>(MyDict.Keys);
        for (int i = 0; i < TempList.Count; i++)
        {
            ResultDict = ResultDict.Concat(MyDict[TempList[i]])
                                              .Where(TEMP => MyList.Contains(TEMP.Key.Contains('_') == true ? TEMP.Key.Substring(0, TEMP.Key.LastIndexOf('_'))
                                                                                                            : TEMP.Key.Trim()))
                                              .ToLookup(TEMP => TEMP.Key, TEMP => TEMP.Value)
                                              .ToDictionary(TEMP => TEMP.Key, TEMP => TEMP.First())
                                              .GroupBy(pair => pair.Value)
                                              .Select(group => group.First())
                                              .ToDictionary(pair => pair.Key, pair => pair.Value);
        }
        for (int i = 0; i < TempList.Count; i++)
        {
            ResultDict = ResultDict.Concat(MyDict[TempList[i]])
                                              .ToLookup(TEMP => TEMP.Key, TEMP => TEMP.Value)
                                              .ToDictionary(TEMP => TEMP.Key, TEMP => TEMP.First())
                                              .GroupBy(pair => pair.Value)
                                              .Select(group => group.First())
                                              .ToDictionary(pair => pair.Key, pair => pair.Value);
        }

It's working fine, but I need to eliminate the two for loops, or at least one of them. Is there any way to do this using LINQ or a lambda expression?

+1  A: 

Loop-wise this code is simpler, but it is not LINQ:

// _myDict and _myList are the question's MyDict and MyList, held as static fields.
public static Dictionary<string, string> Test()
{
    int initcount = _myDict.Sum(keyValuePair => keyValuePair.Value.Count);

    var usedValues = new Dictionary<string, string>(initcount); //reverse val/key
    var result = new Dictionary<string, string>(initcount);
    foreach (KeyValuePair<string, Dictionary<string, string>> internalDicts in _myDict)
    {
        foreach (KeyValuePair<string, string> valuePair in internalDicts.Value)
        {
            bool add = false;
            if (KeyInList(_myList, valuePair.Key))
            {
                string removeKey;
                if (usedValues.TryGetValue(valuePair.Value, out removeKey))
                {
                    // The value is already held by another list key: keep the earlier one.
                    if (KeyInList(_myList, removeKey)) continue;
                    // Otherwise a non-list key holds the value: evict it.
                    result.Remove(removeKey);
                }
                usedValues.Remove(valuePair.Value);
                add = true;
            }
            // Non-list keys are skipped when their value is already in use.
            if (!add && usedValues.ContainsKey(valuePair.Value)) continue;
            result[valuePair.Key] = valuePair.Value;
            usedValues[valuePair.Value] = valuePair.Key;
        }
    }
    return result;
}

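private static bool KeyInList(List<string> myList, string subKey)
{
    // Keys without '_' are compared whole; Substring(0, -1) would throw.
    int i = subKey.LastIndexOf('_');
    string key = i < 0 ? subKey.Trim() : subKey.Substring(0, i);
    return myList.Contains(key);
}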
Mikael Svenson
@Mikael Svenson: This method is fine, but it takes longer to execute. I have a large volume of data to process; it takes around 12.45 seconds while the other takes only 0.016 sec.
Pramodh
I'll fiddle with it some more and see if I can squeeze something else from my brain.
Mikael Svenson
Just benchmarked on the small test data, and my version was about 5x faster than yours. Try to initialize the Dicts with expected number of items.
Mikael Svenson
Edited my code to work with larger sets: I replaced result.ContainsValue(...) with usedValues.ContainsKey(...). An obvious time saver, as ContainsValue is O(n) whereas ContainsKey is O(1).
Mikael Svenson
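A minimal sketch of the idea in the comment above: keep a hash-based record of the values already used next to the result dictionary, so the duplicate-value check never has to call ContainsValue. The class and method names (`UniqueValueDictionary`, `TryAddUnique`) are hypothetical and only for illustration; a `HashSet<string>` is assumed here instead of the reverse dictionary used in the answer.

```csharp
using System;
using System.Collections.Generic;

static class UniqueValueDictionary
{
    // Hypothetical helper: adds the pair only if neither the key nor the
    // value is already present. Both checks are hash lookups, so no scan
    // over the dictionary's values is needed.
    public static bool TryAddUnique(
        Dictionary<string, string> result,
        HashSet<string> usedValues,
        string key,
        string value)
    {
        if (result.ContainsKey(key) || usedValues.Contains(value))
            return false;

        result.Add(key, value);
        usedValues.Add(value);
        return true;
    }
}
```

The caller is expected to create `result` and `usedValues` together and to modify them only through this helper, otherwise the two collections can drift apart.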
If `_myList` could potentially contain many values then you *might* squeeze an extra bit of performance by populating a `HashSet<>` from it before you start looping: `HashSet<>.Contains` is O(1) whereas `List<>.Contains` is O(n).
LukeH
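A rough sketch of the HashSet suggestion, assuming the list is converted once before the loops start. `PrefixLookup`, `PrefixOf` and `KeyInSet` are hypothetical names, not part of the answer above; the prefix rule (text before the last '_', or the trimmed key when there is no '_') follows the question's code.

```csharp
using System;
using System.Collections.Generic;

static class PrefixLookup
{
    // Returns the part of the key before the last '_',
    // or the trimmed key itself when it contains no '_'.
    public static string PrefixOf(string key)
    {
        int i = key.LastIndexOf('_');
        return i < 0 ? key.Trim() : key.Substring(0, i);
    }

    // Same test as KeyInList, but against a set built once up front,
    // so each lookup is a hash probe rather than a scan of the list.
    public static bool KeyInSet(HashSet<string> prefixes, string key)
    {
        return prefixes.Contains(PrefixOf(key));
    }
}
```

The set would be built once, for example with `new HashSet<string>(_myList)`, and then passed to every call. For a two-item list like the one in the question the difference is probably negligible.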
+2  A: 

Here's one way you could do it with LINQ and lambdas, as requested:

var keysFromList = new HashSet<string>(MyList);
var results =
    MyDict.Values
          .SelectMany(x => x)
          .OrderBy(x => {
                            int i = x.Key.LastIndexOf('_');
                            string k = (i < 0) ? x.Key.Trim() 
                                               : x.Key.Substring(0, i);
                            return keysFromList.Contains(k) ? 0 : 1;
                        })
          .Aggregate(new {
                             Results = new Dictionary<string, string>(),
                             Values = new HashSet<string>()
                         },
                     (a, x) => {
                                   if (!a.Results.ContainsKey(x.Key)
                                           && !a.Values.Contains(x.Value))
                                   {
                                       a.Results.Add(x.Key, x.Value);
                                       a.Values.Add(x.Value);
                                   }
                                   return a;
                               },
                     a => a.Results);
LukeH
Nice solution :) Double the speed of my foreach, but most likely it won't matter since both are way faster than the original.
Mikael Svenson
@Mikael: Really? I didn't benchmark but was expecting that your version would be faster, if anything, especially for larger sets of data. Yours looks like O(n) to me whereas using `OrderBy` would make mine roughly O(n log n). (I originally wrote something almost exactly the same as yours, then noticed that you'd already posted it so did the LINQ version instead!)
LukeH
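One possible way to keep the "list keys first" ordering without paying for a sort, sketched under assumptions: filter the flattened pairs twice (matching keys, then the rest) and concatenate the two passes, which stays linear in the number of pairs. `LinearMerge`, `Merge` and `PrefixOf` are hypothetical names. The sketch also relies on the dictionaries enumerating in insertion order, which is what happens in practice when nothing has been removed but is not guaranteed by the documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LinearMerge
{
    // Part of the key before the last '_', or the trimmed key if there is none.
    static string PrefixOf(string key)
    {
        int i = key.LastIndexOf('_');
        return i < 0 ? key.Trim() : key.Substring(0, i);
    }

    public static Dictionary<string, string> Merge(
        Dictionary<string, Dictionary<string, string>> source,
        IEnumerable<string> priorityPrefixes)
    {
        var prefixes = new HashSet<string>(priorityPrefixes);

        // Flatten once so that both filter passes see the same sequence.
        var pairs = source.Values.SelectMany(d => d).ToList();

        // Two linear passes instead of OrderBy: priority pairs first, then the rest.
        var ordered = pairs.Where(p => prefixes.Contains(PrefixOf(p.Key)))
                           .Concat(pairs.Where(p => !prefixes.Contains(PrefixOf(p.Key))));

        var result = new Dictionary<string, string>();
        var usedValues = new HashSet<string>();
        foreach (var p in ordered)
        {
            // Skip anything that would duplicate a key or a value.
            if (!result.ContainsKey(p.Key) && !usedValues.Contains(p.Value))
            {
                result.Add(p.Key, p.Value);
                usedValues.Add(p.Value);
            }
        }
        return result;
    }
}
```

Called as `LinearMerge.Merge(MyDict, MyList)` on the question's sample data, this should produce seven entries, with "X_1" and "Y_1" dropped because their values are already taken by list keys.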
I meant, mine was quicker :) I can see now that I used the wrong words in that sentence.
Mikael Svenson