I am trying to remove entries from a Dictionary<string, List<string>> in C# when the count of the List<string> is less than or equal to 1. I have some working code, but it is not elegant, and I have a gut feeling that this can be done more elegantly in LINQ.

This is the code I have now:

        Dictionary<string,List<string>> FindAnagrams(List<string> dictionary)
        {
            Dictionary<string, List<string>> anagrams = new Dictionary<string, List<string>>();
            foreach (string word in dictionary)
            {
                char[] charArray=word.ToCharArray();
                Array.Sort(charArray);
                string sorted=new string(charArray);
                if (anagrams.ContainsKey(sorted))
                    anagrams[sorted].Add(word);
                else
                    anagrams.Add(sorted, new List<string>() { word });
            }
            List<string> nonAnagrams = new List<string>();
            foreach (var sorted in anagrams.Keys)
                if (anagrams[sorted].Count == 1)
                    nonAnagrams.Add(sorted);
            foreach(string word in nonAnagrams)
                anagrams.Remove(word);               
            return anagrams;
        }

Below is how far I got using LINQ, but it isn't working:

var realAna = from keys in anagrams.Keys
              where anagrams[keys].Count >1
              select anagrams.values;

To put the problem in context: I am trying to find anagrams in a dictionary of words. I consider a word to have anagrams if its sorted key has more than one value associated with it.

+2  A: 

You can indeed do this with LINQ:

Dictionary<string, List<string>> FindAnagrams(List<string> dictionary)
{
    return dictionary
        .GroupBy(w => new string(((IEnumerable<char>)w).OrderBy(c => c).ToArray()))
        .Where(g => g.Count() > 1)
        .ToDictionary(g => g.Key, g => g.ToList());
}

How it works:

  • Group the words by their letters rearranged in sorted order.
  • Select only the groups which have at least two words.
  • Convert the result to a dictionary.
Mark Byers
This is like the most awesomest answer for this question. Just one statement; are you from the Perl world? I had this question in an interview recently. I wonder what the interviewer would have said if he had seen this solution :)
satyajit
@satyajit: An improvement I would suggest to the interviewer: create a method `SortLetters` that takes a string and returns a new string with the letters in sorted order. Then that first line, which is by far the most complicated, becomes just `.GroupBy(word => SortLetters(word))`. With that change I think it is quite a readable solution.
Mark Byers
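For anyone reading later, here is a rough sketch of what that refactoring might look like. The class name `AnagramFinder` and the exact body of `SortLetters` are my own assumptions; only the method name and the `GroupBy` call come from the comment above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AnagramFinder
{
    // Hypothetical helper: returns the characters of `word` in sorted order,
    // so that all anagrams of a word map to the same key.
    public static string SortLetters(string word)
    {
        char[] letters = word.ToCharArray();
        Array.Sort(letters);
        return new string(letters);
    }

    public static Dictionary<string, List<string>> FindAnagrams(IEnumerable<string> words)
    {
        return words
            .GroupBy(word => SortLetters(word))   // group words sharing the same sorted letters
            .Where(g => g.Count() > 1)            // keep only groups with at least two words
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

Note that the comparison is case-sensitive as written, so "Listen" and "silent" would land in different groups unless the words are normalized first.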
+2  A: 
var anagrams = new Dictionary<string, IList<string>>()
{
 {"hello", new List<string>(){"hello", "helol", "hlelo"}},
 {"hi", new List<string>(){"hi"}},
 {"me", new List<string>(){"me", "em"}}
};

var a2 = anagrams
 .Where(x => x.Value.Count > 1)
 .Aggregate(new Dictionary<string, IList<string>>(),
  (acc, item) => { acc.Add(item.Key, item.Value); return acc; });

This uses method-syntax (non-query form) LINQ and is built up programmatically.

  • The Where selects all key/value pairs in the dictionary whose list has more than one item.
  • The Select I removed because it's actually not needed anymore. :)
  • The Aggregate collects the pairs and performs an Add for each item (adding it to the new dictionary). You could also use .ToDictionary(...) here.
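As a rough sketch of the `.ToDictionary(...)` alternative mentioned in the last bullet (the class and method names `AnagramFilter` and `KeepRealAnagrams` are made up for illustration):

```csharp
using System.Collections.Generic;
using System.Linq;

static class AnagramFilter
{
    // Hypothetical helper: returns a new dictionary containing only the
    // entries whose list has more than one item. The input is not modified.
    public static Dictionary<string, IList<string>> KeepRealAnagrams(
        IDictionary<string, IList<string>> anagrams)
    {
        return anagrams
            .Where(pair => pair.Value.Count > 1)
            .ToDictionary(pair => pair.Key, pair => pair.Value);
    }
}
```

The new dictionary shares the same list instances as the original, so this is a shallow copy; that is usually fine for filtering, but worth knowing if the lists are mutated afterwards.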

If you need to sort your sub-lists, change item.Value to item.Value.OrderBy(s => s).ToList()
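A minimal sketch of the sorted variant, kept in the same Aggregate shape as the snippet above. It uses `OrderBy` because `List<T>.Sort` sorts in place and returns nothing, so it cannot be chained. The names `SortedAnagramFilter` and `KeepAndSort` are hypothetical, and the ordinal comparer is my own choice to keep the ordering culture-independent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SortedAnagramFilter
{
    // Hypothetical helper: keeps entries with more than one word and
    // stores a sorted copy of each word list in the result.
    public static Dictionary<string, IList<string>> KeepAndSort(
        IDictionary<string, IList<string>> anagrams)
    {
        return anagrams
            .Where(x => x.Value.Count > 1)
            .Aggregate(new Dictionary<string, IList<string>>(),
                (acc, item) =>
                {
                    // OrderBy returns a new sorted sequence; the original list is untouched.
                    acc.Add(item.Key, item.Value.OrderBy(s => s, StringComparer.Ordinal).ToList());
                    return acc;
                });
    }
}
```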

Aren
I kind of understood your solution. So Aggregate takes the entries in anagrams whose list has more than one element and creates a new dictionary from them?
satyajit
Aggregate performs an operation over a sequence, where `acc` in this example is the accumulator (the dictionary) and `item` is the item currently being iterated. I'm just using it to collect the key/value pairs into a dictionary. I was unaware of the `.ToDictionary` method until Mark posted his answer. I'll update my answer a bit to break out the points of the LINQ query.
Aren
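If the accumulator idea is still unclear, a smaller example may help. This is only an illustrative sketch (the class and method names `AggregateDemo` and `TotalLetters` are invented): the seed is the starting value of `acc`, and the lambda is called once per item with the running result.

```csharp
using System.Collections.Generic;
using System.Linq;

static class AggregateDemo
{
    // Hypothetical example: sums the lengths of all words.
    // `acc` starts at the seed (0) and carries the running total;
    // `item` is the current word.
    public static int TotalLetters(IEnumerable<string> words)
    {
        return words.Aggregate(0, (acc, item) => acc + item.Length);
    }
}
```

The dictionary version in the answer works the same way, except that the seed is an empty dictionary and each step adds an entry and returns the same dictionary.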