How to convert Dictionary<String,List<String>> into Dictionary<String,String>

I have a dictionary like this:

Dictionary<String,List<String>> dictOne = new Dictionary<String,List<String>>();

which contains:

Key(String)          Value(List<String>)

     A                a1,a2
     B                b1,b2
     C                c1

I need to convert dictOne into

Dictionary<String,String> dictReverse = new Dictionary<String,String>();

so that the result looks like this:

Key(String)         Value(String)

     a1                A
     a2                A
     b1                B
     b2                B
     c1                C

Is there any way to do this using LINQ?

Thanks in advance.

+8  A: 
// Associates each key with each of its values. Produces a sequence like:
// {A, a1}, {A, a2}, {B, b1}, {B, b2}, {C, c1}            
var kvps = from kvp in dictOne
           from value in kvp.Value
           select new { Key = kvp.Key, Value = value };    

// Turns the sequence into a dictionary, with the old 'Value' as the new 'Key'
var dictReverse = kvps.ToDictionary(kvp => kvp.Value, kvp => kvp.Key);

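Of course, this only works if every value in the original dictionary is unique: no value may appear more than once in a list, and no value may appear in the lists of two different keys. Otherwise ToDictionary throws an ArgumentException, because the same value would be added as a key twice.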

Also bear in mind that Dictionary<K, V> does not define any sort of enumeration order. You can use the Enumerable.OrderBy method to enumerate the resulting dictionary in the appropriate order.
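
As a rough illustration of that last point, the sketch below enumerates a reversed dictionary in key order. The sample contents and the choice of StringComparer.Ordinal are assumptions made for the example; OrderBy returns a sorted sequence and leaves the dictionary itself unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data, deliberately added out of order (assumed for illustration).
var dictReverse = new Dictionary<string, string>
{
    { "b2", "B" }, { "a1", "A" }, { "c1", "C" }, { "a2", "A" }, { "b1", "B" }
};

// OrderBy produces a sorted sequence of key/value pairs.
// StringComparer.Ordinal keeps the ordering independent of the current culture.
var ordered = dictReverse
    .OrderBy(kvp => kvp.Key, StringComparer.Ordinal)
    .ToList();

foreach (var kvp in ordered)
{
    Console.WriteLine("{0}: {1}", kvp.Key, kvp.Value);
}
```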

Ani
+1 for the valuable additional info you've pointed out, and for actually using LINQ query syntax, unlike me (don't know why I never write it that way -- it's clearly more readable).
Dan Tao
Why would you need to sort? You're going from an unordered set to another unordered set, so sorting would be pointless.
Gabe
@Dan Tao: Thanks. I'm much more comfortable with the extension method syntax; I've been forcing myself to start writing query syntax to try to get used to it.
Ani
@Gabe: Any sorting would have to come *after* the creation of the second dictionary. e.g. `dictReverse.OrderBy(kvp => kvp.Value)`. Of course, it does not alter the unordered nature of the underlying collection.
Ani
@Gabe: While I agree in principle, the OP's example dictionary did *happen* to be in alphabetical order by key, and if he added the elements in that order and expected to be able to enumerate over them in the same order he *may* have gotten away with it. So I think Ani's right to point out that if you *want* your data to be in a certain order, you should be sure to use some method such as `OrderBy` to ensure that it will be.
Dan Tao
Presumably he would use a `SortedDictionary` if he cared about the order of the keys, no?
Gabe
@Gabe: Well, I guess so, but... (1) We shouldn't presume that those asking questions know everything that we know. If they did, they wouldn't be asking the question, right? (2) I think Ani's making the point that, since clearly a `List<T>` is in a defined order, you *might* want a `Dictionary<T, List<T>>` to be reversible to a `Dictionary<T, T>` in the order defined by the `List<T>` values; and (3) quite simply, enumerating in that order is most straightforward using `OrderBy` (since there is no `ToSortedDictionary`).
Dan Tao
+10  A: 

Update: As others have noted, in order for a dictionary to be truly "reversible" in this way, the values in your List<string> objects need to all be unique; otherwise, you cannot create a Dictionary<string, string> with an entry for every value in your source dictionary, as there would be duplicate keys.

Example:

var dictOne = new Dictionary<string, List<string>>
{
    { "A", new List<string> { "a1", "a2" } },
    { "B", new List<string> { "b1", "b2" } },
    { "C", new List<string> { "c1", "a2" } } // duplicate!
};

You have (at least) two options for dealing with this.

Option 1: Throw on duplicates

You may want to ensure that every element in every List<string> is, in fact, unique. In this case, a simple SelectMany with a ToDictionary will accomplish what you need; the ToDictionary call will throw an ArgumentException on encountering a duplicate value:

var dictTwo = dictOne
    .SelectMany(kvp => kvp.Value.Select(s => new { Key = s, Value = kvp.Key }))
    .ToDictionary(x => x.Key, x => x.Value);
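
To see the throwing behavior in isolation, here is a small sketch that runs the same query against a source with one duplicated value and catches the exception. The variable names and the message text are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var withDuplicate = new Dictionary<string, List<string>>
{
    { "A", new List<string> { "a1", "a2" } },
    { "C", new List<string> { "c1", "a2" } } // "a2" appears under both A and C
};

bool threw = false;
try
{
    var reversed = withDuplicate
        .SelectMany(kvp => kvp.Value.Select(s => new { Key = s, Value = kvp.Key }))
        .ToDictionary(x => x.Key, x => x.Value);
}
catch (ArgumentException)
{
    // ToDictionary rejects the second "a2" because that key already exists.
    threw = true;
}

Console.WriteLine(threw ? "Duplicate value detected" : "Reversed without conflicts");
```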

The most generic way (that comes to mind) to abstract this functionality into its own method would be to implement an extension method that does this for any IDictionary<T, TEnumerable> implementation where TEnumerable implements IEnumerable<TValue>:

// Code uglified to fit within horizontal scroll area
public static Dictionary<T2, T1> ReverseDictionary<T1, T2, TEnumerable>(
    this IDictionary<T1, TEnumerable> source) where TEnumerable : IEnumerable<T2>
{
    return source
        .SelectMany(e => e.Value.Select(s => new { Key = s, Value = e.Key }))
        .ToDictionary(x => x.Key, x => x.Value);
}

The ugly proliferation of generic type parameters in the above method is to allow for types other than strictly Dictionary<T, List<T>>: it could accept a Dictionary<int, string[]>, for example, or a SortedList<string, Queue<DateTime>> -- just a couple of arbitrary examples to demonstrate its flexibility.

(A test program illustrating this method is at the bottom of this answer.)

Option 2: Skip duplicates

If duplicate elements in your List<string> values are a realistic scenario that you want to handle without throwing an exception, I suggest you take a look at Gabe's excellent answer for an approach that uses GroupBy. Gabe also provides a flexible approach that can cover either of these two cases based on a selector function. However, if you definitely want to throw on a duplicate, I'd still suggest the above approach, as it should be somewhat cheaper than using GroupBy.
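
For a rough idea of what the GroupBy route looks like in method syntax, here is a minimal sketch. It is not a copy of Gabe's code; the property names are hypothetical, and keeping the first key seen for a duplicated value is just one possible policy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var dictOne = new Dictionary<string, List<string>>
{
    { "A", new List<string> { "a1", "a2" } },
    { "B", new List<string> { "b1", "b2" } },
    { "C", new List<string> { "c1", "a2" } } // "a2" is a duplicate
};

var dictTwo = dictOne
    // Flatten to (value, source key) pairs.
    .SelectMany(kvp => kvp.Value.Select(s => new { Value = s, SourceKey = kvp.Key }))
    // Collect every source key that contains a given value.
    .GroupBy(x => x.Value, x => x.SourceKey)
    // Keep one source key per value instead of throwing.
    .ToDictionary(g => g.Key, g => g.First());

foreach (var entry in dictTwo)
{
    Console.WriteLine("{0}: {1}", entry.Key, entry.Value);
}
```

Which key wins for "a2" depends on the order in which dictOne is enumerated, and Dictionary does not guarantee that order, so a deterministic rule such as Min may be a safer choice when the result matters.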

Example program

Here's a little test program illustrating Option 1 above on a Dictionary<string, List<string>> with no duplicate elements in its List<string> values:

var dictOne = new Dictionary<string, List<string>>
{
    { "A", new List<string> { "a1", "a2" } },
    { "B", new List<string> { "b1", "b2" } },
    { "C", new List<string> { "c1" } }
};

// Using ReverseDictionary implementation described above:
var dictTwo = dictOne.ReverseDictionary<string, string, List<string>>();

foreach (var entry in dictTwo)
{
    Console.WriteLine("{0}: {1}", entry.Key, entry.Value);
}

Output:

a1: A
a2: A
b1: B
b2: B
c1: C
Dan Tao
+1 - just finished writing the exact same code!
Will
@Will: Nice, great minds think alike ;)
Dan Tao
+1 I also prefer this one line code using `SelectMany` instead of two `from`.
Danny Chen
It should be noted, perhaps, that if a value occurs twice in any of the lists (which is permitted), this code will throw. Of course, this may be exactly what you want, but it should nonetheless be stated.
Timwi
@Timwi: True -- that's a good point that Gabe made in his answer (with an alternative approach). I've updated my answer and linked to his for completeness.
Dan Tao
+5  A: 

If a value appears under more than one key in the source dictionary, it would become a duplicate key in the result dictionary, so you have to pick a single source key for it. Here's an implementation that just picks the first one it sees (using First):

var dictReverse = (from kvp in dictOne
                   from value in kvp.Value
                   group kvp.Key by value)
                   .ToDictionary(grp => grp.Key, grp => grp.First());

Given this input dictionary:

var dictOne = new Dictionary<string, IEnumerable<string>> { 
    { "C", new List<string> { "c1", "a2" } },
    { "B", new List<string> { "b1", "b2" } },
    { "A", new List<string> { "a1", "a2" } } };

The result would be:

c1: C
a2: C
b1: B
b2: B
a1: A

As Dan points out, you may want different behavior in the case of duplicate keys. You can create this function:

public static Dictionary<V, K> Transpose<K, V>(
    this Dictionary<K, IEnumerable<V>> dictOne,
    Func<IEnumerable<K>, K> selector)
{
    return (from kvp in dictOne
            from V value in kvp.Value
            group kvp.Key by value)
                .ToDictionary(grp => grp.Key, grp => selector(grp));
}

Then you could call dictOne.Transpose(Enumerable.First) to get the above behavior, dictOne.Transpose(Enumerable.Single) to get an exception when there's a duplicate key (the behavior of the other answers), dictOne.Transpose(Enumerable.Min) to pick the first one lexicographically, or pass in your own function to do whatever you need.
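
A hedged sketch of how different selectors change the outcome follows. To keep it standalone, Transpose is restated as a plain local function in method syntax rather than as the extension method above, and lambdas are passed instead of method groups; both are choices made for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Non-extension restatement of Transpose so the snippet compiles on its own.
static Dictionary<V, K> Transpose<K, V>(
    Dictionary<K, IEnumerable<V>> source,
    Func<IEnumerable<K>, K> selector)
{
    return source
        .SelectMany(kvp => kvp.Value.Select(v => new { Value = v, kvp.Key }))
        .GroupBy(x => x.Value, x => x.Key)
        .ToDictionary(g => g.Key, g => selector(g));
}

var dictOne = new Dictionary<string, IEnumerable<string>>
{
    { "C", new List<string> { "c1", "a2" } },
    { "B", new List<string> { "b1", "b2" } },
    { "A", new List<string> { "a1", "a2" } }
};

// Min picks the smallest key, so the result does not depend on enumeration order.
var smallest = Transpose(dictOne, keys => keys.Min());
Console.WriteLine(smallest["a2"]); // A

// Single throws InvalidOperationException when a value belongs to more than one key.
try
{
    Transpose(dictOne, keys => keys.Single());
}
catch (InvalidOperationException)
{
    Console.WriteLine("a2 belongs to more than one key");
}
```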

Gabe
+1: A good idea, assuming the OP wants to be able to skip duplicate keys. If he wants to ensure the uniqueness of the items in each `List<string>` (unclear from the question), he may *want* `ToDictionary` to throw an exception in the event of a duplicate.
Dan Tao
Dan: good point. See my edit.
Gabe
@Gabe: I like the idea, although unfortunately I don't think taking a `Dictionary<K, IEnumerable<V>>` as an argument is quite as flexible as you would want: a `Dictionary<T, List<T>>` (as in the OP's question) wouldn't work, for example. Of course the simple solution is to take an `IDictionary<K, TEnumerable>` with a `where TEnumerable : IEnumerable<V>` constraint (see my update).
Dan Tao
Dan: I like how you used TEnumerable but I don't like how that prevents C# from inferring the type, which is why I opted to use IEnumerable<V> instead. I suppose in production code I might provide both options.
Gabe
@Gabe: I agree: it's a pain to write out once the compiler can't infer the type. But I think it's the lesser of two evils (though I like your idea of providing both). Honestly, I seldom see variables declared as `Dictionary<T, IEnumerable<T>>` -- seems like asking for trouble, e.g., if you are adding the results of LINQ queries (which will be lazily evaluated) to your dictionary. I see things like `Dictionary<T, List<T>>` (for better or for worse) a lot more often.
Dan Tao