I'm writing a tool, and its first step is to collect all the header files in our public API. The problem is that two of the header files have the same file name (they reside in different folders), which causes a problem when building a dictionary keyed by file name.

Originally I wrote a foreach loop to collect FileInfo instances into a dictionary. Lately I've been learning LINQ, so I wanted to convert the foreach loop into a LINQ statement. The problem is that when it executes, it complains about the duplicate file name.

Here is the original code:

public Dictionary<String, FileDependency> GetSDKFiles(DirectoryInfo dir)
{
    Dictionary<String, FileDependency> list = new Dictionary<String, FileDependency>();
    foreach (FileInfo info in dir.EnumerateFiles("*.h", SearchOption.AllDirectories))
    {
        String key = info.Name.ToLower();
        if (list.ContainsKey(key) == false)
        {
            list.Add(key, new FileDependency(info.FullName));
        }
        else
        {
            Debug.Print("Duplicate key: {0}", info.Name);
            Debug.Print("  File: {0}", info.FullName);
            Debug.Print("  Have: {0}", list[key].FullFileName);
        }
    }

    return list;
}

Which I tried turning into LINQ like so:

public Dictionary<String, FileDependency> GetSDKFilesLINQ(DirectoryInfo dir)
{
    var files = from info in dir.EnumerateFiles("*.h", SearchOption.AllDirectories)
                let key = info.Name.ToLower()
                let dep = new FileDependency(info.FullName)
                select new { key, dep };
    return files.ToDictionary(v => v.key, v => v.dep);
}

However at runtime I get this:

An item with the same key has already been added.

In the foreach loop it was easy to avoid that, since I called the ContainsKey method to make sure I had no duplicates. But what is the LINQ equivalent?

Do I use where? - How? Do I use group? - How?

Thanks.

+4  A: 
var files = dir.EnumerateFiles("*.h", SearchOption.AllDirectories)
               .GroupBy(file => file.Name.ToLower())
               .Select(group => new {Key = group.Key, Value = group.First()})
               .ToDictionary(a => a.Key, a => new FileDependency (a.Value.FullName));

If you have MoreLINQ, you can use its DistinctBy operator:

var files = dir.EnumerateFiles("*.h", SearchOption.AllDirectories)
               .DistinctBy(file => file.Name.ToLower())
               .ToDictionary(file => file.Name.ToLower(), file => new FileDependency(file.FullName));

Alternatively, you can write your own IEqualityComparer implementation for the files and use the standard Distinct method. The underlying problem is that Distinct (at least as of .NET 3.5) only accepts an IEqualityComparer; it has no overload that lets you supply your own definition of "distinctness" as a lambda expression.
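
For illustration, here is a minimal sketch of that comparer route. FileNameComparer and SdkFiles are hypothetical names, and the dictionary stores plain full paths because FileDependency is the asker's own type. It also uses a case-insensitive dictionary instead of lower-cased keys, so the comparer and the dictionary agree on what counts as the same name. Which of the duplicates survives is, in practice, the first one enumerated, but neither Distinct nor EnumerateFiles documents a guaranteed order.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical comparer: two FileInfo instances are "equal"
// when their file names match, ignoring case.
public sealed class FileNameComparer : IEqualityComparer<FileInfo>
{
    public bool Equals(FileInfo x, FileInfo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return string.Equals(x.Name, y.Name, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(FileInfo obj)
    {
        // Must be consistent with Equals: same name (ignoring case), same hash.
        return StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Name);
    }
}

public static class SdkFiles
{
    // Maps file name (case-insensitive) to full path, one entry per name.
    public static Dictionary<string, string> Collect(DirectoryInfo dir)
    {
        return dir.EnumerateFiles("*.h", SearchOption.AllDirectories)
                  .Distinct(new FileNameComparer())
                  .ToDictionary(f => f.Name, f => f.FullName,
                                StringComparer.OrdinalIgnoreCase);
    }
}
```

Calling SdkFiles.Collect(new DirectoryInfo(path)) would then return one entry per distinct header name. Like the GroupBy version, it drops the extra files without reporting them.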

Ani
+1 for non-LINQ answer ;)
Kirk Woll
This works too, exactly the same result as the foreach loop.
C Johnson
+1  A: 

You could group by key and take the first value from the group for dep:

public Dictionary<String, FileDependency> GetSDKFilesLINQ(DirectoryInfo dir)
{
    var files = from info in dir.EnumerateFiles(
                    "*.h", SearchOption.AllDirectories)
                let key = info.Name.ToLower()
                let dep = new FileDependency(info.FullName)
                group dep by key into g
                select new { key = g.Key, dep = g.First() };
    return files.ToDictionary(v => v.key, v => v.dep);
}

That will silently ignore duplicates, keeping only the first file found for each name. Alternatively, you could use a Lookup instead of a Dictionary:

public ILookup<String, FileDependency> GetSDKFilesLINQ2(DirectoryInfo dir)
{
    var files = from info in dir.EnumerateFiles(
                    "*.h", SearchOption.AllDirectories)
                let key = info.Name.ToLower()
                let dep = new FileDependency(info.FullName)
                select new { key, dep };
    return files.ToLookup(v => v.key, v => v.dep);
}

The indexer on the lookup will return an IEnumerable<FileDependency>, so you can see all the values.
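
As a rough sketch of how the lookup could reproduce the duplicate report from the original foreach loop: DuplicateReport and its methods are hypothetical names, and the values are plain full paths rather than FileDependency instances, since that type belongs to the asker's code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class DuplicateReport
{
    // Groups the full paths of all headers under dir by lower-cased file name.
    public static ILookup<string, string> BuildLookup(DirectoryInfo dir)
    {
        return dir.EnumerateFiles("*.h", SearchOption.AllDirectories)
                  .ToLookup(f => f.Name.ToLower(), f => f.FullName);
    }

    // Returns only the groups that contain more than one file.
    public static IEnumerable<IGrouping<string, string>> Duplicates(
        ILookup<string, string> lookup)
    {
        return lookup.Where(g => g.Count() > 1);
    }

    // Prints each duplicated name followed by every path that uses it.
    public static void PrintDuplicates(ILookup<string, string> lookup)
    {
        foreach (var group in Duplicates(lookup))
        {
            Console.WriteLine("Duplicate key: {0}", group.Key);
            foreach (var path in group)
            {
                Console.WriteLine("  File: {0}", path);
            }
        }
    }
}
```

Unlike a Dictionary, indexing a lookup with a key it does not contain returns an empty sequence instead of throwing, so lookup["missing.h"].Any() is a safe existence check.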

Quartermeister
I've tried your approach, and it works. Thanks so very much!
C Johnson