What's the best way to set all values in a C# Dictionary?

Here is what I am doing now, but I'm sure there is a better/cleaner way to do this:

Dictionary<string,bool> dict = GetDictionary();
var keys = dict.Keys.ToList();
for (int i = 0; i < keys.Count; i++)
{
    dict[keys[i]] = false;
}

I have tried some other ways with foreach, but I had errors.

A: 

What errors were you getting using foreach?

This should work:

foreach(var key in dict.Keys)
{
  dict[key] = false;
}
Randolpho
Changing the dictionary invalidates the enumerator, and continuing to use it will throw an `InvalidOperationException`.
280Z28
Excellent point. Have a karma-less +1!
Randolpho
+9  A: 

That is a reasonable approach, although I would prefer:

foreach (var key in dict.Keys.ToList())
{
    dict[key] = false;
}

The call to ToList() makes this work, since it's pulling out and (temporarily) saving the list of keys, so the iteration works.
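
As a rough illustration of the snapshot idea, here is a minimal self-contained sketch. The sample keys are hypothetical and stand in for whatever GetDictionary() returns, and the program assumes a compiler that accepts top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data standing in for GetDictionary().
var dict = new Dictionary<string, bool>
{
    ["alpha"] = true,
    ["beta"] = true,
    ["gamma"] = false,
};

// ToList() copies the keys into a separate list, so the loop enumerates
// the copy rather than the dictionary that is being modified.
foreach (var key in dict.Keys.ToList())
{
    dict[key] = false;
}

Console.WriteLine(dict.Values.All(v => !v)); // True
```

The copy costs one extra list of keys, which is usually a small price for a loop that is safe to run while assigning values.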

Reed Copsey
Looking at my original code, I don't know why I didn't try this.
Ronnie Overby
A: 

You could just pull out the ToList() and iterate directly over the dictionary items:

Dictionary<string, bool> dict = GetDictionary();
foreach (var pair in dict) 
{
    dict[pair.Key] = false;
}
AgileJon
Changing the dictionary invalidates the enumerator, and continuing to use it will throw an `InvalidOperationException`.
280Z28
A: 

Do it the way you have it right now... foreach is slow. foreach may be cleaner, but you take too much of a performance hit using it.

Edit:
http://www.codeproject.com/KB/cs/foreach.aspx
http://www.madprops.org/blog/for-vs-foreach-performance/

Polaris878
How is 'foreach' slow?
Michael Donohue
See above. Sorry for not providing evidence in the original post.
Polaris878
The `foreach` method as listed in the first post *is* slow ... for manipulating the existing internal buckets of the dictionary.
280Z28
The links provided only point out that `foreach` is slower for arrays, not that it is slower for dictionaries (or in general). Furthermore, the first article is simply outdated, and no longer correct even when applied to arrays; and the second article actually measures the performance of `Enumerable.Range` rather than `foreach` as such.
Pavel Minaev
+1 for showing evidence, but I disagree with this in practice. In my experience, for loops cause more bugs and take more lines of code, whereas foreach is clearer and the performance hit is negligible for real applications. The benchmarks are on toy programs where the assembly gets highly optimized to deal with arrays of unboxed values AND THAT IS ALL THE CODE IS DOING. In real code it is not likely to affect your run times except inside of tight loops, in which case you are using unsafe/pointers anyway, right?
Jared Updike
This would be called premature optimization. It is doubtful that a reset algorithm is in the critical path of any algorithm.
Michael Donohue
It's still going to be slower going over the list using foreach than going over the list with a for loop...
Polaris878
@Pavel... it is slower for anything using IEnumerable
Polaris878
Your links are not evidence. The first link is for an array, and does not even profile the difference. The second link is for Enumerable.Range... for dictionaries the result can be very different.
Meta-Knight
'but you take too much of a performance hit using it.' - You can't really say this without knowing the use case of the code. If it takes 10ms longer and is run once in the whole app run, then the performance hit might not make any difference. If the loop is run 100 million times, then _maybe_ you have an argument. -1 for premature optimisation.
Sam Holder
You don't need an article that pertains directly to Dictionary, since the underlying data structure uses IEnumerable... the findings are true for anything using IEnumerable. @bebop... yes, this may not be run 100 million times, but in the managed code world I'm going to take whatever speed I can get.
Polaris878
And Reed's answer is converting the dictionary to a list anyways... which, lo and behold, uses an array behind the scenes. So if anything, these articles are perfectly relevant.
Polaris878
@Polaris - I just ran a test iterating through 1 million `Dictionary<string, bool>` records and changed all their values from `true` to `false`. The performance is nearly identical between `for` and `foreach` on my machine. Surprisingly, the `foreach` loop occasionally completed more quickly than the `for` loop. If, however, I do not set any values inside the loop, the `for` loop completes twice as quickly. This is meaningless, though, because the loop isn't actually doing anything.
John Rasch
Polaris, check out my answer.
Meta-Knight
Thanks John and Meta, but please note that Reed's and other answers are actually going over a list. But I probably stand corrected on the case when a Dict. is used.
Polaris878
Okay, never mind, after reading Meta's answer. Sorry for sucking.
Polaris878
+2  A: 

If you aren't using tri-state bools, then you can use HashSet<string>, and call Clear() to set the values to "false".

280Z28
This is a good alternative
Michael Donohue
I am not using tri-state bools, but I don't understand how to implement what you are talking about. It looks cool, though!
Ronnie Overby
@Billy: If the string is in the `HashSet`, then it's true. If it's not in the `HashSet`, it's false. Use Add/Remove instead of setting true/false. This will be *very* fast, but there's no way to represent a third "missing" state.
280Z28
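
A minimal sketch of the HashSet idea described in the comment above, offered as an illustration rather than anyone's actual code. The variable name and sample keys are hypothetical, and the program assumes a compiler that accepts top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;

// Membership in the set plays the role of "true"; absence means "false".
var enabled = new HashSet<string>();

enabled.Add("alpha");     // like dict["alpha"] = true
enabled.Add("beta");      // like dict["beta"] = true
enabled.Remove("beta");   // like dict["beta"] = false

Console.WriteLine(enabled.Contains("alpha")); // True
Console.WriteLine(enabled.Contains("beta"));  // False

// Resets every flag to "false" in a single call.
enabled.Clear();

Console.WriteLine(enabled.Contains("alpha")); // False
Console.WriteLine(enabled.Count);             // 0
```

Note that the set does not remember which keys ever existed: a key that was never added looks the same as one that was set to false, which is the missing third state mentioned above.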
A: 

I profiled the difference between Billy's and Reed's solutions. Polaris878, take good note of the results and remember that premature optimization is the root of all evil ;-)

I rewrote the solutions in VB (because I'm currently programming in that language) and used int keys (for simplicity), but otherwise it's the exact same code. I ran the code with a dictionary of 10 million entries with a value of "true" for each entry.

Billy Witch Doctor's original solution:

Dim keys = dict.Keys.ToList
For i = 0 To keys.Count - 1
    dict(keys(i)) = False
Next

Elapsed milliseconds: 415

Reed Copsey's solution:

For Each key In dict.Keys.ToList
    dict(key) = False
Next

Elapsed milliseconds: 395

So in that case the foreach is actually faster.
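
For readers who want to repeat the measurement in C#, a rough sketch might look like the following. It is not the VB code that produced the numbers above: the Build helper and the entry count are assumptions chosen for illustration, and the program assumes a compiler that accepts top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

const int Count = 1_000_000; // assumed size; the runs above used 10 million

// Hypothetical helper: builds a dictionary with int keys, every value true.
Dictionary<int, bool> Build() =>
    Enumerable.Range(0, Count).ToDictionary(k => k, _ => true);

// Variant 1: indexed for loop over a copied key list.
var forDict = Build();
var sw = Stopwatch.StartNew();
var keys = forDict.Keys.ToList();
for (int i = 0; i < keys.Count; i++)
{
    forDict[keys[i]] = false;
}
sw.Stop();
Console.WriteLine($"for:     {sw.ElapsedMilliseconds} ms");

// Variant 2: foreach over a copied key list.
var foreachDict = Build();
sw.Restart();
foreach (var key in foreachDict.Keys.ToList())
{
    foreachDict[key] = false;
}
sw.Stop();
Console.WriteLine($"foreach: {sw.ElapsedMilliseconds} ms");
```

A single run like this is noisy (JIT warm-up and garbage collection both affect it), so the printed times are only a rough comparison and will differ from machine to machine.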

Meta-Knight