views:

108

answers:

5
A: 

The use of a dictionary to me implies that the intention is to find items by a key, rather than visit every item. On the other hand, 600ms for looping through a million items is respectable.

Perhaps alter your logic so that you can simply pick the relevant items satisfying the condition directly out of the dictionary.

Mitch Wheat
There's nothing wrong with looping over the contents of a dictionary, so long as you don't expect ordering and you don't try to modify it at the same time.
Jon Skeet
@Jon: the use of a dictionary to me implies that the intention is to find items by a key, rather than visit every item. Which is what I meant. Will edit.
Mitch Wheat
@Mitch: It means that *something* needs to be able to find items by key. That doesn't mean it's inappropriate to *also* use the same data structure for iteration.
Jon Skeet
@Jon: Yes, I understand that. If you look at someones elses code and see a Dictionary in use, what's the first thing you'd assume? I am not saying that it is necessarily inappropriate to also iterate over it.
Mitch Wheat
+2  A: 

My first suggestion would be to use just the values from the dictionary:

foreach (BusinessObject> value in _dictionnary.Values)
{
    if(value.MustDoJob)
    {
        value.DoJob();
    }
}

With LINQ this could be even easier:

foreach (BusinessObject value in _dictionnary.Values.Where(v => v.MustDoJob))
{
    value.DoJob();
}

That makes it clearer. However, it's not clear what else is actually causing you a problem. How quickly do you need to be able to iterate over the dictionary? I expect it's already pretty nippy... is anything actually wrong with this brute force approach? What's the impact of it taking 600ms to iterate over the collection? Is that 600ms when nothing needs to do any work?

One thing to note: you can't change the contents of the dictionary while you're iterating over it - whether in this thread or another. That means not adding, removing or replacing key/value pairs. It's okay for the contents of a BusinessObject to change, but the dictionary relationship between the key and the object can't change. If you want to minimise the time during which you can't modify the dictionary, you can take a copy of the list of references to objects which need work doing, and then iterate over that:

foreach (BusinessObject value in _dictionnary.Values
                                             .Where(v => v.MustDoJob)
                                             .ToList())
{
    value.DoJob();
}
Jon Skeet
Ok to make a copy of the objects which need work doing but is this not going to be time consuming ? Does the where statement be more effective that the manual foreach ?
Toto
@Toto: Using `Where` just makes it easier to read (IMO) - it won't make it cheaper. Note that it won't be copying *objects*, just *references*. As to how time-consuming that will be - obviously it depends on how many jobs need doing. It's likely to be a lot cheaper than the cost of actually doing the jobs though.
Jon Skeet
@Jon: When you said "It's likely to be a lot cheaper than the cost of actually doing the jobs though.", you are talking about the cost of finding the objects which need work doing , do you ? If so, why the cost is going to be a lot cheaper ?
Toto
@Toto: I was talking about the cost of copying references. If your actual work is doing *anything* across a COM boundary, I'd be incredibly surprised to find out that the cost of copying the references to a new list was a bottleneck.
Jon Skeet
+2  A: 

Try using a profiler first. 4 makes me curious - 600ms may not be that much if the COM object uses most of the time, and then it is either paralellize or live with it.

I would get sure first - with a profiler run - that you dont target the totally wrong issue here.

TomTom
I don't know what a profiler is, any link ? I supposed that it was the loop since I removed 200 000 objects from the list (whose MustDoJob == false and thus DoJob not called) and the total time deacreased significantly (around 100 ms). At least there is something to do with the loop improvment.
Toto
Hm, RedGate has a good profiler - ANTS (http://www.redgate.com/, they have a trial). Otherwise the Visual Dutio profiler itself (sorry, high end version only) is VERY good - especialyl the 2010 one.
TomTom
+5  A: 

What I would suggest is that your business object could raise an event to indicate that it needs to do a job when MustDoJob becomes true and you can subscribe to that event and store references to those objects in a simple list and then process the contents of that list when the DoAllJobs() method is called

RobV
Interesting idea, going to look deeper.
Toto
+1  A: 

Having established that the loop really is the problem (see TomTom's answer), I would maintain a list of the items on which MustDoJob is true -- e.g., when MustDoJob is set, add it to the list, and when you process and clear the flag, remove it from the list. (This might be done directly by the code manipulating the flag, or by raising an event when the flag changes; depends on what you need.) Then you loop through the list (which is only going to be 60-70% of the length), not the dictionary. The list might contain the object itself or just its key in the dictionary, although it will be more efficient if it holds the object itself as you avoid the dictionary lookup. It does depend on how frequently you're queuing 200k of them, and how time-critical the queuing vs. the execution is.

But again: Step 1 is make sure you're solving the right problem.

T.J. Crowder