tags:

views: 219

answers: 5

This is more of an academic question about performance than a realistic 'what should I use', but I'm curious: I don't dabble much in IL, so I haven't looked at what gets generated, and I don't have a large dataset on hand to profile against.

So which is faster:

List<MyObject> objs = SomeHowGetList();
List<string> strings = new List<string>();
foreach (MyObject o in objs)
{
    if (o.Field == "something")
        strings.Add(o.Field);
}

or:

List<MyObject> objs = SomeHowGetList();
List<string> strings = new List<string>();
string s;
foreach (MyObject o in objs)
{
    s = o.Field;
    if (s == "something")
        strings.Add(s);
}

Keep in mind that I don't really want to know the performance impact of the strings.Add(s) call (whatever operation needs to be done can't really be changed); just the performance difference between setting s each iteration (let's say s can be any primitive type or string) versus calling the getter on the object each iteration.

+6  A: 

Most of the time, your second snippet should be at least as fast as the first.

The two snippets are not functionally equivalent: properties are not guaranteed to return the same result across individual accesses. As a consequence, the JIT optimizer cannot cache the result (except in trivial cases), and caching the result of a long-running property yourself will be faster. See this example: why foreach is faster than a for loop when reading RichTextBox lines.
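A contrived sketch of why the JIT can't coalesce property reads (the `Sequence` type here is invented for illustration): a getter is an ordinary method call and may observe or produce new state on every access, so two reads are not interchangeable.

```csharp
using System;

class Sequence
{
    private int n;

    // Each access returns the next value: two reads are not
    // interchangeable, so the JIT cannot legally fold them into one.
    public int Next => n++;
}

class Program
{
    public static bool Run()
    {
        var seq = new Sequence();
        int a = seq.Next;       // 0
        int b = seq.Next;       // 1: a second read observes a new value
        int cached = seq.Next;  // 2: caching in a local pins what you saw
        return a != b && cached == 2;
    }

    static void Main() => Console.WriteLine(Run()); // True
}
```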

However, for some specific cases like:

for (int i = 0; i < myArray.Length; ++i)

where myArray is an array, the JIT compiler recognizes the pattern and omits the bounds checks. Caching the Length property can actually make it slower:

int len = myArray.Length;
for (int i = 0; i < len; ++i)
Mehrdad Afshari
Assuming that this compiles, the o.Field in both cases will be of type string. I'm also assuming that the value for o.Field, for each object in the collection, is set to some meaningful value. Maybe I don't quite understand what you mean; could you be more specific?
SnOrfus
Check out the link in my updated answer. It's a specific example where it makes significant difference if you cache the return value.
Mehrdad Afshari
+2  A: 

Storing the value in a field is the faster option.

Although a property getter call doesn't impose a huge overhead, it still costs more than storing the value once in a local variable on the stack and retrieving it from there.

I for one do it consistently.

Tor Haugen
+2  A: 

Generally the second one is faster, as the first re-evaluates the property on each iteration. Here is an example of a getter that could take a significant amount of time:

 var d = new DriveInfo("C:");
 var label = d.VolumeLabel; // fetches the volume label from the drive on every get
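A minimal sketch of hoisting such an expensive getter out of a loop. The `SlowConfig` type is made up as a stand-in for something like `DriveInfo` (so the example is self-contained and doesn't touch a real drive); the counter just demonstrates that the slow work runs once.

```csharp
using System;

class SlowConfig
{
    public int GetterCalls; // counts how often the expensive work runs

    // Hypothetical stand-in for an expensive getter such as
    // DriveInfo.VolumeLabel: every read repeats the slow lookup.
    public string Label
    {
        get { GetterCalls++; return "DATA"; }
    }
}

class Program
{
    public static int Run()
    {
        var cfg = new SlowConfig();

        string label = cfg.Label;      // fetch once, before the loop
        for (int i = 0; i < 100; i++)
        {
            if (label == "DATA") { }   // reuse the cached local
        }
        return cfg.GetterCalls;        // 1: the loop never re-queried
    }

    static void Main() => Console.WriteLine(Run()); // 1
}
```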
Piotr Czapla
+4  A: 

It really depends on the implementation. In most cases it is assumed (as a matter of common practice / courtesy) that a property is inexpensive. However, it could be that each "get" does a non-cached search over some remote resource. For standard, simple properties, you'll never notice a real difference between the two. In the worst case, fetch-once, store and re-use will be much faster.

I'd be tempted to use the getter twice until I know there is a problem... "premature optimisation", etc. But if I were using it in a tight loop, then I might store it in a variable. Except for Length on an array, which gets special JIT treatment ;-p

Marc Gravell
@Marc: Isn't the problem with the first code snippet that `o.Field` could actually change value between testing it against "something" and adding it to the `List`? `o.Field == "something"` might evaluate true, but by the time you call `strings.Add` you're adding "something else"?
Grant Wagner
@Grant - oh absolutely it could, but again it would be... non-standard - or at least, should be well documented. If it is because of threading then we've only ourselves to blame, of course.
Marc Gravell
@Marc: I don't say it's a premature optimization, especially when you're dealing with stupid non-O(1) properties (a lot of them exist in WinForms) like the one I linked in my answer. Also, in multithreaded scenarios, you might prefer to keep results for the sake of correctness.
Mehrdad Afshari
@Grant, that would be very bad practice by the author of the property, unless it's a thread-safe, shared resource. In any practical case, you'll know when to suspect such behavior.
Tor Haugen
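A contrived sketch of the non-equivalence discussed in the comments above (`Flaky` is invented; a real property behaving this way would be bad practice, as the commenters note):

```csharp
using System;

class Flaky
{
    private int reads;

    // Contrived: the value changes between reads.
    public string Field => (reads++ == 0) ? "something" : "something else";
}

class Program
{
    public static string Run()
    {
        var result = "";

        // Snippet 1's shape: test one read, then add a *second* read.
        var o = new Flaky();
        if (o.Field == "something")
            result += o.Field;    // appends "something else"!

        // Snippet 2's shape: read once, test and add the same value.
        var o2 = new Flaky();
        var s = o2.Field;
        if (s == "something")
            result += "|" + s;    // appends "|something"

        return result;
    }

    static void Main() => Console.WriteLine(Run()); // something else|something
}
```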
+8  A: 

Your second option is noticeably faster in my tests. I'm such a flip flopper! Seriously though, some comments were made about the code in my original test. Here's the updated code, which shows option 2 being faster.

    class Foo
    {
        public string Bar { get; set; }

        public static List<Foo> FooMeUp()
        {
            var foos = new List<Foo>();

            for (int i = 0; i < 10000000; i++)
            {
                foos.Add(new Foo() { Bar = (i % 2 == 0) ? "something" : i.ToString() });
            }

            return foos;
        }
    }

    static void Main(string[] args)
    {

        var foos = Foo.FooMeUp();
        var strings = new List<string>();

        Stopwatch sw = Stopwatch.StartNew();

        foreach (Foo o in foos)
        {
            if (o.Bar == "something")
            {
                strings.Add(o.Bar);
            }
        }

        sw.Stop();
        Console.WriteLine("It took {0}", sw.ElapsedMilliseconds);

        strings.Clear();
        sw = Stopwatch.StartNew();

        foreach (Foo o in foos)
        {
            var s = o.Bar;
            if (s == "something")
            {
                strings.Add(s);
            }
        }

        sw.Stop();
        Console.WriteLine("It took {0}", sw.ElapsedMilliseconds);
        Console.ReadLine();
    }
Andy Gaskell
Hey, +1 for writing a test! I'd give you +2 if I could..
Tor Haugen
Andy: Which runtime/platform did you test on? Also, is it a release build?
Mehrdad Afshari
For me, the result is: 2294, 702 which contradicts your conclusion.
Mehrdad Afshari
I just ran it a few times locally - debug/3.5/Win7 64. The gap is usually 10% but sometimes goes up close to 100%. I'll try release.
Andy Gaskell
I reworked the code a bit to show some actual meaningful results by wrapping the entire thing in a larger for loop and running the tests 1000 times. Also, there's no need to create a brand new list every single time; just clear the existing list (this also made it use ~250MB less RAM and kept RAM use constant). This probably has the added benefit of getting rid of the list resizing. I also switched the output to only display an average at the very end of the runs. This was Debug/3.5/Vista 64. I'll post the results once it finishes up.
chsh
The average of all runs was 311 for the first item and 306 for the second. The maximum for the first was 435, the second was 419. I didn't bother capturing the minimum. I think the wild results you were seeing were caused by the fact that the new List<string>() takes place after the stopwatch has started, so the allocation of space for the list is the real culprit in terms of inconsistency.
chsh
The results I mentioned were from Mono/OS X on my Mac Book Air. On Win7/64/Release I don't see any noticeable difference (I got rid of the list and called a dummy function that incremented an int instead.)
Mehrdad Afshari
@chsh: You're right - I should have taken the string stuff outside of the StopWatch. Calling Clear instead of newing up a List made a huge difference. I'll update the code.
Andy Gaskell
For what it's worth, I ran one under release and wound up with: A average: 286, A max: 510; B average: 268, B max: 481. I'm guessing the maximums are so much higher because the first time through, the list will have to size itself up.
chsh
Andy: I'd still say it's hardly *noticeably faster*. For most trivial properties, the JIT compiler will be able to generate pretty much identical code. It's hardly noticeable (at least on release builds.)
Mehrdad Afshari