views:

369

answers:

11

In the following examples:

  • the first seems more verbose but less wasteful of resources
  • the second is less verbose but more wasteful of resources (redefines string each loop)

Which is better coding practice?

First example:

using System;
using System.Collections.Generic;

namespace TestForeach23434
{
    class Program
    {
        static void Main(string[] args)
        {
            List<string> names = new List<string> { "one", "two", "two", "three", "four", "four" };

            string test1 = "";
            string test2 = "";
            string test3 = "";
            foreach (var name in names)
            {
                test1 = name + "1";
                test2 = name + "2";
                test3 = name + "3";
                Console.WriteLine("{0}, {1}, {2}", test1, test2, test3);
            }
            Console.ReadLine();
        }
    }
}

Second example:

using System;
using System.Collections.Generic;

namespace TestForeach23434
{
    class Program
    {
        static void Main(string[] args)
        {
            List<string> names = new List<string> { "one", "two", "two", "three", "four", "four" };

            foreach (var name in names)
            {
                string test1 = name + "1";
                string test2 = name + "2";
                string test3 = name + "3";
                Console.WriteLine("{0}, {1}, {2}", test1, test2, test3);
            }
            Console.ReadLine();
        }
    }
}
+2  A: 

Those are both wasteful and verbose.

foreach (var name in names)
{
   Console.WriteLine("{0}1, {0}2, {0}3", name);
}

.

</tongueincheek>
womp
Actually, `Console.WriteLine("{0}1, {0}2, {0}3", name)` would be my preference.
Michael Myers
+1 for that...!
womp
+21  A: 

The second form is no more wasteful - it's simply better.

There's no advantage to declaring the variables outside the loop, unless you want to maintain their values between iterations.

(Note that usually this makes no behavioural difference, but that's not true if the variables are being captured by a lambda expression or anonymous method.)

Jon Skeet
but if my foreach loop interates a million times, don't I take a performance hit by defining 3 million strings instead of 3 strings?
Edward Tanguay
You are still creating three million strings because String is an immutable type. When you do string + "1" you are effectively creating a brand new String object and assigning it to something, not modifying the original.
Josh
You get the same hit either way. If performance is the issue, then use a StringBuilder which is mutable and is actually reused.
awhite
@Edwad Tanguay, behind the scene, space for a variable is only created once for the function, and reused for every iteration. Even more exact, the underlying layer has no notion of scoping and loops besides function and class scope. And string concatenation in a loop is never good, use StringBuilder as those above told.
Dykam
But more generally, Jon, for other object types, the addresses would be reused (in case B), correct? Even if the objects are modified within the loop.
harpo
@harpo: Yes, that's right.
Jon Skeet
A: 

I'm not sure what you gain by defining the string variable outside of the loop. Strings are immutable so they are not reused. Whenever you assign to them, a new instance is created.

awhite
Interned strings are re-used -- of course, the calculated strings in this example will not be interned by default, so I take your point.
Eric Lippert
Thank you for the clarification.
awhite
A: 

I think it depends on what you are trying to solve. I like the second example, because you can move the code in one step. I like the first example, because it is faster due to less stack manipulations, less memory fragmentation, and less object construction/creation.

AMissico
+8  A: 

Personally, I think it's the best practice to declare variables in the tightest scope possible, given their usage.

This provides many benefits:

  1. It's easier for refactoring, since extracting a method is simpler when the variables are already in the same scope.
  2. The variable usage is more clear, which will lead to more reliable code.

The only (potential) disadvantage would be the extra variable declaration - however, the JIT tends to optimize this issue away, so it's one I wouldn't necessarily worry about in real work.

The one exception to this:

If your variable is going to be adding a lot of GC pressure, and if this can be avoided by reusing the same object instance through the foreach/for loop, and if the GC pressure is causing measured performance problems, I'd lift it into the outer scope.

Reed Copsey
You had me right up to that last paragraph. Object lifetime is an entirely different question from variable lifetime.
Jeffrey L Whitledge
@Jeffrey: Yes - and I wasn't talking about his specific example here, where it makes no (technical) difference. However, I was trying to say there are times when moving a variable into an outer scope can be a valid means of changing the object lifetime, to prevent adding unneeded system pressure due to repeated allocations.
Reed Copsey
@Reed Copsey - If a variable initialization allocates a new object on the heap, then moving that outside a loop could create fewer objects, but that is not the same as moving the variable declaration, or changing the variable's scope. There is a related issue with regards to GC pressure and variable scope, but it has the opposite effect than what you describe. A variable declared outside the loop has an expanded scope that may increase an object's lifetime unless the static analysis notes that the variable is not used after the loop completes.
Jeffrey L Whitledge
@Jeffrey: That was specifically why I included, in that statement: "can be avoided by reusing the same object instance through ..." I was making a general statement, in regards to my initial statement where I said to "declare variables in the tightest scope possible" - sometimes, moving to a larger scope means you can reuse a variable without reinitializing it, which can prevent multiple allocations from occurring (at the cost of having the object lifetime lengthened).
Reed Copsey
A: 

I've found that 'hoisting' declarations out of loops is typically a better long-term strategy for maintenance. The compiler will usually sort things out acceptably for performance.

Paul Nathan
A: 

Follow a simple rule while declaring variables

Declare it when you need it for the first time

Anantha Kumaran
Unless you're using C, pre-C99.
Michael Myers
I have found that using that principle is fine until you have larger-scale software: having a centralized place to find your variable decls becomes useful in maintenance.
Paul Nathan
@Paul Nathan - If you are having trouble finding where a variable is declared in a method, then your method is way, way too long!
Jeffrey L Whitledge
@Jeffrey: I think Paul's point stands when you're talking about private member variables in a class.
Jacob G
+1 I agree with you Anatha and Jeffrey.
Chris Marisic
@Jeffery: Probably, but we're not always given the choice to trash code handed to us and rewrite. IMO, better to push variables to the top of the routine as a standard.
Paul Nathan
@Paul I don't see how that offers any value, IMO that actually lowers debuggability because you might have to watch the variable from line 1 of the method instead just 3 lines before the end of the method. And I feel sorry for you about not being able to fix bad code I worked at a position like that I'm glad I'm in charge now at my current one and I always subscribe to fix things when you touch them.
Chris Marisic
@Chirs Marisic - Yes, fix it when you touch it! I used to add functionality to a piece of legacy code in which the method `OrderPost` was over 4000 lines long. I usually figured I could spend a week making and debugging the change, or I could spend three days refactoring and an hour making the change. It was an easy choice.
Jeffrey L Whitledge
@Chris: Well, it was a really gnarly research program with math I didn't fully understand, etc, etc. Simpler to decl at the top. :)
Paul Nathan
Sorry but in my opinion that is not a good rule to follow.Because when the thing you are building gets just a little complicated, you will, get confused.
Moulde
+1  A: 

Depending on language and compiler it may or may not be the same. For C# I expect the resulting code to be very similar.

My own philosophy on this is simple:

Optimize for ease of understanding.

Anything else is premature optimization! The biggest bottleneck in most development is time and attention of the developer. If you absolutely must squeeze out every last CPU cycle then by all means do so, but unless you have a compelling business need to do or are writing a critical component (common library, operating system kernel, etc) you are better off waiting until you can benchmark the finished program. At that time optimizing a few of the most costly routines is justified, before then it's almost certainly a waste of time.

Sorpigal
I agree with your statement, but when dealing with a loop you much consider the performance impact of your code. There is another question here, where re-initializing several variables in nested loop is taking thirty seconds.
AMissico
While I agree with the general point, I'd argue that optimizing for correctness is right up there with ease of understanding. So I'd ask "which one is easier to read" and "which one is less likely to cause weird side effects when modified in the future" well before "which is faster".
kyoryu
A: 

They are both virtually the same as far as performance goes (strings are immutable), but as for readability... I'd say neither is very good. You could easily just do it all within the Console.WriteLine.

Perhaps you can post the real problem instead of an example?

Doug Stalter
A: 

I generally declare variables as close to their use as scoping allows, which in this case would be your second example. Resharper tends to encourage this style as well.

kekekela
A: 

For POD type data declare closest to first use. For anything like a class that does any memory allocation, then you should consider declaring those outside of any loops. Strings will almost certainly do some form of allocation and most implementations (at least in C++) will attempt to reuse memory if possible. Heap based allocations can be very slow indeed.

I once profiled a bit of C++ code that included a class that new'd data in its ctor. With the variable declared outside of the loop it ran 17% faster than with the variable declared inside the loop. YMMV in C# so profile performance you may be very surprised at the results.

lilburne