I have a question for you about writing C# code. I had a long debate with a colleague of mine: he says that declaring and initializing all variables at the beginning of a function makes the code more readable and faster.

Please give me at least one sane reason why he could be right.

1. It's not more readable: in a long function you end up with a huge block of declarations followed by a block of initializations, so it is much easier to declare each variable where you need it, right in front of you.

2. It's definitely not faster, as you allocate unnecessary memory and grow the function's stack.

My pseudo code example:

public void A(double dParam)
{
    if(... condition ... ) {
        double dAnotherParam;
        string sParam;
        ...
        // use local scope vars here
    }
}

His pseudo code example:

public void A(double dParam)
{
    double dAnotherParam;
    string sParam;

    dAnotherParam = 0;
    sParam = null;

    if(... condition ... ) {
        ...
    }
}

Any ideas on the topic?

Thank you in advance.

+1  A: 

Declare variables in the scope where you intend to use them.

Nico
I know :) Can you give me another reason than mine?
Tigran
@Tigran, the further out you scope the variables, the more likely those variables will be accidentally misused elsewhere in your code.
Kirk Woll
+7  A: 

It's quite simple and straightforward. This rule is what I use in my code all day every day.

Declare variables at the lowest scope you can get away with.

If you're only using a variable inside of a method, declare in the method, no need to place it at the class level.

Edit: You seem to be asking about scope variables in methods specifically.

In that case think about it this way: if your usage of a variable depends on whether or not a condition is true, why declare it before you even pass the condition? You'd be wasting resources by doing so.

Serg
Maybe I didn't explain it well. I'm not talking about class scope here, I'm talking about method scope: declaring all the variables you use in a method at the beginning of the method itself.
Tigran
Oh, that wasn't quite clear from the question. In that case yes, declaring them at the top of the method seems like the normal thing to do. Then again, how would that influence aesthetics? If you try to make your methods smaller and smaller, you won't run into these types of problems.
Serg
That's a rule, not a reason...
Bruno Brant
Edited my answer with a reason! :D
Serg
What about with loops instead of if-statements? I often declare variables outside of a loop scope if I intend to reuse the variable often in the loop, but I'm not sure if this actually reuses the memory location or if it really has no effect. Like if I had a for-loop that added objects to a list, I might use the same variable to hold each object while it was created during the loop.
CodexArcanum
@CodexArcanum: I think since strings are immutable, you're just creating new variables either way.
Serg
@Sergio With strings or objects I'm almost certainly just making new junk on the heap, but because I'm reusing the address location it gets opened up to GC faster. With value-types, I'd actually be reusing the stack memory location so I'd think that's a small efficiency boost. Maybe I should ask/search-for a new question?
CodexArcanum
@CodexArcanum: Go for it! But share the link here so I can learn as well.
Serg
As Pieter points out, the compiler will optimize away this sort of thing, so there is no performance difference based on where you declare a variable. The data may be stored in a CPU register, or in a memory location, but this will be based on what is done with the variable, and not where it is declared.
StriplingWarrior
@Sergio: It's not true that variables declared on closer scopes will make a difference. In fact (as you can see from my answer below), even if you declare a variable inside an IF scope, the compiler will allocate it at the beginning of the method.
Bruno Brant
@Bruno: There might not be a performance issue, but there is a problem with maintainability.
Serg
Ok, I looked around and that has been asked before. Here's two questions on it: http://stackoverflow.com/questions/1884906/ and http://stackoverflow.com/questions/3164324/ The first link answers it pretty well: nope, no difference. The compiler will, in most cases, create the same IL for variable in the loop or out. So it's just a style issue and general consensus seems to be that putting the variable inside the loop is clearer.
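
One case that falls outside "most cases" is a variable captured by a lambda. The sketch below is only an illustration with made-up names (LoopScopeDemo, CapturedOutside, CapturedInside): when the variable is declared outside the loop all lambdas share it, while a variable declared inside the loop body is a fresh one on every iteration.

```csharp
using System;
using System.Collections.Generic;

public static class LoopScopeDemo
{
    // Variable declared outside the loop: every lambda captures the same variable.
    public static List<int> CapturedOutside()
    {
        var getters = new List<Func<int>>();
        int value = 0;
        for (int i = 0; i < 3; i++)
        {
            value = i * 10;
            getters.Add(() => value);
        }

        var results = new List<int>();
        foreach (Func<int> getter in getters)
        {
            results.Add(getter());
        }
        return results; // 20, 20, 20
    }

    // Variable declared inside the loop: each iteration gets its own variable.
    public static List<int> CapturedInside()
    {
        var getters = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int value = i * 10;
            getters.Add(() => value);
        }

        var results = new List<int>();
        foreach (Func<int> getter in getters)
        {
            results.Add(getter());
        }
        return results; // 0, 10, 20
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", CapturedOutside())); // prints 20,20,20
        Console.WriteLine(string.Join(",", CapturedInside()));  // prints 0,10,20
    }
}
```

Without a lambda in play, both placements behave the same, which matches the answers linked above.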
CodexArcanum
+3  A: 

In terms of the code running faster with all the variables declared at the top of the routine, I would argue that the opposite is the case, if there is any difference at all. I am no expert on this, but I assume that garbage collection will be more efficient by declaring variables at the lowest scope you can. That way cleanup can occur sooner.

Additionally, by putting variables close to the code that uses them, it improves readability IMO.

An analogy is that it is the difference between having margin notes (put right beside the text you are reading), vs. a wall of footnotes at the bottom of the page (you are not sure what footnote relates to what text without doing some cross-referencing yourself).

RedFilter
It's not a GC issue, as much as a lack of creation issue. Value types don't affect the GC here, for example, but they do add perf. overhead from being created on the stack.
Reed Copsey
@Reed: Thanks for the clarification.
RedFilter
Pieter correctly points out that there is no difference in performance. The compiled code reflects how the variable is used, rather than where it is declared. The stack will provide enough space to accommodate the maximum number of values that it might need to store simultaneously in that function. The compiler is smart enough to figure out when you stop using one variable and start using another.
StriplingWarrior
@StriplingWarrior: That makes perfect sense, thanks for the comment.
RedFilter
+3  A: 

Performance doesn't matter because the compiler will take care of this. The compiler doesn't care whether you declare them at the top of the method or just before you use them.

However, having 20 declarations at the top instead of declaring the variable when you use it doesn't make the code more readable.

I would absolutely go for the first option.

Pieter
Hmm, are you sure about performance? I mean, this is "pseudo code". But we have structure declarations at the beginning of the function. This, if I'm not mistaken, leads to memory allocation and a push onto the call stack. Doesn't it?
Tigran
You have no idea how much the compiler moves your code around if it thinks it can improve performance :). Yes, there are limitations, but the compiler does take care of a lot of this stuff. Besides that, you should try not to compromise the readability of your code just because you think you can improve performance that way. If you are worried about performance, run a few tests: write both versions of the function and measure which is faster.
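
A rough way to run such a test, as a sketch only: the two methods below mirror the two styles from the question, and all names (TimingDemo, DeclaredAtTop, DeclaredAtUse, Measure) are invented for this illustration. A single Stopwatch run is noisy, so treat the numbers as a hint rather than proof.

```csharp
using System;
using System.Diagnostics;

public static class TimingDemo
{
    // Colleague's style: locals declared and initialized at the top.
    public static double DeclaredAtTop(int n)
    {
        double scaled;
        double total;

        scaled = 0;
        total = 0;

        for (int i = 0; i < n; i++)
        {
            if (i % 2 == 0)
            {
                scaled = i * 0.5;
                total += scaled;
            }
        }
        return total;
    }

    // Asker's style: the local is declared inside the branch that uses it.
    public static double DeclaredAtUse(int n)
    {
        double total = 0;
        for (int i = 0; i < n; i++)
        {
            if (i % 2 == 0)
            {
                double scaled = i * 0.5;
                total += scaled;
            }
        }
        return total;
    }

    // Times a single call; a serious benchmark would warm up and repeat.
    public static TimeSpan Measure(Func<int, double> method, int n)
    {
        Stopwatch watch = Stopwatch.StartNew();
        method(n);
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        const int n = 10000000;
        Console.WriteLine("top of method:   " + Measure(DeclaredAtTop, n));
        Console.WriteLine("at point of use: " + Measure(DeclaredAtUse, n));
    }
}
```

Build in Release mode and run it several times before drawing any conclusion from the timings.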
Pieter
+1 because you're right about the performance (factually) as well as readability (in my opinion). Stack frames don't change size after the method's been entered.
Rob Fonseca-Ensor
I think, at least in some implementations, there are performance implications for this, at least when you are using multiple anonymous methods. At least, that's my interpretation of http://stackoverflow.com/questions/3885106/discrete-anonymous-methods-sharing-a-class/3885161#3885161 . Though the case I am imagining probably implies that moving variables into a lower scope will improve performance. This is something one could play with by examining the IL. Start with readability, though, until you fail performance metrics and this turns out to be the bottleneck.
Brian
+3  A: 

Declaring variables at the top of a method is very old school and is a holdover from C. I was taught this exact same thing at university.

Nowadays we try to keep methods very small and lean, with meaningful parameter and method names. It is much cleaner to declare and initialize a variable inside the scope in which it is going to be used. If there are many such areas of code within a given method, it may be an indication that you should refactor it into smaller, easier-to-read methods.

The only time I would consider declaring up front is when you want to initialize something inside a try/catch/finally statement and then do something with the object within the catch or finally block.

SomeObject someObject = null;

try
{
    someObject = GetSomeObject();
    return someObject.DoSomething();
}
finally
{
    if (someObject != null) someObject.DoSomethingElse();
}

Most of the time all you want to do is clean up the object, so a using statement is more than adequate, but there are times when the pattern above is needed.
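
For the common clean-up case, a using statement keeps the declaration inside the block and calls Dispose automatically. This is a minimal sketch; StringReader stands in for whatever disposable object you actually have, and UsingDemo and ReadFirstLine are made-up names.

```csharp
using System;
using System.IO;

public static class UsingDemo
{
    // The reader is declared in the using statement itself, so it is scoped
    // to the block and disposed even if ReadLine throws.
    public static string ReadFirstLine(string text)
    {
        using (var reader = new StringReader(text))
        {
            return reader.ReadLine();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadFirstLine("first\nsecond")); // prints first
    }
}
```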

Bronumski
+1  A: 

To add another point, you may not always want to initialise variables that are declared at class level. Sometimes you may want these things to be null and you actively test for that.

Also, you mention you have very long methods... methods should be functional: serve one particular purpose and then return to the program flow. So long methods can often be refactored into smaller, more manageable code.

jimplode
+24  A: 

It sounds like your colleague is an old C programmer. It used to be that you had to place all of your variables at the beginning of the scope, and people got used to doing it that way.

In C#, it's better to place them at the scope where they are needed. This has a few benefits, including:

  • You reduce the risk of error from reusing a variable inappropriately, especially during long term maintenance
  • You are keeping the variable constrained within that scope, which eases refactoring

The last point is, in my opinion, the most important. Good maintainability and testability rely on the ability to keep your methods small - and keeping variables declared as they're used makes extracting a method much, much simpler. Not only do the tools become more effective, the resulting automatic refactorings tend to be simpler (as the variable isn't passed in by ref, etc).


As for your example code - I personally would do it differently than either of you. You're still declaring your variable at the beginning of the inner scope. I wouldn't declare them until they are actually used:

public void A(double dParam)
{
    if(... condition ... ) {
        double dAnotherParam = GetValueFromMethod();

        // use local scope vars here

        // Later, as needed:
        string sParam = dAnotherParam.ToString();
    }
}

Declaring these at the top of the scope (even an inner scope), again, reduces the ability of automatic refactoring tools to successfully extract methods as needed.
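
To make the refactoring point concrete, here is a hand-written sketch of the two shapes an extracted method can end up with. The names (ExtractDemo, FormatInto, Format) are hypothetical, and what a particular refactoring tool actually generates will vary.

```csharp
using System;
using System.Globalization;

public static class ExtractDemo
{
    // Shape you tend to get when the variable lives far above the block:
    // the extracted method has to write back into the caller's variable.
    public static void FormatInto(double input, ref string text)
    {
        text = (input * 2).ToString(CultureInfo.InvariantCulture);
    }

    // Shape you get when declaration and first use sit together:
    // the extracted method simply returns the value.
    public static string Format(double input)
    {
        return (input * 2).ToString(CultureInfo.InvariantCulture);
    }

    public static string CallerWithTopDeclaration(double input)
    {
        string text = null;          // declared up front, assigned later
        FormatInto(input, ref text);
        return text;
    }

    public static string CallerWithLocalDeclaration(double input)
    {
        string text = Format(input); // declared where it is first needed
        return text;
    }

    public static void Main()
    {
        Console.WriteLine(CallerWithTopDeclaration(1.5));   // prints 3
        Console.WriteLine(CallerWithLocalDeclaration(1.5)); // prints 3
    }
}
```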

Reed Copsey
Like this comment, thank you.
Tigran
@Tigran: I edited to add more details - I would take it even further than your code did...
Reed Copsey
I mean, I know that declaring variables at the beginning of the method is C style, but correct me if I'm mistaken: this is not a "best practices guideline", but something you had to do because the C compiler forced you to do it that way. Isn't it?
Tigran
@Tigran: Yes. It was part of the language specification. However, many C developers learned it as "gospel", and have held onto the (mistaken) belief that it's advantageous, not just a requirement in an older, less flexible language.
Reed Copsey
+1: good response
RedFilter
@Reed: Yeah, sure, I agree with you. That's actually the usual way I code, I just didn't put too many details in the pseudo code. Good comment! Thanks!
Tigran
@Tigran: At least you can point your colleague here to this question - I don't think there's a single person on "his side" ;)
Reed Copsey
I'd give you +1 except that the first bullet point is not correct. At runtime, the stack and behavior of the method will look exactly the same regardless of where you declare your variable. Variables are just a conceptual placeholder to help us programmers. They don't actually mean anything in the compiled code. They aren't "created" in the sense you use it here.
StriplingWarrior
@StriplingWarrior: True. I was thinking more in terms of initializing the variable prematurely, which depending on the type, could have an effect (in particular, if it's a reference, and you construct some object based on it). However, I removed that entirely, as you are completely correct.
Reed Copsey
@StriplingWarrior +1 for pointing out that there is no performance difference between declaring all variables at the top of method or later.
BurningIce
That is a good point: by declaring the variable sooner, you risk initializing it sooner as well, which is something the compiler probably won't dare try to optimize away. At any rate, I think this is the best answer now.
StriplingWarrior
A: 

In my opinion, the declaration at the beginning of the function is way less readable than the declaration at the lowest scope possible. You should pay attention that your code is readable as this is much more important than this micro speed optimization.

Marius Schulz
+1  A: 

Re: readability, you're quite right. And also consider that declaring variables at a higher scope loses any information about where they're relevant to. A user looking at it needs to read the entire method to see where a variable is used. Furthermore, there's more risk of mistakenly using the wrong variable.

Andy
A: 
  1. It's not more readable: in a long function you end up with a huge block of declarations followed by a block of initializations

This just indicates that you need to refactor the code. In fact, it's good that forward declarations force you to think this stuff through.

It's definitely not faster, as you allocate unnecessary memory and grow the function's stack.

Ideally, you don't. If some branch of your code needs to allocate objects which are orthogonal to the rest of the function, it's better to offload this branch into a separate function.

This way you keep to the "declare where you use it" principle, too.

himself
A: 

It's not more readable: in a long function you end up with a huge block of declarations followed by a block of initializations, so it is much easier to declare each variable where you need it, right in front of you.

I don't understand how that would be more readable either. It moves the declaration away from the section of code that intends to use it, resulting in the loss of proximity clues.

It's definitely not faster, as you allocate unnecessary memory and grow the function's stack.

The argument that it is faster is also wrong. It does not matter where in the method a variable is declared; it still takes the same amount of space on the stack. So there are no fundamental differences in performance that I am aware of [1]. However, you could make the argument that declarations with inline initializations would consume extra CPU cycles if the initialization turns out to be unnecessary. For example, you might declare and initialize a variable inside an if block, in which case the initialization occurs only if execution enters that block. Contrast that with what would happen if you did the declaration and initialization at the top of the method.

[1] I suppose there could be some weird consequences of method inlining or the like that could affect this, but I doubt it.
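
The initialization cost can be made visible with a small sketch. Expensive is a made-up class that counts its constructor calls, and InitDemo and its methods are likewise invented for this example.

```csharp
using System;

public sealed class Expensive
{
    public static int Created; // number of constructor calls so far

    public Expensive()
    {
        Created++;
    }
}

public static class InitDemo
{
    // Declared and initialized at the top: the constructor runs on every call.
    public static void TopOfMethod(bool condition)
    {
        Expensive item = new Expensive();
        if (condition)
        {
            GC.KeepAlive(item); // stand-in for real use of the object
        }
    }

    // Declared and initialized inside the branch: the constructor runs
    // only when the branch is taken.
    public static void InsideBranch(bool condition)
    {
        if (condition)
        {
            Expensive item = new Expensive();
            GC.KeepAlive(item);
        }
    }

    // Calls the method once with true and twice with false,
    // then reports how many objects were constructed.
    public static int CountFor(Action<bool> method)
    {
        Expensive.Created = 0;
        method(true);
        method(false);
        method(false);
        return Expensive.Created;
    }

    public static void Main()
    {
        Console.WriteLine(CountFor(TopOfMethod));  // prints 3
        Console.WriteLine(CountFor(InsideBranch)); // prints 1
    }
}
```

The stack slot for item exists in both versions; only the number of heap allocations and constructor calls differs.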

Brian Gideon
OK, what I mean is: if you declare the variable in a conditional scope within the function, its memory will be allocated only if execution actually enters that condition. If you always declare ALL variables at the top of the function, they will always all be allocated, whatever the condition is. I think that has a performance impact, even if stack allocation in .NET is amazingly fast.
Tigran
@Tigran: No, see Bruno's answer. The variable itself will always have a spot on the stack no matter where it is declared. Its the initialization that matters. So if the variable happen to be assigned a reference to a new object then space for the reference always exists, but the space for the object goes on the heap and only occurs if the initialization occurs. Does that make sense?
Brian Gideon
@Brian. Yeah, it makes perfect sense to me and I've already commented on it, but I had one doubt while writing. I noticed that the code used by @Bruno used first-class citizens; I don't know if user-defined objects get the same treatment from the compiler. Will verify and respond...
Tigran
@Tigran: I'm not sure about that either. I also wonder how some of the JIT optimizations might affect this. I specifically mentioned how method inlining might play into this as well.
Brian Gideon
@Brian. I put a comment on @Bruno's post after dumping to IL some code that looks like his, but using a class and a struct.
Tigran
@Tigran: Yeah, sorry I didn't fully comprehend what you were talking about earlier. You make a good point about value types. Specifically, that they always have a default value. When does that default value get applied? Before the method executes? Or at the point of declaration in the C# code?
Brian Gideon
@Brian. After looking at the IL, it seems to me that a struct used anywhere in the function, in any scope, is allocated and initialized anyway. Value types are allocated on the stack, and when a function is called it must know how much stack space it needs. So the only way to do that is to scan the function's code, find all first-class citizens and reference types and allocate 4 bytes for each of them, then find all value types and allocate and initialize them. That way you are sure to have exactly the space needed to execute the procedure, no less and no more. That's the only explanation I found.
Tigran
A: 

A rule of thumb I have always followed is to populate variables as close to their declaration as possible. This means: don't declare a variable on line 10 and then give it an initial value on line 100.

kyndigs
+7  A: 

This is a matter of style. My considerations:

  1. If the code block is small, in the end it won't matter where you place the declaration. However, it's preferable to have declarations nearer use, mainly in large code blocks. First, this way you can make sure that the variable wasn't modified until that point in the block, and second, because you can check the type and initialization of the variable faster.

  2. Besides, it's faster to code by declaring variables near where you use them: you don't have to scroll back to the beginning of the scope just to declare a forgotten variable and then move back.

  3. This isn't faster or slower. Variables, no matter where declared in the scope, will have their allocation and initialization centered at the beginning of the scope. This can be checked by generating IL for the language.

For example, the code:

static void Main(string[] args)
{
    var a = 3;

    if (a > 1)
    {
        int b = 2;
        a += b;
    }

    var c = 10;
    Console.WriteLine(a + c);
}

Generates the following IL:

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       24 (0x18)
  .maxstack  2
  .locals init ([0] int32 a,     // Declare a
           [1] int32 b,          // Declare b
           [2] int32 c)          // Declare c
  IL_0000:  ldc.i4.3             
  IL_0001:  stloc.0              
  IL_0002:  ldloc.0              
  IL_0003:  ldc.i4.1             
  IL_0004:  ble.s      IL_000c
  IL_0006:  ldc.i4.2
  IL_0007:  stloc.1
  IL_0008:  ldloc.0
  IL_0009:  ldloc.1
  IL_000a:  add
  IL_000b:  stloc.0
  IL_000c:  ldc.i4.s   10
  IL_000e:  stloc.2
  IL_000f:  ldloc.0
  IL_0010:  ldloc.2
  IL_0011:  add
  IL_0012:  call       void [mscorlib]System.Console::WriteLine(int32)
  IL_0017:  ret
} // end of method Program::Main

One important thing to notice is that variables in inner scopes, such as the variable b in the if scope, are declared at method scope by the compiler.

Hope it helps.

Bruno Brant
Thanks, good explanation. So it seems to me that memory is allocated anyway for all the variables present in the function, whether they end up being used or not. What makes the difference, instead, is their initialization. Thank you!
Tigran
After analyzing an IL dump of code very similar to yours, except that I use class and struct objects, the result is more or less the same. For first-class citizens a predefined amount of space is allocated, for reference types a pointer (4 bytes), and for value types (all of them, even those declared in an inner scope of the function) memory is allocated and initialized. (I have some doubt about the initialization: where does a struct declared in the function actually get initialized? Where we declare it, or where we first use it?)
Tigran
@Tigran: Local variables are never automatically initialized. The only time a local variable is assigned a value is when you explicitly do so. @Bruno: +1 for noting that no matter where variables are declared, storage for them is always allocated on loading the stack frame.
Joren
@Joren. Got it, thank you !
Tigran
@Joren: thanks. I saw it by accident on one article here at SO.
Bruno Brant
@Tigran, one thing: the compiler may optimize away variables that aren't used in that scope. The code above was compiled using Debug settings, which means no optimization. Still, you can tell your friend that it doesn't really matter where he declares variables, at least in compiler/performance terms.
Bruno Brant
A: 

First, it's not recommended to have big functions. Declaring a variable in the scope where you want to use it is better, because it requires no extra initialization and shorter assembly jumps, so the code runs faster.

SaeedAlg
A: 

Your colleague's point is from old school programming. Not that it is bad, but this comes from assembly language.

At the beginning of an assembly program, you had to declare the data segment (addressed through the DS register), which reserved memory for your variables. Here is a code sample:

.MODEL SMALL

.DATA
    myVariable1 DB "This is my first variable"
    myVariable2 DB "Second var"

.CODE ; code segment

ProgramStart:

    MOV AX,01h
    ...

ProgramEnd:

As the others have already said, in C programs the variable declarations had to be at the beginning of a function. In the same way, you needed to write your function declarations at the beginning of your C program, before the main() function, and then write their bodies after main; otherwise your function (or variable) was not recognized. So yes, this was faster because the variable address was already known by the generated inline code.

However, with the coming of OOP and event-driven programming, we tend to stay as close to the event as possible. In short, I agree with your point of view. Today's compilers parse the whole code for such variables and turn it into manageable CLR code. So this no longer makes a difference in terms of performance. In today's programming, we tend to make the code as easy to read and understand as possible, and that includes variable declarations.

Declaring a variable at the beginning of a function and using it only 20 lines further down is of no use, and it tends to make the program harder to understand. For example, you're analyzing a piece of code, then hop! A variable name appears, with no declaration around. Where does this variable come from, you ask? You will need to scroll up (or go to the definition) to find out what this variable is. Meanwhile, you may have lost track of the purpose of the routine you were actually reading and trying to understand. Do that three or more times, and you're lost trying to understand the code.

It's better to have a lean and swift view of the purpose of the code. That is, don't declare any variable before it is really indispensable. Writing this way spares you from constantly scrolling up and down while reading or writing the code of a program.

This is my two cents. I hope to be clear enough in what I have been trying to explain.

I hope this helps anyway!

Will Marcouiller
Thank you, understood your point. Good explanation.
Tigran
A: 

Pascal was one of the later languages I learnt, where everything had to be declared up-front as per your buddy's example. Forgive us, earlier languages didn't allow variables with close scope à la .NET, so "old habits die hard".

Performance is no longer an issue; the cost in programming is manpower. Compilers have become cleverer than most of us, so the real issue is avoiding errors. (Forgive me for illustrating in VB, I simply make fewer typos.)

Compare:

Dim X as integer = 0
' 100 lines of drivel, where I'm forgetting about X
For X = 1 To 10
  If pornographic(X) then ' Pron integers, geek alert!
    Viagra(X)
  End if
Next
' 50 lines of drivel, X lurks with his raincoat

If X >= 10 Then ' Compiles. X=11, if you're damn lucky
  ' make a nasty mistake
End if

Don't laugh, it happens every day.

With:

' 100 lines of drivel, nothing lurid in sight
For X As Integer = 1 To 10
  If pornographic(X) then ' Pron integers, geek alert!
    Viagra(X)
  End if
Next
' 50 lines of drivel, X is no more

If X >= 10 Then ' *** FAILS AT COMPILE-TIME ***
  ' *** CAN'T *** make a nasty mistake
End if

Easy answer, right?
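
The same contrast can be sketched in C#, the language of the question. The class and method names below are made up for the illustration.

```csharp
using System;

public static class LoopVariableDemo
{
    // Counter declared at the top of the method: it is still in scope
    // after the loop, holding whatever value the loop left behind.
    public static int DeclaredAtTop()
    {
        int x = 0;
        for (x = 1; x <= 10; x++)
        {
            // loop body omitted
        }
        return x; // 11: easy to misuse further down the method
    }

    // Counter declared in the for statement: it does not exist after the loop.
    public static int DeclaredInLoop()
    {
        int sum = 0;
        for (int x = 1; x <= 10; x++)
        {
            sum += x;
        }
        // Referring to x here would be compile error CS0103
        // ("The name 'x' does not exist in the current context").
        return sum; // 55
    }

    public static void Main()
    {
        Console.WriteLine(DeclaredAtTop());  // prints 11
        Console.WriteLine(DeclaredInLoop()); // prints 55
    }
}
```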

Maurice

P.S. I forgot to declare: Public Protected Viagra(clipno As Film) ...

smirkingman