views:

162

answers:

10

This isn't meant to be subjective, I am looking for reasons based on resource utilisation, compiler performance, GC performance etc. rather than elegance. Oh, and the position of brackets doesn't count, so no stylistic comments please.

Take the following loop;

Integer total = new Integer(0);
Integer i;
for (String str : string_list)
{
    i = Integer.parse(str);
    total += i;
}

versus...

Integer total = 0;
for (String str : string_list)
{
    Integer i = Integer.parse(str);
    total += i;
}

In the first one i is function scoped whereas in the second it is scoped in the loop. I have always thought (believed) that the first one would be more efficient because it just references an existing variable already allocated on the stack, whereas the second one would be pushing and popping i each iteration of the loop.

There are quite a lot of other cases where I tend to scope variables more broadly than perhaps necessary so I thought I would ask here to clear up a gap in my knowledge. Also notice that assignment of the variable on initialisation either involving the new operator or not. Do any of these sorts of semi-stylistic semi-optimisations make any difference at all?

+7  A: 

The second one is what I would prefer. There is no functional difference other than the scoping.

Setting the same variable in each iteration makes no difference because Integer is an immutable class. Now, if you were modifying an object instead of creating a new one each time, then there would be a difference.

And as a side note, in this code you should be using int and Integer.parseInt() rather than Integer and Integer.parse(). You're introducing quite a bit of unnecessary boxing and unboxing.


Edit: It's been a while since I mucked around in bytecode, so I thought I'd get my hands dirty again.

Here's the test class I compiled:

class ScopeTest {
    public void outside(String[] args) {
        Integer total = 0; 
        Integer i; 
        for (String str : args) 
        { 
            i = Integer.valueOf(str); 
            total += i; 
        }
    }
    public void inside(String[] args) { 
        Integer total = 0; 
        for (String str : args) 
        { 
            Integer i = Integer.valueOf(str); 
            total += i; 
        }
    }
}

Bytecode output (retrieved with javap -c ScopeTest after compiling):

Compiled from "ScopeTest.java"
class ScopeTest extends java.lang.Object{
ScopeTest();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public void outside(java.lang.String[]);
  Code:
   0:   iconst_0
   1:   invokestatic    #2; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   4:   astore_2
   5:   aload_1
   6:   astore  4
   8:   aload   4
   10:  arraylength
   11:  istore  5
   13:  iconst_0
   14:  istore  6
   16:  iload   6
   18:  iload   5
   20:  if_icmpge       55
   23:  aload   4
   25:  iload   6
   27:  aaload
   28:  astore  7
   30:  aload   7
   32:  invokestatic    #3; //Method java/lang/Integer.valueOf:(Ljava/lang/String;)Ljava/lang/Integer;
   35:  astore_3
   36:  aload_2
   37:  invokevirtual   #4; //Method java/lang/Integer.intValue:()I
   40:  aload_3
   41:  invokevirtual   #4; //Method java/lang/Integer.intValue:()I
   44:  iadd
   45:  invokestatic    #2; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   48:  astore_2
   49:  iinc    6, 1
   52:  goto    16
   55:  return

public void inside(java.lang.String[]);
  Code:
   0:   iconst_0
   1:   invokestatic    #2; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   4:   astore_2
   5:   aload_1
   6:   astore_3
   7:   aload_3
   8:   arraylength
   9:   istore  4
   11:  iconst_0
   12:  istore  5
   14:  iload   5
   16:  iload   4
   18:  if_icmpge       54
   21:  aload_3
   22:  iload   5
   24:  aaload
   25:  astore  6
   27:  aload   6
   29:  invokestatic    #3; //Method java/lang/Integer.valueOf:(Ljava/lang/String;)Ljava/lang/Integer;
   32:  astore  7
   34:  aload_2
   35:  invokevirtual   #4; //Method java/lang/Integer.intValue:()I
   38:  aload   7
   40:  invokevirtual   #4; //Method java/lang/Integer.intValue:()I
   43:  iadd
   44:  invokestatic    #2; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
   47:  astore_2
   48:  iinc    5, 1
   51:  goto    14
   54:  return

}

Contrary to my expectations, there was one difference between the two: in outside(), the variable i still took up a register even though it was omitted from the actual code (note that all the iload and istore instructions point one register higher).

The JIT compiler should make short work of this difference, but still you can see that limiting scope is a good practice.

(And with regards to my earlier side note, you can see that to add two Integer objects, Java must unbox both with intValue, add them, and then create a new Integer with valueOf. Don't do this unless absolutely necessary, because it's senseless and slower.)

Michael Myers
great answer, thanks.
Simon
@mmyers and how does it looks like when JIT is done with it?
stacker
Great answer; would give you +10, if I could. Esp. noticing the wasteful use of boxing/unboxing - a new trap for Java programmings.
Software Monkey
Further, why use an inner variable at all instead of `total+=Integer.valueOf(str);` ???
Software Monkey
@stacker: In non-debug builds of the JVM, I don't think there's an easy way to get at the generated assembly.@Software Monkey: No need, but it makes no difference at all (I tried it).
Michael Myers
A: 

It makes no significant difference apart from on the last iteration, when the reference is cleared quicker in the second example (and that would be my preference - not so much for that reason, but clarity.)

Keep the scope to the minimum possible. The hotspot VM does escape analysis to determine when references are no longer accessible, and on the basis of this allocates some objects on the stack rather than on the heap. Keeping scope as small as possible aids this process.

I would ask why you're using Integer instead of a simple int...or perhaps it's just by way of example?

mdma
A: 

The second is far better. Scoping variables as narrowly as possible makes the code far easier to read and maintain, which are much more important overall than the performance differences between these examples, which are trivial and easily optimized away.

JSBangs
stylistic, not resource related
Simon
A: 

Neither. Integer.valueOf(0); will use a reference to a cached 0. :)

Affe
+4  A: 

The second one is far better because the first style is should only be used in C code as its mandatory. Java allows for inline declarations to minimize the scope of variables and you should take advantage of that. But you code can be further improved:

int total = 0;
for (String str: stringList) {
    try {
        total += Integer.valueOf(str);
    } catch(NumberFormationException nfe) {
        // more code to deal with the error
    }
}

That follows the Java code style convention. Read the full guide here: http://java.sun.com/docs/codeconv/html/CodeConvTOC.doc.html

Pyrolistical
A: 

Well, in this case, you're instantiating an Integer primitive every single time you say i = Integer.parseInt(str) (where i is an Integer), so (unless Java knows how to optimize it), both cases are almost equally inefficient. Consider using int instead:

int total = 0;
for (String str : string_list)
{
    int i = Integer.parseInt(str);
    total += i;
}

Now we're back to the question of whether to put the int declaration on the inside or outside. Assuming the Java compiler has a lick of decent optimization, I'd say it doesn't matter. Efficiency aside, it is considered good practice to declare variables as close as possible to their use.

Joey Adams
thanks, but the question is *does* the Java compiler have a lick of decent optimization?
Simon
A: 

Second one since you want to keep the scope of your variables as "inner" as possible. The advantage of smaller scope is less chance for collision. In your example, there's only a few lines so the advantage might not be so obvious. But if it's larger, having the smaller-scope variables definitely is more beneficial. If someone else later has to look at the code, they would have to scan all the way back to right outside the method definition to know what i is. The argument is not much different than that of why we want to avoid global variable.

Khnle
this is a stylistic comment, I was looking for reasons based on language and resources rather than maintainability etc.
Simon
A: 

The second one is the preferable of the two for readability, maintainability, and efficiency.

All three of these goals are achieved because you are succinctly explaining what you are doing and how your variables are being used. You are explaining this clearly to both developers and the compiler. When the variable i is defined in the for block everyone knows that it is safe to ignore it outside of the block and that the value is only valid for this iteration of the block. This will lead the to the Garbage Collector being able to be able to easily mark this memory to be freed.

I would suggest not using Integer for intermediate values. Accumulate the total as an int and after the loop create the Object or depend on auto-boxing.

Jacob Tomaw
readability and maintainability are stylistic and you haven't justified efficiency in the rest of your answer.
Simon
A: 

Assuming you have positive numbers in your list and you're serious with

I am looking for reasons based on resource utilisation, compiler performance, GC performance etc. rather than elegance.

You should implement it by yourself like:

import java.util.ArrayList;
import java.util.List;

public class Int {
    public static void main(String[] args) {
        List<String> list = new ArrayList<String>();
        list.add("10");
        list.add("20");
        int total = 0;
        for (String str : list) {
            int val = 0;
            for (char c : str.toCharArray()) {
                val = val * 10 + (int) c - 48;
            }
            total += val;
        }
        System.out.print(total);
    }
}

The only GC relevant thing would be toCharArray() which could be replaced by another loop using charAt()

stacker
A: 

The question of what variable scope to use is a readability issue more than anything else. The code is better understood when every variable is restricted to the scope where it is actually used.

Now, if we inspect the technical consequences of using wide/narrow scopes, I believe that there IS a performance/footpring advantage with narrow scopes. Consider the following method, where we have 3 local variables, belonging to one global scope:

private static Random rnd = new Random();

public static void test() {
    int x = rnd.nextInt();
    System.out.println(x);

    int y = rnd.nextInt();
    System.out.println(y);

    int z = rnd.nextInt();
    System.out.println(z);      
}

If you diassemble this code (using javap -c -verbose {class name} for example), you will see that the compiler reserves 3 slots for local variables in the stack frame structure of the test() method.

Now, suppose that we add some artificial scopes:

public static void test() {
    {
        int x = rnd.nextInt();
        System.out.println(x);
    }

    {
        int y = rnd.nextInt();
        System.out.println(y);
    }

    {
        int z = rnd.nextInt();
        System.out.println(z);
    }
}

If you diassemble the code now, you will notice that the compiler reserves only 1 slot for local variables. Since the scopes are completely independent, each time x,y or z are used, the same slot #0 is used.

What does it mean?

1) Narrow scopes save stack space

2) If we are dealing with object variables, it means that the objects may become unreachable faster, therefore are eligible for GC sooner than otherwise.

Again,note that these 2 "advantages" are really minor, and the readability concern should be by far the most important concern.

Eyal Schneider