views:

1770

answers:

7
String s = "";
for(i=0;i<....){
    s = some Assignment;
}

or

for(i=0;i<..){
    String s = some Assignment;
}

I don't need to use 's' outside the loop ever again. The first option is perhaps better since a new String is not initialized each time. The second however would result in the scope of the variable being limited to the loop itself.

EDIT: In response to Milhous's answer. It'd be pointless to assign the String to a constant within a loop wouldn't it? No, here 'some Assignment' means a changing value got from the list being iterated through.

Also, the question isn't because I'm worried about memory management. Just want to know which is better.

+21  A: 

In theory, it's a waste of resources to declare the string inside the loop. In practice, however, both of the snippets you presented will compile down to the same code (declaration outside the loop).

So, if your compiler does any amount of optimization, there's no difference.

Esteban Araya
The reference is just put inside the stack frame for this method call, right?
Thomas
@Thomas: Correct!
Esteban Araya
I think you misinterpreted what 1800 INFORMATION wrote. The immutability of Java strings is irrelevant here: the "some Assignment" will produce a new String each time, whether or not String is immutable.
Thomas
@Thomas: What if someAssignment is just a hardcoded string?
Esteban Araya
Have updated the Question. Assigning a hard coded string would be illogical.
Vivek Kodira
Which theory is it, which tells you that *declaring* a variable wastes resources?
jrudolph
jrudolph: Creating a new empty string creates a new garbage collected object, if I remember my Java correctly.
TraumaPony
+14  A: 

In general I would choose the second one, because the scope of the 's' variable is limited to the loop. Benefits:

  • This is better for the programmer because you don't have to worry about 's' being used again somewhere later in the function
  • This is better for the compiler because the scope of the variable is smaller, and so it can potentially do more analysis and optimisation
  • This is better for future readers because they won't wonder why the 's' variable is declared outside the loop if it's never used later
Greg Hewgill
Good points! Readability and maintainability are what count, especially when so many are apparently ignorant of how their JVM works anyway.
erickson
+3  A: 

To add on a bit to @Esteban Araya's answer, they will both require the creation of a new string each time through the loop (as the return value of the some Assignment expression). Those strings need to be garbage collected either way.

1800 INFORMATION
I was thinking the same.
Brian R. Bondy
Not necessarily. We don't know what is being assigned to "s" inside the loop. Perhaps its a string constant that is allocated in PermGen space and will never be garbage collected. All we know is that whatever it is, its the same in both cases, so it doesn't matter.
erickson
@erickson: I agree that it could be a string constant, but in that case, I expect the compiler would probably use constant propagation to move s out of the body of the loop. I was assuming it wasn't a constant because a sensible programmer would have done the same.
1800 INFORMATION
I didn't mean the same string constant in every iteration. See my comment on the OP.
erickson
Yes in that case there would be no optimsation
1800 INFORMATION
A: 

It seems to me that we need more specification of the problem.

The

s = some Assignment;

is not specified as to what kind of assignment this is. If the assignment is

s = "" + i + "";

then a new sting needs to be allocated.

but if it is

s = some Constant;

s will merely point to the constants memory location, and thus the first version would be more memory efficient.

Seems i little silly to worry about to much optimization of a for loop for an interpreted lang IMHO.

Milhous
+60  A: 

Limited Scope is Best

Use your second option:

for ( ... ) {
  String s = ...;
}

Scope Doesn't Affect Performance

If you disassemble code the compiled from each (with the JDK's javap tool), you will see that the loop compiles to the exact same JVM instructions in both cases. Note also that Brian R. Bondy's "Option #3" is identical to Option #1. Nothing extra is added or removed from the stack when using the tighter scope, and same data are used on the stack in both cases.

Avoid Premature Initialization

The only difference between the two cases is that, in the first example, the variable s is unnecessarily initialized. This is a separate issue from the location of the variable declaration. This adds two wasted instructions (to load a string constant and store it in a stack frame slot). A good static analysis tool will warn you that you are never reading the value you assign to s, and a good JIT compiler will probably elide it at runtime.

You could fix this simply by using an empty declaration (i.e., String s;), but this is considered bad practice and has another side-effect discussed below.

Often a bogus value like null is assigned to a variable simply to hush a compiler error that a variable is read without being initialized. This error can be taken as a hint that the variable scope is too large, and that it is being declared before it is needed to receive a valid value. Empty declarations force you to consider every code path; don't ignore this valuable warning by assigning a bogus value.

Conserve Stack Slots

As mentioned, while the JVM instructions are the same in both cases, there is a subtle side-effect that makes it best, at a JVM level, to use the most limited scope possible. This is visible in the "local variable table" for the method. Consider what happens if you have multiple loops, with the variables declared in unnecessarily large scope:

void x(String[] strings, Integer[] integers) {
  String s;
  for (int i = 0; i < strings.length; ++i) {
    s = strings[0];
    ...
  }
  Integer n;
  for (int i = 0; i < integers.length; ++i) {
    n = integers[i];
    ...
  }
}

The variables s and n could be declared inside their respective loops, but since they are not, the compiler uses two "slots" in the stack frame. If they were declared inside the loop, the compiler can reuse the same slot, making the stack frame smaller.

What Really Matters

However, most of these issues are immaterial. A good JIT compiler will see that it is not possible to read the initial value you are wastefully assigning, and optimize the assignment away. Saving a slot here or there isn't going to make or break your application.

The important thing is to make your code readable and easy to maintain, and in that respect, using a limited scope is clearly better. The smaller scope a variable has, the easier it is to comprehend how it is used and what impact any changes to the code will have.

erickson
I'm still not fully convinced, but I deleted my answer because it was wrong, but for reference of this question. Here is what I had: {//Don't remove braces String s; for(i=0;i<....){ s = some Assignment; } }
Brian R. Bondy
I'd vote up this comment twice if I could. I'd also tag the question "early optimization".
trenton
What an excellent answer; I'd also vote it up multiple times if I could.
Software Monkey
I hadn't played with javap before, so I checked a for loop that gets the date and sets String s to the date.toString. The disassembly I did, showed that the code is different. You can see that s is being set each loop in the inner one. Is this something that the jit compiler might fix?
Philip T.
@Philip T. - I'm not sure I understand what you are describing. If you mean that the same value was assigned to the `String` on each iteration of the loop, and you are asking whether that calculation can be "hoisted", yes, it's possible, but it will depend on the JVM.
erickson
My question was about the "Scope doesn't affect Performance" section. Since I was curious, I looked at the javap output of a class implementing the example with the variable defined inside and outside the loop. Javap shows that the code is different, with an ldc and astore inside the loop for the inside initialized version and that code is outside the loop for the ooutside initialized version. I am trying to figure out if I am missing something since general consensus seems to be the code shouldn't be different. I checked classes from Eclipse and Netbeans.
Philip T.
Are you talking about the initial, wasted assignment of `""` to `s` in the first example in the original post? The value that is never read before it is overwritten on the first iteration of the loop? That's what I'm talking about in the section "Avoid Premature Initialization"; the `ldc` and `astore` instructions are wasted, because that value can never be read before the variable is re-assigned. The code *inside* the loop is the same. The initialization (assignment) of the variable is a separate issue from the scope of the variable.
erickson
+2  A: 

If you want to speed up for loops, I prefer declaring a max variable next to the counter so that no repeated lookups for the condidtion are needed:

instead of

for (int i = 0; i < array.length; i++) {
  Object next = array[i];
}

I prefer

for (int i = 0, max = array.lenth; i < max; i++) {
  Object next = array[i];
}

Any other things that should be considered have already been mentioned, so just my two cents (see ericksons post)

Greetz, GHad

GHad
A: 

I know this is an old question, but I thought I'd add a bit that is slightly related.

I've noticed while browsing the Java source code that some methods, like String.contentEquals (duplicated below) makes redundant local variables that are merely copies of class variables. I believe that there was a comment somewhere, that implied that accessing local variables is faster than accessing class variables.

In this case "v1" and "v2" are seemingly unnecessary and could be eliminated to simplify the code, but were added to improve performance.

public boolean contentEquals(StringBuffer sb) {
    synchronized(sb) {
        if (count != sb.length())
            return false;
        char v1[] = value;
        char v2[] = sb.getValue();
        int i = offset;
        int j = 0;
        int n = count;
        while (n-- != 0) {
            if (v1[i++] != v2[j++])
                return false;
        }
    }
    return true;
}
Randy Stegbauer
That was probably more useful on old VMs than it is now. I have a hard time believing that makes any difference on HotSpot.
Michael Myers