This is a question that arose mostly out of pure curiosity (and killing some time). I'm asking specifically about Java for the sake of concreteness.

What happens, in memory, if I concatenate a string (any string) with an empty string, e.g.:

String s = "any old string";
s += "";

I know that afterward, the contents of s will still be "any old string", since an empty string contributes no characters (Java strings carry an explicit length rather than a null terminator). But I am curious whether Java (the compiler? the VM?) optimizes enough to know that s will be unchanged and can omit that instruction from the bytecode entirely, or whether something different happens at compile time and at run time.

+2  A: 

You'll get a new String after executing the line

s += "";

Java allocates a new String object and assigns it to s after the string concatenation. If you have Eclipse handy (and I assume you can do the same thing in NetBeans, but I've only ever used Eclipse), you can set a breakpoint on that line and watch the object ID of the object that s points to before and after the line executes. In my case, the object ID of s before that line of code was id=20, and afterward it was id=24.
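If you don't have a debugger handy, a reference comparison shows the same thing. This is only a minimal sketch; the class and method names (ConcatIdentity, appendEmpty) are made up for illustration and are not from the question.

```java
class ConcatIdentity {
    // Hypothetical helper: performs the same `s += ""` as in the question
    // and returns the resulting reference.
    static String appendEmpty(String s) {
        s += "";
        return s;
    }

    public static void main(String[] args) {
        String before = "any old string";
        String after = appendEmpty(before);

        // Contents are unchanged...
        System.out.println("equal contents: " + before.equals(after));
        // ...but == compares references, so this reports whether a new
        // String object was allocated by the concatenation.
        System.out.println("same object:    " + (before == after));
    }
}
```

Comparing with == is deliberate here: it checks object identity, which is the programmatic counterpart of watching the object ID in the debugger.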

Peter Nix
+12  A: 

It's bytecode time!

class EmptyString {
    public static void main(String[] args) {
        String s = "any old string";
        s += "";
    }
}

javap -c EmptyString:

Compiled from "EmptyString.java"
class EmptyString extends java.lang.Object{
EmptyString();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   ldc     #2; //String any old string
   2:   astore_1
   3:   new     #3; //class java/lang/StringBuilder
   6:   dup
   7:   invokespecial   #4; //Method java/lang/StringBuilder."<init>":()V
   10:  aload_1
   11:  invokevirtual   #5; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   14:  ldc     #6; //String
   16:  invokevirtual   #5; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   19:  invokevirtual   #7; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   22:  astore_1
   23:  return

}

You can see that += makes this compiler emit a StringBuilder regardless of what is being concatenated, so the no-op concatenation is not optimized away at compile time.
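Read back into source form, the bytecode at offsets 3 through 22 corresponds roughly to the hand-written code below. This is a sketch for illustration, with made-up names (ManualConcat, appendEmpty), and it mirrors only the javap output shown above; newer compilers (Java 9 and later) emit an invokedynamic call for string concatenation instead of an explicit StringBuilder.

```java
class ManualConcat {
    // Hypothetical helper spelling out what `s += ""` compiled to above:
    // new StringBuilder, append(s), append(""), toString().
    static String appendEmpty(String s) {
        return new StringBuilder()
                .append(s)      // offsets 10-11: aload_1, append
                .append("")     // offsets 14-16: ldc "", append
                .toString();    // offset 19: always builds a new String
    }

    public static void main(String[] args) {
        String s = "any old string";
        String result = appendEmpty(s);
        System.out.println(result.equals(s)); // same characters
        System.out.println(result == s);      // different object
    }
}
```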

On the other hand, if you put both String literals in the same expression, the expression is a compile-time constant and the compiler performs the concatenation itself:

class EmptyString {
    public static void main(String[] args) {
        String s = "any old string" + "";
    }
}

javap -c EmptyString:

Compiled from "EmptyString.java"
class EmptyString extends java.lang.Object{
EmptyString();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   ldc     #2; //String any old string
   2:   astore_1
   3:   return

}
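
One way to observe the folding without javap is a reference comparison. String-valued constant expressions are interned, so the folded expression and the plain literal should be the very same object. A small sketch follows, with names (ConstantFolding, foldedIsSameObject) invented for illustration:

```java
class ConstantFolding {
    // Hypothetical check: both operands are literals, so the compiler
    // folds the concatenation into the single constant "any old string".
    static boolean foldedIsSameObject() {
        String folded = "any old string" + "";
        String literal = "any old string";
        return folded == literal; // reference comparison on purpose
    }

    public static void main(String[] args) {
        System.out.println(foldedIsSameObject());
    }
}
```

Contrast this with the += case, where the left operand is a variable rather than a constant and the concatenation happens at run time.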
Michael Myers
+1 -- for greatjustice
windfinder
cool. thank you!
Matt Ball
@mmyers: you should point out that 1) the emitted bytecodes are (in theory) Java compiler specific, and 2) the JIT compiler could (in theory) optimize further.
Stephen C
True, and the JLS does give some leeway for this (http://java.sun.com/docs/books/jls/third_edition/html/expressions.html#15.18.1).
Michael Myers