We have a huge code base and we suspect that there are quite a few "+" based string concats in the code that might benefit from the use of StringBuilder/StringBuffer. Is there an effective way or existing tools to search for these, especially in Eclipse?

A search by "+" isn't a good idea since there's a lot of math in the code, so this needs to be something that actually analyzes the code and types to figure out which additions involve strings.

+13  A: 

I'm pretty sure FindBugs can detect these. If not, it's still extremely useful to have around.

Edit: It can indeed find concatenations in a loop, which is the only time it really makes a difference.
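For illustration, here is a minimal sketch of the loop pattern in question next to the usual fix. The class and method names are made up for this example, and the exact rule FindBugs applies is not reproduced here.

```java
import java.util.Arrays;
import java.util.List;

final class LoopConcatExample {

    // The pattern worth flagging: every += builds a brand-new String,
    // copying everything accumulated so far.
    static String joinWithPlus(List<String> parts) {
        String result = "";
        for (String part : parts) {
            result += part + ",";
        }
        return result;
    }

    // The usual fix: a single StringBuilder reused across all iterations.
    static String joinWithBuilder(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b", "c");
        System.out.println(joinWithPlus(parts));    // prints "a,b,c,"
        System.out.println(joinWithBuilder(parts)); // prints "a,b,c,"
    }
}
```

Both methods return the same string; the difference only shows up as extra copying and garbage when the loop runs many times.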

Michael Myers
+1 for concatenations in a loop. That's the thing that would scare me.
ojrac
I was mostly worried about loops, so if FindBugs does that, I'll definitely give it a look. I've used FindBugs before on other projects, but never cared about performance, just bugs. Thanks!
Uri
Yes, there are several categories in FindBugs that are only for performance. For instance, there's "Method allocates a boxed primitive just to call toString", "Explicit garbage collection; extremely dubious except in benchmarking code", and "Inefficient use of keySet iterator instead of entrySet iterator".
Michael Myers
+10  A: 

Why not use a profiler to find the "naive" string concatenations that actually matter? Only switch over to the more verbose StringBuffer if you actually need it.

nsanders
+1. String "+" is efficient enough in most cases.
skaffman
In most of those cases, switching to StringBuilder is all that is needed.
Fredrik
+2  A: 

IntelliJ can find these using "structural search". You search for "$a + $b" and set the characteristics of both $a and $b as type java.lang.String.

However, if you have IntelliJ, it likely has a built-in inspection that will do a better job of finding what you want anyway.

Darron
A: 
erickson
+1  A: 

Instead of searching for just a +, search for "+ (a quote followed by a plus) and +" (a plus followed by a quote). Those will probably find the vast majority. Cases where you are concatenating multiple variables will be tougher.

Gandalf
And most of the time you would be adding spaces in between string variables, so this would still catch most things.
Mike Cooper
Or use regular expressions that account for optional spaces. The + has to be escaped, since it is a regex quantifier: "\"\s?\+" and "\+\s?\""
Chris Nava
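As a rough sketch of the regular-expression idea above, the two patterns can be combined and tried against individual source lines. The class name, method name, and sample lines are assumptions for illustration only; this is a text heuristic, not a type-aware analysis.

```java
import java.util.regex.Pattern;

final class ConcatGrep {
    // Matches a quote followed by '+', or '+' followed by a quote.
    // \s* allows optional whitespace; '+' is escaped because it is a regex quantifier.
    static final Pattern QUOTE_PLUS = Pattern.compile("\"\\s*\\+|\\+\\s*\"");

    // Hypothetical helper: true if the line probably concatenates a string literal.
    static boolean looksLikeStringConcat(String sourceLine) {
        return QUOTE_PLUS.matcher(sourceLine).find();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeStringConcat("String s = \"Age: \" + age;")); // true
        System.out.println(looksLikeStringConcat("int sum = a + b;"));            // false
        System.out.println(looksLikeStringConcat("String t = first + last;"));    // false (missed)
    }
}
```

The last line shows the limitation already mentioned: concatenations of variables only are not caught, and literal-only concatenations that the compiler folds anyway are reported as hits.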
+12  A: 

Just make sure you really understand where it's actually better to use StringBuilder. I'm not saying you don't know, but there are certainly plenty of people who would take code like this:

String foo = "Your age is: " + getAge();

and turn it into:

StringBuilder builder = new StringBuilder("Your age is: ");
builder.append(getAge());
String foo = builder.toString();

which is just a less readable version of the same thing. Often the naive solution is the best solution. Likewise some people worry about:

String x = "long line" + 
    "another long line";

when actually that concatenation is performed at compile-time.
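A small sketch that makes the compile-time folding visible, assuming a standard compiler: because a concatenation of literals is a constant expression, it ends up as the same interned String object as the equivalent single literal. The class and member names are invented for this example.

```java
final class ConstantFolding {
    // Two literals joined with '+': a constant expression, folded by the compiler.
    static final String FOLDED = "long line" +
        "another long line";

    // Not a constant expression: the argument is only known at run time.
    static String runtimeConcat(String suffix) {
        return "long line" + suffix;
    }

    public static void main(String[] args) {
        // Same interned object as the single literal, so even == is true.
        System.out.println(FOLDED == "long lineanother long line"); // true

        String built = runtimeConcat("another long line");
        System.out.println(built.equals(FOLDED)); // true: same characters
        System.out.println(built == FOLDED);      // false: built at run time
    }
}
```

The == comparisons are there only to show object identity; ordinary code should still compare strings with equals().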

As nsanders quite rightly said, find out if you've got a problem first...

Jon Skeet
Exactly so. That's why FindBugs only checks loops.
Michael Myers
Not only that, but the compiler (at least javac) will turn most of those concatenations into StringBuffer/StringBuilder.append() for you...
matt b
Of course... But it's a good note for posterity. I am mostly worried about situations where there is a programmatic chain of concatenations that cannot be optimized at compile-time (such as loops).
Uri
+2  A: 

I suggest using a profiler. This is really a performance question and if you can't make the code show up with reasonable test data there is unlikely to be any value in changing it.

Peter Lawrey
A: 

Forget it - your JVM most likely does it already - see the JLS, 15.18.1.2 Optimization of String Concatenation:

An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.

Robert Munteanu
It can't do this in a loop, though. I.e., if you're building up a string by looping through something and using +=, a new StringBuffer/Builder will be created on each iteration.
Michael Myers
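The loop case from the comment above can be sketched by writing out by hand roughly what a compiler that uses StringBuilder for "+" produces for result += part. This is an illustration under that assumption, not actual compiler output; newer javac versions (Java 9 and later) compile concatenation differently, and the counter field exists only for this demo.

```java
final class LoopDesugared {
    // Number of builders allocated by the last call; for illustration only.
    static int buildersCreated;

    // Hand-written equivalent of:  for (String part : parts) result += part;
    static String concatExpanded(String[] parts) {
        buildersCreated = 0;
        String result = "";
        for (String part : parts) {
            StringBuilder tmp = new StringBuilder(); // fresh builder on every iteration
            buildersCreated++;
            // Copies everything accumulated so far, then the new part.
            result = tmp.append(result).append(part).toString();
        }
        return result;
    }

    public static void main(String[] args) {
        String out = concatExpanded(new String[] {"a", "b", "c", "d"});
        System.out.println(out + " built with " + buildersCreated + " builders");
        // prints "abcd built with 4 builders"
    }
}
```

One builder and one intermediate String per iteration is what turns a long loop into quadratic copying, which is why hoisting a single StringBuilder out of the loop helps.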
Good point - it's all _may_ in the JLS, so I have no evidence. I'll have to take a look at the bytecode when I have the time.
Robert Munteanu
+1  A: 

Jon Skeet (as always) and the others have already said all that is needed, but I would really like to emphasize that you may be hunting for a nonexistent performance improvement...

Take a look at this code:

public class StringBuilding {
  public static void main(String[] args) {
    String a = "The first part";
    String b = "The second part";
    String res = a+b;

    System.gc(); // Inserted to make it easier to see "before" and "after" below

    res = new StringBuilder().append(a).append(b).toString();
  }
}

If you compile it and disassemble it with javap, this is what you get.

public static void main(java.lang.String[]);
  Code:
   0:   ldc     #2; //String The first part
   2:   astore_1
   3:   ldc     #3; //String The second part
   5:   astore_2
   6:   new     #4; //class java/lang/StringBuilder
   9:   dup
   10:  invokespecial   #5; //Method java/lang/StringBuilder."<init>":()V
   13:  aload_1
   14:  invokevirtual   #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   17:  aload_2
   18:  invokevirtual   #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   21:  invokevirtual   #7; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   24:  astore_3
   25:  invokestatic    #8; //Method java/lang/System.gc:()V
   28:  new     #4; //class java/lang/StringBuilder
   31:  dup
   32:  invokespecial   #5; //Method java/lang/StringBuilder."<init>":()V
   35:  aload_1
   36:  invokevirtual   #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   39:  aload_2
   40:  invokevirtual   #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   43:  invokevirtual   #7; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   46:  astore_3
   47:  return

As you can see, 6-21 are pretty much identical to 28-43. Not much of an optimization, right?

Edit: The loop issue is valid though...

Fredrik
+1  A: 

If you have a huge code base you probably have lots of hotspots, which may or may not involve "+" concatenation. Just run your usual profiler, and fix the big ones, regardless of what kind of construct they are.

It would be an odd approach to fix just one class of (potential) bottleneck, rather than fixing the actual bottlenecks.

Ken
+2  A: 

Chances are you will make your performance worse and your code less readable. The compiler already makes this optimization, and unless you are in a loop, it will generally do a better job. Furthermore, in JDK 8 they may come out with StringUberBuilder, and all your code which uses StringBuilder will run slower, while the "+" concatenated strings will benefit from the new class.

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” - Donald Knuth

brianegge