Three of my coworkers just told me that there's no reason to use a StringBuilder in place of concatenation using the + operator. In other words, this is fine to do with a bunch of strings: myString1 + myString2 + myString3 + myString4 + mySt...

The rationale that they used was that since .NET 2, the C# compiler will build the same IL if you use the + operator as if you used a StringBuilder.

This is news to me. Are they correct?

+20  A: 

No, they are not correct. String concatenation creates a new string, whereas StringBuilder uses a variable-size buffer to build the string, only creating a string object when ToString() is called.

There are many discussions of string concatenation techniques all over the Internet if you would like to read further on the subject. Most focus on the efficiency of the different methods when used in loops. In that scenario, StringBuilder is faster than concatenation with the string operators once you are joining 10 or more strings, which indicates that it must work differently from plain concatenation.

That said, if you're concatenating constant string values, the string operators will be better because the compiler will fold them into a single constant, and if you're performing non-looped concatenation, the operators will be better as they should result in a single call to string.Concat.
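
As a rough illustration of the loop case described above, here is a minimal sketch. The class and method names (LoopConcatDemo, JoinWithOperator, JoinWithBuilder) are made up for this example and do not come from any library. Both methods return the same text; the difference is only in how many intermediate strings get allocated along the way.

```csharp
using System;
using System.Text;

static class LoopConcatDemo
{
    // Hypothetical helper: repeated += in a loop.
    // Each pass allocates a new string and copies the previous contents into it.
    public static string JoinWithOperator(string[] parts)
    {
        string result = "";
        foreach (string part in parts)
        {
            result += part;
        }
        return result;
    }

    // Hypothetical helper: StringBuilder appends into a growable buffer,
    // and the final string is created once, by ToString().
    public static string JoinWithBuilder(string[] parts)
    {
        var sb = new StringBuilder();
        foreach (string part in parts)
        {
            sb.Append(part);
        }
        return sb.ToString();
    }

    static void Main()
    {
        string[] parts = { "a", "b", "c", "d" };
        // Both approaches build the same text.
        Console.WriteLine(JoinWithOperator(parts) == JoinWithBuilder(parts));
    }
}
```

For a handful of short strings the difference is unlikely to be measurable; it only starts to matter as the number of iterations and the length of the result grow.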

Jeff Yates
String concatenation **does not** create a new string for each `+` operation: `a + b + c + d` is transformed to `string.Concat(a, b, c, d)`
dtb
@dtb: edited to remove misleading part
Jeff Yates
@dtb: But it does if you have a loop like `for(i=0;i<n;++i) s = s +a[i];`.
Doc Brown
@Doc, that wasn't what the OP asked, though.
Joey
Edited to cover difference between concatenation in a loop and not in a loop
Jeff Yates
A: 

There is a slight difference between String and StringBuilder:

Concatenating a String creates a new string object holding the result of the concatenation. Appending to a StringBuilder modifies the builder's internal buffer in place.

So they are not correct.

Simon
+14  A: 

No, they are not correct; it won't produce the same IL:

static string StringBuilder()
{
    var s1 = "s1";
    var s2 = "s2";
    var s3 = "s3";
    var s4 = "s4";
    var sb = new StringBuilder();
    sb.Append(s1).Append(s2).Append(s3).Append(s4);
    return sb.ToString();
}

static string Concat()
{
    var s1 = "s1";
    var s2 = "s2";
    var s3 = "s3";
    var s4 = "s4";
    return s1 + s2 + s3 + s4;
}

IL of StringBuilder:

.method private hidebysig static string StringBuilder() cil managed
{
    .maxstack 2
    .locals init (
        [0] string s1,
        [1] string s2,
        [2] string s3,
        [3] string s4,
        [4] class [mscorlib]System.Text.StringBuilder sb)
    L_0000: ldstr "s1"
    L_0005: stloc.0 
    L_0006: ldstr "s2"
    L_000b: stloc.1 
    L_000c: ldstr "s3"
    L_0011: stloc.2 
    L_0012: ldstr "s4"
    L_0017: stloc.3 
    L_0018: newobj instance void [mscorlib]System.Text.StringBuilder::.ctor()
    L_001d: stloc.s sb
    L_001f: ldloc.s sb
    L_0021: ldloc.0 
    L_0022: callvirt instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string)
    L_0027: ldloc.1 
    L_0028: callvirt instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string)
    L_002d: ldloc.2 
    L_002e: callvirt instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string)
    L_0033: ldloc.3 
    L_0034: callvirt instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string)
    L_0039: pop 
    L_003a: ldloc.s sb
    L_003c: callvirt instance string [mscorlib]System.Object::ToString()
    L_0041: ret 
}

IL of Concat:

.method private hidebysig static string Concat() cil managed
{
    .maxstack 4
    .locals init (
        [0] string s1,
        [1] string s2,
        [2] string s3,
        [3] string s4)
    L_0000: ldstr "s1"
    L_0005: stloc.0 
    L_0006: ldstr "s2"
    L_000b: stloc.1 
    L_000c: ldstr "s3"
    L_0011: stloc.2 
    L_0012: ldstr "s4"
    L_0017: stloc.3 
    L_0018: ldloc.0 
    L_0019: ldloc.1 
    L_001a: ldloc.2 
    L_001b: ldloc.3 
    L_001c: call string [mscorlib]System.String::Concat(string, string, string, string)
    L_0021: ret 
}

Also you might find this article interesting.

Darin Dimitrov
The IL is obviously different, but I believe the question is what is `String.Concat` doing internally? Is it using a `StringBuilder`? If so, then a call to a function that uses a `StringBuilder` and returns a string is no different than a call that uses a single call to Concat and returns a string. Differences would arise when *multiple* calls started to be made to Concat. Or am I incorrect?
Anthony Pegram
String.Concat knows the lengths of the strings to be concatenated in advance, so unlike a StringBuilder it can allocate a new string with the right size right away and does not need to allocate a growing buffer and trim the result.
dtb
@Darin Dimitrov: right -- check the IL, you beat me to it! However, if you use constant strings with the "+" operator, the compiler will combine them without calling Concat.
JMarsch
If you know the lengths in advance, you pass their sum to the StringBuilder constructor as the capacity, and it won't have to grow its buffer or trim the result. It would be odd for a team to code the same thing twice rather than reusing the implementation, but not unheard of.
Pete Kirkham
+5  A: 

No, they are not. They definitely generate different IL. It's using different calls: String.Concat in the non-StringBuilder case.

String.Concat calls a private method called ConcatArray, which allocates a new string just long enough to hold the end result. So, very different, but that doesn't mean concatenating using the + operator is less efficient than using a StringBuilder. In fact, it's almost certainly more efficient. Also, in the case of concatenation of constants, it is done at compile time.

However, when you do concatenation in a loop, the compiler can't do this sort of optimization. In such cases, using StringBuilder would be better for reasonably long strings.
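
To make the compile-time point concrete, here is a small sketch. ConstantFoldingDemo, Greeting and Greet are hypothetical names chosen for illustration. It relies on the fact that a const initializer must be a compile-time constant, so the first declaration only compiles because the compiler performs the concatenation itself.

```csharp
using System;

static class ConstantFoldingDemo
{
    // A const initializer has to be a compile-time constant, so this
    // declaration is accepted only because the literals are concatenated
    // by the compiler rather than at run time.
    public const string Greeting = "Hello" + " " + "World";

    // Hypothetical helper: with a non-constant operand, the part involving
    // 'name' cannot be folded and is concatenated at run time.
    public static string Greet(string name)
    {
        return "Hello" + " " + name;
    }

    static void Main()
    {
        Console.WriteLine(Greeting);
        Console.WriteLine(Greet("World"));
    }
}
```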

Thorarin
+3  A: 

The answer is that it depends upon how you concatenate. If you use the + operator with constant strings (string literals), then your coworkers are correct -- there is no need for a StringBuilder. However, if you use string variables or the += operator, then you are reallocating strings.

The way to really find out what's going on here is to write some code and then decompile it.

Let's build some test code and look at it in Reflector, using the IL view (or you can use ILDASM, whichever you prefer).

So first, a baseline -- this method does not concatenate at all:


static void NoConcat()
{
  string test = "Hello World";
}

Now here is the IL:


.method private hidebysig static void NoConcat() cil managed
{
    .maxstack 1
    .locals init (
        [0] string test)
    L_0000: nop 
    L_0001: ldstr "Hello World"  // <-- no reallocation
    L_0006: stloc.0 
    L_0007: ret 
}

Ok, no surprises, right?

Now let's look at some code that definitely reallocates the string, so we know what that looks like:


static void Concat2()
{
  string test = "Hello";
  test += " ";
  test += "World";
}

Here's the IL; note the reallocations (each call to string.Concat allocates a new string):


.method private hidebysig static void Concat2() cil managed
{
    .maxstack 2
    .locals init (
        [0] string test)
    L_0000: nop 
    L_0001: ldstr "Hello"
    L_0006: stloc.0 
    L_0007: ldloc.0 
    L_0008: ldstr " "
    L_000d: call string [mscorlib]System.String::Concat(string, string)
    L_0012: stloc.0 
    L_0013: ldloc.0 
    L_0014: ldstr "World"
    L_0019: call string [mscorlib]System.String::Concat(string, string)
    L_001e: stloc.0 
    L_001f: ret 
}

Ok, now how about a concatenation that doesn't cause a reallocation -- we are going to concatenate constant strings with the "+" operator:


static void Concat1()
{
  string test = "Hello" + " " + "World";
}

Here's the IL -- look how smart the compiler is! It does NOT use concat -- it's identical to the first example:


.method private hidebysig static void Concat1() cil managed
{
    .maxstack 1
    .locals init (
        [0] string test)
    L_0000: nop 
    L_0001: ldstr "Hello World"
    L_0006: stloc.0 
    L_0007: ret 
}

Now let's have a little fun. What if we mix constant strings and variables? (This is where you may still be better off using a StringBuilder.)


static void Concat3(string text)
{
  string test = "Hello" + " " + text + " World";
}

And the IL. Note that it was smart enough to combine the "Hello" and the " " as a constant, but it still has to do a concat for the text variable:


.method private hidebysig static void Concat3(string text) cil managed
{
    .maxstack 3
    .locals init (
        [0] string test)
    L_0000: nop 
    L_0001: ldstr "Hello "
    L_0006: ldarg.0 
    L_0007: ldstr " World"
    L_000c: call string [mscorlib]System.String::Concat(string, string, string)
    L_0011: stloc.0 
    L_0012: ret 
}

JMarsch
+1  A: 

I usually follow the following rules:

  1. If the number of child strings is known in advance, use concatenation. This covers situations like str1 + str2 + str3 +..., no matter how many there are.

  2. If the child strings are already in an array, use string.Join.

  3. If you are building the string in a loop, use StringBuilder.
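
The three rules above could be sketched roughly as follows. ConcatRulesDemo and its methods (FullName, Csv, Repeat) are hypothetical names used only for illustration; string.Join and StringBuilder.Append are the standard .NET calls.

```csharp
using System;
using System.Text;

static class ConcatRulesDemo
{
    // Rule 1: a fixed, known number of pieces, so plain concatenation.
    public static string FullName(string first, string middle, string last)
    {
        return first + " " + middle + " " + last;
    }

    // Rule 2: the pieces are already in an array, so string.Join.
    public static string Csv(string[] fields)
    {
        return string.Join(",", fields);
    }

    // Rule 3: the string is built up in a loop, so StringBuilder.
    public static string Repeat(string text, int count)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append(text);
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(FullName("Ada", "King", "Lovelace"));
        Console.WriteLine(Csv(new[] { "x", "y", "z" }));
        Console.WriteLine(Repeat("ab", 3));
    }
}
```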

Codism
Use `String.Concat` instead of `String.Join` :)
Thorarin
@Thorarin `String.Join` is the most performant of all the .NET string combination functions; you can search around to find the benchmark results that show this. Generally, however, `String.Join` isn't friendly to work with.
Chris Marisic
@Chris: Okay, benchmark seemed to confirm it. That's some *EPIC* failure in the `String.Concat` implementation then.
Thorarin
string.concat internally uses string.join
Codism
A: 

There's a HUGE performance difference between string concatenation and StringBuilder. We had a web service that was too slow. We changed all the string concatenations to StringBuilder.Append calls and it got a lot faster!

LSpencer777
That makes me wonder why you're doing so awfully many string concatenations in the first place...
Thorarin
A: 

No, string concatenation does not use StringBuilder internally. However, in your particular example, there is no advantage to using StringBuilder.

This is fine for a few strings (you're only creating one new string):

myString = myString + myString2 + myString3 + myString4 + mySt...

This is not (you're creating a new intermediate string on each line, four in total):

myString = myString + myString2;
myString = myString + myString3;
myString = myString + myString4;
myString = myString + myString5;

Out of all the stackoverflow questions about this matter, this has one of the best answers: http://stackoverflow.com/questions/73883/string-vs-stringbuilder

Look for two answers, the one by Jay Bazuzi and the one by James Curran.

Also, HIGHLY RECOMMENDED, Jeff Atwood uses actual testing to compare these and other scenarios of string concatenation/building, here: http://www.codinghorror.com/blog/2009/01/the-sad-tragedy-of-micro-optimization-theater.html

Jeremy