I was wondering about StringBuilder and I've got a question that I was hoping the community would be able to explain.

Leaving code readability aside, which of these is faster, and why?

StringBuilder.Append:

StringBuilder sb = new StringBuilder();
sb.Append(string1);
sb.Append("----");
sb.Append(string2);

StringBuilder.AppendFormat:

StringBuilder sb = new StringBuilder();
sb.AppendFormat("{0}----{1}",string1,string2);
+1  A: 

Of course, profile to know for sure in each case.

That said, I think in general it will be the former because you aren't repeatedly parsing the format string.

However, the difference would be very small. To the point that you really should consider using AppendFormat in most cases anyway.

Joel Coehoorn
A: 

I'd assume it was the call that did the least amount of work. Append just concatenates strings, where AppendFormat is doing string substitutions. Of course these days, you never can tell...

Paul W Homer
-1, Append does not concatenate strings. It adds to its internal character array.
Samuel
Sorry, it concatenates characters ...
Paul W Homer
A: 

1 should be faster because it's simply appending the strings, whereas 2 has to create a string based on a format and then append that string. So there's an extra step in there.

Micah
+4  A: 

Append will be faster in most cases because there are many overloads of that method, which allows the compiler to call the correct one. Since you are using Strings, the StringBuilder can use the String overload of Append.

AppendFormat takes a String and then an Object[] which means that the format will have to be parsed and each Object in the array will have to be ToString'd before it can be added to the StringBuilder's internal array.

Note: To casperOne's point - it is difficult to give an exact answer without more data.

Andrew Hare
+21  A: 

It's impossible to say, not knowing the size of string1 and string2.

With the call to AppendFormat, it will preallocate the buffer just once given the length of the format string and the strings that will be inserted and then concatenate everything and insert it into the buffer. For very large strings, this will be advantageous over separate calls to Append which might cause the buffer to expand multiple times.

However, the three calls to Append might or might not trigger growth of the buffer and that check is performed each call. If the strings are small enough and no buffer expansion is triggered, then it will be faster than the call to AppendFormat because it won't have to parse the format string to figure out where to do the replacements.

More data is needed for a definitive answer
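To see whether the three Append calls trigger buffer growth for a given input size, one option is to read the Capacity property after each call. This is only a sketch: the class and method names are made up for illustration, and the exact capacities depend on the runtime's StringBuilder implementation, so no expected output is shown.

```csharp
using System;
using System.Text;

// Hypothetical demo class, not taken from any answer in this thread.
public static class CapacityDemo
{
    // Returns the builder's capacity after each of the three appends.
    public static int[] CapacitiesAfterAppends(string string1, string string2)
    {
        StringBuilder sb = new StringBuilder();
        int[] capacities = new int[3];

        sb.Append(string1);
        capacities[0] = sb.Capacity;
        sb.Append("----");
        capacities[1] = sb.Capacity;
        sb.Append(string2);
        capacities[2] = sb.Capacity;

        return capacities;
    }

    public static void Main()
    {
        foreach (int size in new[] { 1, 1000, 20000 })
        {
            int[] c = CapacitiesAfterAppends(new string('x', size), new string('y', size));
            // A jump between two values means the buffer grew during that append.
            Console.WriteLine("Length {0}: {1}, {2}, {3}", size, c[0], c[1], c[2]);
        }
    }
}
```

If the three numbers are identical for a given length, no expansion happened and only the per-call capacity check was paid.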

It should be noted that there is little discussion of using the static Concat method on the String class (Jon's answer using AppendWithCapacity reminded me of this). His test results show that to be the best case (assuming you don't have to take advantage of a specific format specifier). String.Concat does the same thing in that it will predetermine the length of the strings to concatenate and preallocate the buffer (with slightly more overhead due to looping constructs through the parameters). Its performance is going to be comparable to Jon's AppendWithCapacity method.

Or, just use the plain addition operator, since it compiles to a call to String.Concat anyway, with the caveat that all of the additions must be in the same expression:

// One call to String.Concat.
string result = a + b + c;

NOT

// Two calls to String.Concat.
string result = a + b;
result = result + c;


For all those putting up test code

You need to run your test cases in separate runs (or at the least, perform a GC between the measurements of separate test runs). The reason is this: suppose you do, say, 1,000,000 runs, creating a new StringBuilder in each iteration of the loop for one test, and then run the next test, which loops the same number of times and creates an additional 1,000,000 StringBuilders. The GC will more than likely step in during the second test and hinder its timing.
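A minimal sketch of the "perform a GC between measurements" variant, for anyone who wants both tests in one process. The helper name TimeMilliseconds and the sizes used in Main are assumptions for illustration; separate process runs are still the cleaner option.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical helper class; the names are illustrative only.
public static class IsolatedTiming
{
    // Times `iterations` calls of `action`, collecting garbage left over
    // from earlier tests before the stopwatch starts.
    public static long TimeMilliseconds(Action action, int iterations)
    {
        action(); // warm-up call so JIT compilation is not measured

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        string string1 = new string('x', 1000);
        string string2 = new string('y', 1000);
        const int iterations = 100000;

        long appendMs = TimeMilliseconds(() =>
        {
            StringBuilder sb = new StringBuilder();
            sb.Append(string1);
            sb.Append("----");
            sb.Append(string2);
        }, iterations);

        long formatMs = TimeMilliseconds(() =>
        {
            StringBuilder sb = new StringBuilder();
            sb.AppendFormat("{0}----{1}", string1, string2);
        }, iterations);

        Console.WriteLine("Append: {0}ms, AppendFormat: {1}ms", appendMs, formatMs);
    }
}
```

The collection before each measurement only clears what earlier tests left behind; garbage produced during a test can still trigger collections inside its own timing window.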

casperOne
+1 Good point about needing more data. I think most of us fired off a definitive answer without considering the fact that different inputs would be significant.
Andrew Hare
+1 Great point. If the examples were changed to preallocate space for the strings via the Capacity property, then the difference would be only the formatting of the string.
Kent Boogaart
@casperOne: You can call GC.Collect() between runs of course. It won't be *quite* the same, but pretty close.
Jon Skeet
@Jon Skeet: Updated answer accordingly.
casperOne
Now that I've added the "AppendWithCapacity" test in my answer, I suspect that it would be hard to construct examples where that wouldn't be the fastest, except for empty strings.
Jon Skeet
@Jon Skeet: No need to do that, String.Concat does the same exact thing.
casperOne
@Jon Skeet: Updated to reflect String.Concat.
casperOne
@casperOne: No, String.Concat doesn't do the same *exact* thing - you end up with a string, not a StringBuilder. I think it's highly likely that in real code you're likely to want to do more appends before or afterwards. (cont)
Jon Skeet
The difference between "result is a string" and "result is a StringBuilder" is potentially very significant. I agree that if you just want a string you should just use a + "----" + b though.
Jon Skeet
@Jon Skeet: If having a StringBuilder to do more with it is the desired outcome (because it's passed around), I agree; but for a fixed set of known inputs (which is a very common case as well) String.Concat is the better choice.
casperOne
A: 

Option 1 is faster in your case, but it isn't a fair comparison. A fairer question is StringBuilder.AppendFormat() vs StringBuilder.Append(string.Format()), where the first is faster because it works directly with the internal char array.

Your second option is more readable though.
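For reference, here is roughly what the two forms in that comparison look like side by side. The class and method names are invented for this sketch. Both produce the same content; the second form first materializes an intermediate string, which is then copied into the builder.

```csharp
using System;
using System.Text;

// Hypothetical class for illustration only.
public static class FormatVariants
{
    // Formats directly into the builder's buffer.
    public static string ViaAppendFormat(string string1, string string2)
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendFormat("{0}----{1}", string1, string2);
        return sb.ToString();
    }

    // Builds a temporary string first, then copies it into the builder.
    public static string ViaAppendOfFormat(string string1, string string2)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append(string.Format("{0}----{1}", string1, string2));
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ViaAppendFormat("left", "right"));   // prints "left----right"
        Console.WriteLine(ViaAppendOfFormat("left", "right")); // prints "left----right"
    }
}
```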

Miha Markic
string.Format() creates a StringBuilder object internally, so StringBuilder.AppendFormat() is basically the same as string.Format()
Sergio
While this is (probably) true, there are two steps involved when doing Append(string.Format()) - first, the Format and then its content has to be copied to StringBuilder's content. When doing AppendFormat there is only one step.
Miha Markic
And that is why I never considered the "StringBuilder.Append(string.Format())" option in the first place ;)
Sergio
+9  A: 

casperOne is correct. Once you reach a certain threshold, the Append() method becomes slower than AppendFormat(). Here are the different lengths and elapsed ticks of 100,000 iterations of each method:

Length: 1

Append()       - 50900
AppendFormat() - 126826

Length: 1000

Append()       - 1241938
AppendFormat() - 1337396

Length: 10,000

Append()       - 12482051
AppendFormat() - 12740862

Length: 20,000

Append()       - 61029875
AppendFormat() - 60483914

When strings with a length near 20,000 are introduced, the AppendFormat() function will slightly outperform Append().

Why does this happen? See casperOne's answer.

Edit:

I reran each test individually under Release configuration and updated the results.

John Rasch
+1: Actually testing!
Richard
Could you post the code? I'd like to test with a preset capacity, but don't want to reinvent your wheel.
Jon Skeet
http://pastebin.com/m1d0c1b47
John Rasch
+8  A: 

casperOne is entirely accurate that it depends on the data. However, suppose you're writing this as a class library for 3rd parties to consume - which would you use?

One option would be to get the best of both worlds - work out how much data you're actually going to have to append, and then use StringBuilder.EnsureCapacity to make sure we only need a single buffer resize.
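As a sketch of that idea for a library method that receives an existing StringBuilder (the method name AppendPair is hypothetical), the required size can be computed up front and passed to EnsureCapacity before appending:

```csharp
using System;
using System.Text;

// Hypothetical library helper; only EnsureCapacity and Append are real APIs here.
public static class EnsureCapacityDemo
{
    public static void AppendPair(StringBuilder sb, string string1, string string2)
    {
        // Existing content plus both strings plus the 4-character separator.
        int needed = sb.Length + string1.Length + 4 + string2.Length;
        sb.EnsureCapacity(needed); // at most one growth step for this call

        sb.Append(string1);
        sb.Append("----");
        sb.Append(string2);
    }

    public static void Main()
    {
        StringBuilder sb = new StringBuilder("prefix:");
        AppendPair(sb, "abc", "xyz");
        Console.WriteLine(sb.ToString()); // prints "prefix:abc----xyz"
    }
}
```

When the method creates the StringBuilder itself, passing the computed size to the constructor achieves the same thing.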

If I weren't too bothered though, I'd use Append x3 - it seems "more likely" to be faster, as parsing the string format tokens on every call is clearly make-work.

Note that I've asked the BCL team for a sort of "cached formatter" which we could create using a format string and then re-use repeatedly. It's crazy that the framework has to parse the format string each time it's used.
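To make the "cached formatter" idea concrete, here is a deliberately simplified sketch. SimpleCachedFormat is a hypothetical type, not a BCL API, and it is an assumption about what such a thing could look like. It only understands plain {N} placeholders: no alignment, no format specifiers, no escaped braces, so it does not match String.Format's parsing rules.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical type: parses the format string once, then reuses the result.
public sealed class SimpleCachedFormat
{
    // literals[i] is the text before the i-th placeholder;
    // the last entry is the text after the final placeholder.
    private readonly List<string> literals = new List<string>();
    private readonly List<int> argIndexes = new List<int>();

    public SimpleCachedFormat(string format)
    {
        int pos = 0;
        while (true)
        {
            int open = format.IndexOf('{', pos);
            if (open < 0) break;
            int close = format.IndexOf('}', open + 1);
            if (close < 0) throw new FormatException("Unclosed placeholder.");

            literals.Add(format.Substring(pos, open - pos));
            // Throws FormatException for anything other than a plain index.
            argIndexes.Add(int.Parse(format.Substring(open + 1, close - open - 1)));
            pos = close + 1;
        }
        literals.Add(format.Substring(pos));
    }

    // Appends the formatted result without re-parsing the format string.
    public void AppendTo(StringBuilder sb, params object[] args)
    {
        for (int i = 0; i < argIndexes.Count; i++)
        {
            sb.Append(literals[i]);
            sb.Append(args[argIndexes[i]]);
        }
        sb.Append(literals[literals.Count - 1]);
    }
}

public static class SimpleCachedFormatDemo
{
    public static void Main()
    {
        SimpleCachedFormat format = new SimpleCachedFormat("{0}----{1}");
        StringBuilder sb = new StringBuilder();
        format.AppendTo(sb, "abc", "xyz");
        Console.WriteLine(sb.ToString()); // prints "abc----xyz"
    }
}
```

The parsing cost is paid once in the constructor; each AppendTo call is then just a sequence of plain Append calls.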

EDIT: Okay, I've edited John's code somewhat for flexibility and added an "AppendWithCapacity" which just works out the necessary capacity first. Here are the results for the different lengths - for length 1 I used 1,000,000 iterations; for all other lengths I used 100,000. (This was just to get sensible running times.) All times are in millis.

Unfortunately tables don't really work in SO. The lengths were 1, 1000, 10000, 20000

Times:

  • Append: 162, 475, 7997, 17970
  • AppendFormat: 392, 499, 8541, 18993
  • AppendWithCapacity: 139, 189, 1558, 3085

So as it happened, I never saw AppendFormat beat Append - but I did see AppendWithCapacity win by a very substantial margin.

Here's the full code:

using System;
using System.Diagnostics;
using System.Text;

public class StringBuilderTest
{            
    static void Append(string string1, string string2)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append(string1);
        sb.Append("----");
        sb.Append(string2);
    }

    static void AppendWithCapacity(string string1, string string2)
    {
        int capacity = string1.Length + string2.Length + 4;
        StringBuilder sb = new StringBuilder(capacity);
        sb.Append(string1);
        sb.Append("----");
        sb.Append(string2);
    }

    static void AppendFormat(string string1, string string2)
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendFormat("{0}----{1}", string1, string2);
    }

    static void Main(string[] args)
    {
        int size = int.Parse(args[0]);
        int iterations = int.Parse(args[1]);
        string method = args[2];

        Action<string,string> action;
        switch (method)
        {
            case "Append": action = Append; break;
            case "AppendWithCapacity": action = AppendWithCapacity; break;
            case "AppendFormat": action = AppendFormat; break;
            default: throw new ArgumentException();
        }

        string string1 = new string('x', size);
        string string2 = new string('y', size);

        // Make sure it's JITted
        action(string1, string2);
        GC.Collect();

        Stopwatch sw = Stopwatch.StartNew();
        for (int i=0; i < iterations; i++)
        {
            action(string1, string2);
        }
        sw.Stop();
        Console.WriteLine("Time: {0}ms", (int) sw.ElapsedMilliseconds);
    }
}
Jon Skeet
@Jon Skeet: I haven't looked at it, but might a compiled Regex be a solution to the cached formatter idea? I would hope it is smart enough to pre-allocate the output, and it would prevent parsing of the format string every time.
casperOne
The regex would only do parsing, not formatting as far as I can see. Btw, in the above when considering EnsureCapacity I'd ignored the fact that the code creates the StringBuilder itself. Just pass the required capacity into the constructor.
Jon Skeet
@Jon Skeet: It wouldn't do parsing, but that's half the issue no? If you are doing straight up replacement, it's not much of an issue. However, you don't get the effect of being able to store the format strings when using the Regex (especially if you want to format the arguments).
casperOne
@Jon - code here: http://pastebin.com/m1d0c1b47
John Rasch
@casperOne: I still don't see how you'd do the replacement. But without the ability to do format strings, it's really not equivalent. A StringFormatter type would definitely be handy.
Jon Skeet
@Jon: I remember you mentioned your 'cached format' idea a while back, and I've been thinking about implementing my own. Of course, to be really useful it would need to _exactly_ match the normal String.Format() parsing, and I can't use reflector at work :(
Joel Coehoorn
@Joel: Yes, that's the tricky bit - getting it exactly right. It's definitely something that the BCL team should be working on. Fortunately one of the BCL team read that blog post and liked the idea, so maybe one day...
Jon Skeet
@Jon Skeet: Your "AppendWithCapacity" is really just "String.Concat", so no need for that extra messy code.
casperOne
@casperOne: I was assuming that the point was to end up with a StringBuilder though, so we could append more to it. I suspect in real life the StringBuilder is passed in with some data in it already, in which case EnsureCapacity() is the equivalent answer, basically. String.Concat doesn't help then.
Jon Skeet
+1  A: 

StringBuilder also has cascaded appends: Append() returns the StringBuilder itself, so you can write your code like this:

StringBuilder sb = new StringBuilder();
sb.Append(string1)
  .Append("----")
  .Append(string2);

Clean, and it generates less IL-code (although that's really a micro-optimization).

Tommy Carlier