StringBuilder has a reputation as being a faster string manipulation tool than simply concatenating strings. Whether or not that's true, I'm left wondering about the results of StringBuilder operations and the strings they produce.

A quick jaunt into Reflector shows that StringBuilder.ToString() doesn't always return a copy; sometimes it seems to return an instance of the internal string. It also seems to use some internal functions to manipulate the internal strings.

So what do I get if I do this?

string s = "Yo Ho Ho";
StringBuilder sb = new StringBuilder(s);
string newString = sb.ToString();
sb.Append(" and a bottle of rum.");
string newNewString = sb.ToString();

Are newString and newNewString different string instances or the same? I've tried to figure this out via Reflector, and I'm just not quite understanding everything.

How about this code?

StringBuilder sb = new StringBuilder("Foo\n");
StringReader sr = new StringReader(sb.ToString());
string s = sr.ReadLine();
sb.Append("Bar\n");
s = sr.ReadLine();

Will the last statement return null or "Bar"? And if it returns one or the other, is this defined or undefined behavior? In other words, can I rely on it?

The documentation is remarkably terse on this subject, and I'm reluctant to rely on observed behavior over specification.

+10  A: 

Outside of mscorlib, any instance of a System.String is immutable, period.

StringBuilder does some interesting manipulation of Strings internally but at the end of the day it won't return a string to you and then subsequently mutate it in a way that is visible to your code.

As to whether subsequent calls to StringBuilder.ToString() return the same instance of a String or a different String with the same value, that is implementation dependent and you should not rely on this behavior.

JaredPar
Yes, experimentation has shown this to be true, but spelunking in the internals with Reflector didn't show an obvious reason why. It appears to return a reference to the internal string on each call, thus my confusion as to whether the string could mutate.
Mystere Man
@Mystere, the implementation of StringBuilder is a bit daunting at first. Until I read your question and started exploring, I didn't realize they did thread-level caching. I plan on taking a longer look later tonight (or tomorrow). But you can trust in the resulting String being immutable.
JaredPar
+4  A: 

newString and newNewString are different string instances.

Although ToString() returns the current string, it clears its current thread variable. That means next time you append, it will take a copy of the current string before appending.
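As a minimal sketch of what the calling code can observe (not a view into StringBuilder's internals), the question's first snippet can be wrapped in a small program. The class name ToStringSnapshotDemo is made up for illustration; everything else uses the variables from the question.

```csharp
using System;
using System.Text;

// Hypothetical demo class; the body is the snippet from the question.
class ToStringSnapshotDemo
{
    static void Main()
    {
        StringBuilder sb = new StringBuilder("Yo Ho Ho");
        string newString = sb.ToString();

        // Appending after ToString() must not change the string already handed out.
        sb.Append(" and a bottle of rum.");
        string newNewString = sb.ToString();

        Console.WriteLine(newString);     // Yo Ho Ho
        Console.WriteLine(newNewString);  // Yo Ho Ho and a bottle of rum.

        // The two values differ, so they cannot be the same instance.
        Console.WriteLine(object.ReferenceEquals(newString, newNewString)); // False
    }
}
```

The check only shows the visible outcome: the first string keeps its value. How the builder achieves that internally is the implementation detail described above.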

I'm not entirely sure what you're getting at in your second question, but s will be null: if the final characters in a file are the line termination character(s) for the previous line, the line is not deemed to have an empty line between those characters and the end of the file. The string which has been read previously makes no difference to this.

Jon Skeet
Ahh.. you explained the mystery to me. Clearing the thread variable is what causes the copy on append, how simple but non-obvious. My second question was a variation on the mutability question, relating to a reference stored in the StringReader.
Mystere Man
+2  A: 

The whole purpose of this class is to make strings mutable, so it actually is mutable. I believe (but am not sure) that it returns the same string that went into it only if nothing else has been done with the object. So after

String s_old = "Foo";
StringBuilder sb = new StringBuilder(s_old);
String s_new = sb.ToString();

s_old would be the same as s_new but it won't be in any other case.

I should note that the Java compiler automatically converts multiple string additions into operations on the StringBuilder class (or StringBuffer, which is similar but synchronized), and I would be really surprised if the .NET compiler didn't do this conversion as well.

vava
The C# compiler doesn't use StringBuilder for string concatenations - it uses String.Concat. That means the final length is known before any concatenation is performed.
Jon Skeet
+3  A: 

Are newString and newNewString different string instances or the same? I've tried to figure this out via reflector, and I'm just not quite understanding everything.

They are different string instances: newString is "Yo Ho Ho" and newNewString is "Yo Ho Ho and a bottle of rum.". Strings are immutable, and when you call StringBuilder.ToString() the method returns an immutable string that represents the builder's state at that moment.

Will the last statement return null or "Bar"? And if it returns one or the other, is this defined or undefined behavior? In other words, can I rely on it?

It will return null. The StringReader is working on the immutable string you passed to it at the constructor, so it is not affected by whatever you do to the StringBuilder.
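A small sketch of the second snippet, in case it helps to see it run. The class name StringReaderDemo and the variable names first and second are made up for illustration; the calls are the ones from the question.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical demo class; the body follows the question's second snippet.
class StringReaderDemo
{
    static void Main()
    {
        StringBuilder sb = new StringBuilder("Foo\n");

        // The reader holds the string returned here, not the builder.
        StringReader sr = new StringReader(sb.ToString());
        string first = sr.ReadLine();

        // This changes the builder only; the reader's string is untouched.
        sb.Append("Bar\n");
        string second = sr.ReadLine();

        Console.WriteLine(first);          // Foo
        Console.WriteLine(second == null); // True
    }
}
```

The reader has already consumed "Foo" and its line terminator, so the second ReadLine() finds nothing left in the original string and returns null.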

Zach Scrivena