How is the StringBuilder class implemented? Does it internally create new string objects each time we append?
Not really. It uses an internal character buffer and allocates a new buffer only when the current capacity is exhausted. An append simply adds to this buffer; a string object is created only when ToString() is called. That is why it is advisable when you do many concatenations, since each traditional string concatenation creates a new string. You can also pass an initial capacity to the StringBuilder if you have a rough idea of the final size, which avoids multiple allocations.
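As a rough illustration of the initial-capacity advice, here is a minimal sketch. The class and method names (CapacityDemo, JoinNumbers) are made up for this example; only the StringBuilder constructor, Append and ToString calls are real framework API.

```csharp
using System;
using System.Text;

public static class CapacityDemo
{
    // Builds "0,1,2,...," for the given count using a pre-sized builder.
    public static string JoinNumbers(int count, int initialCapacity)
    {
        // The constructor argument is only a starting capacity;
        // the builder still grows if the estimate turns out too small.
        var sb = new StringBuilder(initialCapacity);
        for (int i = 0; i < count; i++)
        {
            sb.Append(i).Append(',');   // written into the builder's buffer
        }
        // The resulting string object is created here, once.
        return sb.ToString();
    }
}
```

Calling CapacityDemo.JoinNumbers(1000, 4000) should need fewer internal allocations than starting from the default capacity, though the exact growth behaviour depends on the framework version.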
Edit: People are pointing out that my understanding is wrong. Please ignore the answer (I'd rather not delete it; it will stand as proof of my ignorance :-)).
In .NET 2.0 it uses the String class internally. String is only immutable to code outside the core library (its mutating methods are internal), so StringBuilder can do that.
In .NET 4.0 it was changed to char[].
In 2.0 StringBuilder looked like this:
public sealed class StringBuilder : ISerializable
{
    // Fields
    private const string CapacityField = "Capacity";
    internal const int DefaultCapacity = 0x10;
    internal IntPtr m_currentThread;
    internal int m_MaxCapacity;
    internal volatile string m_StringValue; // HERE: the buffer is a string
    private const string MaxCapacityField = "m_MaxCapacity";
    private const string StringValueField = "m_StringValue";
    private const string ThreadIDField = "m_currentThread";
    // ... (remaining members omitted)
}
But in 4.0 it looks like this:
public sealed class StringBuilder : ISerializable
{
    // Fields
    private const string CapacityField = "Capacity";
    internal const int DefaultCapacity = 0x10;
    internal char[] m_ChunkChars; // HERE: the buffer is a char[]
    internal int m_ChunkLength;
    internal int m_ChunkOffset;
    internal StringBuilder m_ChunkPrevious;
    internal int m_MaxCapacity;
    private const string MaxCapacityField = "m_MaxCapacity";
    internal const int MaxChunkSize = 0x1f40;
    private const string StringValueField = "m_StringValue";
    private const string ThreadIDField = "m_currentThread";
    // ... (remaining members omitted)
}
So evidently it was changed from using a string to using a char[].
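The 4.0 fields (m_ChunkChars, m_ChunkLength, m_ChunkOffset, m_ChunkPrevious) suggest a linked list of character chunks. The following is only a simplified sketch of that idea, not the framework's code: MiniChunkBuilder and its members are hypothetical names, and the growth rule is an assumption loosely based on the MaxChunkSize constant (0x1f40 = 8000) in the listing.

```csharp
using System;

// Hypothetical, simplified model of a chunked builder. Field names echo the
// decompiled listing, but this is NOT the real StringBuilder implementation.
public sealed class MiniChunkBuilder
{
    private const int MaxChunkSize = 8000;   // 0x1f40 in the listing

    private char[] chunkChars;               // storage of the newest chunk
    private int chunkLength;                 // used slots in chunkChars
    private int chunkOffset;                 // total length of all older chunks
    private MiniChunkBuilder chunkPrevious;  // older chunk, or null

    public MiniChunkBuilder(int capacity = 16)
    {
        chunkChars = new char[Math.Max(capacity, 1)];
    }

    public int Length { get { return chunkOffset + chunkLength; } }

    public MiniChunkBuilder Append(string value)
    {
        if (value == null) return this;
        foreach (char c in value)
        {
            if (chunkLength == chunkChars.Length) ExpandByOneChunk();
            chunkChars[chunkLength++] = c;
        }
        return this;
    }

    // Hands the full chunk over to a new "previous" node and starts a fresh
    // chunk, so already-appended characters are not copied when growing.
    private void ExpandByOneChunk()
    {
        int total = Length;
        var previous = new MiniChunkBuilder(1);  // its small array is replaced below
        previous.chunkChars = chunkChars;
        previous.chunkLength = chunkLength;
        previous.chunkOffset = chunkOffset;
        previous.chunkPrevious = chunkPrevious;

        chunkPrevious = previous;
        chunkOffset = total;
        chunkLength = 0;
        // Assumed growth rule: roughly double, capped at MaxChunkSize.
        chunkChars = new char[Math.Min(Math.Max(total, 16), MaxChunkSize)];
    }

    public override string ToString()
    {
        var result = new char[Length];
        // Walk from the newest chunk back to the oldest; each chunk knows
        // where its characters start in the final string.
        for (MiniChunkBuilder chunk = this; chunk != null; chunk = chunk.chunkPrevious)
        {
            Array.Copy(chunk.chunkChars, 0, result, chunk.chunkOffset, chunk.chunkLength);
        }
        return new string(result);
    }
}
```

The point of the design is that growing never copies existing characters; the only full copy happens once, in ToString().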
EDIT: Updated answer to reflect changes in .NET 4 (that I only just discovered).
Looking at the .NET 2 implementation in .NET Reflector, I find this:
public StringBuilder Append(string value)
{
    if (value != null)
    {
        string stringValue = this.m_StringValue;
        IntPtr currentThread = Thread.InternalGetCurrentThread();
        if (this.m_currentThread != currentThread)
        {
            stringValue = string.GetStringForStringBuilder(stringValue, stringValue.Capacity);
        }
        int length = stringValue.Length;
        int requiredLength = length + value.Length;
        if (this.NeedsAllocation(stringValue, requiredLength))
        {
            string newString = this.GetNewString(stringValue, requiredLength);
            newString.AppendInPlace(value, length);
            this.ReplaceString(currentThread, newString);
        }
        else
        {
            stringValue.AppendInPlace(value, length);
            this.ReplaceString(currentThread, stringValue);
        }
    }
    return this;
}
So it is a mutated string instance...
EDIT: Except in .NET 4, where it is a char[].
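User code cannot mutate a string in place the way the .NET 2 Append above does, but the allocate-or-append-in-place decision can be sketched with a char[] standing in for the internal string. MiniBufferBuilder is a hypothetical name, and the doubling growth rule is an assumption, not something taken from the decompiled code.

```csharp
using System;

// Hypothetical stand-in for the .NET 2 approach: one contiguous buffer that
// is appended to in place and replaced by a larger one only when it is full.
public sealed class MiniBufferBuilder
{
    private char[] buffer;
    private int length;

    public MiniBufferBuilder(int capacity = 16)
    {
        buffer = new char[Math.Max(capacity, 1)];
    }

    public int Capacity { get { return buffer.Length; } }
    public int Length { get { return length; } }

    public MiniBufferBuilder Append(string value)
    {
        if (value == null) return this;
        int requiredLength = length + value.Length;
        if (requiredLength > buffer.Length)
        {
            // Rough analogue of NeedsAllocation + GetNewString:
            // allocate a bigger buffer (doubling is assumed) and copy once.
            int newCapacity = Math.Max(requiredLength, buffer.Length * 2);
            var bigger = new char[newCapacity];
            Array.Copy(buffer, bigger, length);
            buffer = bigger;
        }
        // Rough analogue of AppendInPlace: write into the existing buffer.
        value.CopyTo(0, buffer, length, value.Length);
        length = requiredLength;
        return this;
    }

    public override string ToString()
    {
        return new string(buffer, 0, length);
    }
}
```

Appends that fit in the current buffer allocate nothing; only an append that overflows it pays for a new buffer and a copy.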
If you want to see one possible implementation (similar to the one shipped with the Microsoft implementation up to v3.5), you can look at the source of the Mono one on GitHub.