views:

712

answers:

11

If I am building a string using a StringBuilder object in a method, would it make sense to:

Return the StringBuilder object, and let the calling code call ToString()?

return sb;

OR Return the string by calling ToString() myself.

return sb.ToString();

I guess it make a difference if we're returning small, or large strings. What would be appropriate in each case? Thanks in advance.

Edit: I don't plan on further modifying the string in the calling code, but good point Colin Burnett.

Mainly, is it more efficient to return the StringBuilder object, or the string? Would a reference to the string get returned, or a copy?

+15  A: 

Return the StringBuilder if you're going to further modify the string, otherwise return the string. This is an API question.

Regarding efficiency. Since this is a vague/general question without any specifics then I think mutable vs. immutable is more important than performance. Mutability is an API issue of letting your API return modifiable objects. String length is irrelevant to this.

That said. If you look at StringBuilder.ToString with Reflector:

public override string ToString()
{
    string stringValue = this.m_StringValue;
    if (this.m_currentThread != Thread.InternalGetCurrentThread())
    {
        return string.InternalCopy(stringValue);
    }
    if ((2 * stringValue.Length) < stringValue.ArrayLength)
    {
        return string.InternalCopy(stringValue);
    }
    stringValue.ClearPostNullChar();
    this.m_currentThread = IntPtr.Zero;
    return stringValue;
}

You can see it may make a copy but if you modify it with the StringBuilder then it will make a copy then (this is what I can tell the point of m_currentThread is because Append checks this and will copy it if it mismatches the current thread).

I guess the end of this is that if you do not modify the StringBuilder then you do not copy the string and length is irrelevant to efficiency (unless you hit that 2nd if).

UPDATE

System.String is a class which means it is a reference type (as opposed to value type) so "string foo;" is essentially a pointer. (When you pass a string into a method it passes the pointer, not a copy.) System.String is mutable inside mscorlib but immutable outside of it which is how StringBuilder can manipulate a string.

So when ToString() is called it returns its internal string object by reference. At this point you cannot modify it because your code is not in mscorlib. By setting the m_currentThread field to zero then any further operations on the StringBuilder will cause it to copy the string object so it can be modified and not modify the string object it returned in ToString(). Consider this:

StringBuilder sb = new StringBuilder();
sb.Append("Hello ");

string foo = sb.ToString();

sb.Append("World");

string bar = sb.ToString();

If StringBuilder did not make a copy then at the end foo would be "Hello World" because the StringBuilder modified it. But since it did make a copy then foo is still just "Hello " and bar is "Hello World".

Does that clarify the whole return/reference thing?

Colin Burnett
@SkippyFire is asking about efficiency.
bruno conde
-1. If your going to further modify the string it should be further modified within the method. What if this method is called from multiple places and the string manipulation logic changes? It would not be good to have update the logic in multiple places
Nick Allen - Tungle139
Nick, you're splitting hairs. If this is in a class and a private method you may very well pass the StringBuilder to many methods to build up the final string. You could do the same thing where each method returns a string and concatenates them. SkippyFire was not specific at all in how this is used.It's pretty simple: if you need mutable then return mutable, if you don't need mutable then don't return mutable.
Colin Burnett
Bruno, and how does my answer not address efficiency. Again: if you need mutable then return mutable. If you don't then don't return mutable.
Colin Burnett
In his defense, it would depend on how you're using it. If this method creates the string from scratch, like building base SQL code, then it might make sense to allow the calling code to easily append to it, say an order by clause. However, let's say that you're STARTING with a String Builder being passed in, then it might not make sense? I think that totally depends on the context. Thanks for the comment though!
SkippyFire
Reflector is awesome. It directly answers your question about efficiency by showing that if StringBuilder is not modified then you won't be making a copy (assuming you haven't truncated it and hit that 2nd if).
Colin Burnett
I think you're making this more complicated than it needs to be? So... returning the StringBuilder object returns a reference, which should be fast and easy. But what happens if you call .ToString() in the return statement? What exactly gets returned, and how? I think I'm having a little trouble understanding how the return "system" works exactly.So does the actual string get returned, or does a copy of the string get returned? Like when you pass a string into a method as a parameter, passing a string will create a copy of the string for use in that method, right?
SkippyFire
SkippyFire, see update. Does that clarify anything?
Colin Burnett
@SkippyFire, returning a string will return a reference to a string in the heap, not a copy. They're immutable, however, so modifying the string will result in an entirely new string being created on the heap, and a new reference created. You should be able to safely treat them like value types, even though they're not. System.String is implemented to give you the best of both worlds.
Michael Meadows
Ah I see now. I understood how strings work, just not how returning them worked. It makes total sense now. Thanks!
SkippyFire
+5  A: 

I don't think performance should be a factor in this question. Either way someone is going to call sb.ToString() so your going to take the hit someplace.

The more important question is what is the intention of the method and the purpose. If this method is part of a builder you might return the string builder. Otherwise I would return a string.

If this is part of a public API I would lean towards returning a string instead of the builder.

JoshBerke
The efficiency question, then, is if you are going to modify the string more then it would be better to have it already in a mutable form (i.e., StringBuilder) to avoid copying into a new StringBuilder to modify it. Just because somewhere down the line you will call ToString doesn't mean you don't want to avoid intermediate copying.
Colin Burnett
@Colin Burnett, in terms of performance, if mutability is required, returning StringBuilder is a practical solution, but it is not a good one. It is preferable to rewrite the caller to support a pattern that builds the object completely and defers the building of the string to a single call. There are two drawbacks to returning StringBuilder, first, it couples you to an implementation detail. Second, it shies away from enabling a good OOP approach, which will hinder the solution's viability as an API (and lock you to a procedural implementation).
Michael Meadows
Michael, my only point in that comment was that "someone is going to call sb.ToString()" is missing the point of intermediate copying which will affect performance. Sometimes, yes, you may well want to take the hit for sake of having a better API or be motivated to find a solution other than "string or StringBuilder".
Colin Burnett
@Colin, I would argue that you'll find it very rare where strings cause performance problems, and certainly you should not let that fear push you prematurely to a design that violates the principles of object oriented design (high cohesion and low coupling). Wait for the profiler to tell you it's a problem, then cross the performance bridge.
Michael Meadows
Michael, I understand that, but that doesn't preclude the idea being explored, does it?
Colin Burnett
@Colin, like I said above, it's a practical solution, but you should be very careful to limit the impact of that practical decision. High coupling (if unmitigated) is like malignant cancer; if you're not careful, it can spread out and create maintainability problems throughout your system.
Michael Meadows
+2  A: 

I would say the method should return sb.ToString(). If the logic surrounding the creation of the StringBuilder() object should change in the future it makes sense to me that it be changed in the method not in each scenario which calls the method and then goes on to do something else

Nick Allen - Tungle139
A: 

If you need to append more stuff to the string and use other stringbuilder related functionality, return the stringbuilder. Otherwise if you're just using the string itself, return the string.

There are other more technical considerations, but thats the highest level concerns.

Allen
+1  A: 

It depends on what you are planning on doing with the output. I would return a string personally. That way if you need to change the method down the road to not use a stringbuilder, you can as you won't be stuck with that as a return value.

Upon thinking about it for the moment, the answer is much clearer. The question of asking which should be returned really answers the question. The return object should be a string. The reason is that if you are asking the question, "Is there a reason to return the StringBuilder object when a string will do?" then the answer is no. If there was a reason, then returning string would be out of the question, because the methods and properties of the stringbuilder are needed.

Kevin
+1  A: 

I think it depends what you are doing with the string once it leaves the method. If you are going to continue appending to it then you might want to consider returning a stringbuilder for greater efficiency. If you are always going to call .ToString() on it then you should do that inside the method for better encapsulation.

Steve Willcock
+1  A: 

I would return a string in almost all situations, especially if the method is part of a public API.

An exception would be if your method is just one part of a larger, private "builder" process and the calling code will be doing further manipulations. In that case then I'd maybe consider returning a StringBuilder.

LukeH
+1  A: 

Since your not going to modify it anymore

return sb.ToString();

should be most efficient

Fredrik Leijon
+1  A: 

Return the sb.ToString(). Your method should concentrate on just the thing in hand (In this case build me a string) and not be returned to be further manipulated IMO, you could get into all sorts of problems with it not being disposed.

Geoff Davis
+3  A: 

StringBuilder is an implementation detail of your method. You should return string until it becomes a performance problem, at which point you should explore another pattern (like visitor Pattern) that can help you introduce indirection and protect you from the internal implementation decisions.

Strings are always stored in the heap, so you will have a reference returned if the return type is string. You can't count, however, on two identical strings to have identical references. In general, it's safe to think of a string as if it were a value type even though it is actually a reference type.

Michael Meadows
+1 for implementation detail
Patrick McDonald
A: 

The method was given a concrete task and it should be expected to complete it and return the finished result that does not require further processing. Only return StringBuilder when you really need it. In that case also add something to the method name to indicate that you are returning something special.

User