views:

232

answers:

12

When I need to join two strings I use String.Format (or StringBuilder if it happens in several places in the code).

I see that some good programmers doesn't give attention to strings joining complexity and just use the '+' operator.

I know that using the '+' operator make the application to use more memory, but what about complexity?

+1  A: 

Depends on the situation. The + can sometimes reduce the complexity of the code. Consider the following code:

output = "<p>" + intro + "</p>";

That is a good, clear line. No String.Format required.

Colin Pickard
and I'd still rather have it as output = string.Format("<p>{0}</p>", intro) in case the output has to change. I'm not saying it isn't a good, clear line, but string.Format defends you from future changes a little more.
dnord
+3  A: 

This answer assumes you are talking about the runtime complexity.

Using + creates a new string object, which means the contents of both the old string objects must be copied into the new one. With a large amount of concatenation, such as in a tight loop, this can turn into an O(n^2) operation.

As an informal proof, say you had the following code:

string foo = "a";
for(int i = 0; i < 1000; i++)
{
    foo += "a";
}

The first iteration of the loop, first the contents of foo ("a") are copied into a new string object, then the contents of the literal "a". That's two copies. The second iteration has three copies; two from the new foo, and one from the literal "a". The 1000th iteration will have 1001 copy operations. The total number of copies is 2 + 3 + ... + 1001. In general, if in a loop you are only concatenating one character each iteration (and you start at one character long), if the number of iterations is n, there will be 2 + 3 + ... + n + 1 copies. That's the same as 1 + 2 + 3 + ... + n = n(n+1)/2 = (n^2 + n)/2, which is O(n^2).

Sean Nyman
+1  A: 

If you use + only once, you have no disadvantage from it and it increases readability (as Colin Pickard already stated).

As far as I know + means: take left operand and right operand and copy them into a new buffer (as strings are immutable).

So using + two times (as in Colin Pickards example you already create 2 temporary strings. First when adding "<p>" to intro and then when adding "</p>" to the newly created string.

You have to consider for yourself when to use which method. Even for a small example like seen above performance drop can be serious if intro is a large enough string.

StampedeXV
A: 

I think in terms of complexity you trade reiteration of newly created strings for parsing format string. For axample "A" + "B" + "C" + "D" means that you would have to copy "A", "AB", and at last "ABC" in order to form "ABCD". Copying is reiteration, right? So if for example you have a 1000 character string that you will sum with thousand one character strings you will copy (1000+N) character strings 1000 times. It leads to O(n^2) complexity in worst cases.

Strin.Fomat, even considering parsing, and StringBuffer should be around O(n).

Ivan
Wrong if this is a literal...
ck
A: 

Compiler will optimize: "a" + "b" + "c" to be replaced with String.Concat method (not String.Format one as fixed me comments)

Kamarey
`String.Concat` actually.
dtb
agree
Kamarey
@kamarey if you agree with dtb's comment why don't you edit your answer or delete it?
JohannesH
+1  A: 

StringBuilder should be used if you are building a large string in serveral steps. It also is a good thing if you know about how large it will be eventually, then you can initialize it with the size you need, and prevent costing re-allocations. For small operations it will not be considerable performance loss using the + operator, and it will result in clearer code (and faster to write...)

awe
+1  A: 

Unless your application is very string-intensive (profile, profile, profile!), this doesn't really matter. Good programmers put readability above performance for mundane operations.

fbonnet
+8  A: 
Robert Cartaino
This is a good read. It made me stop caring about how strings were done. Remember don't optimize until you have too.
David Basarab
The question was about complexity. Jeff's article is about performance.
Colin Pickard
A: 

Because strings are immutable in languages like Java and C#, everytime two strings are concatenated a new string has to be created, in which the contents of the two old strings are copied.

Assume strings which are on average c characters long.

Now the first concatenation only has to copy 2*c characters, but the last one has to copy the concatenation of the first n-1 strings, which is (n-1)*c characters long, and the last one itself, which is c characters long, for a total of n*c characters. For n concatenations this makes n^2*c/2 character copies, which means an algorithmic complexity of O(n^2).

In most cases in practice however this quadratic complexity will not be noticeable (as Jeff Atwood shows in the blog entry linked to by Robert C. Cartaino) and I'd advise to just write the code as readable as possible.

There are cases however when it does matter, and using O(n^2) in such cases may be deadly.

In practice I've seen this for example for generating big Word XML files in memory, including base64 encoded pictures. This generation used to take over 10 minutes due to using O(n^2) string concatenation. After I replaced concatenation using + with StringBuilder the running time for the same document reduced below 10 seconds.

Similarly I've seen a piece of software that generated an epically big piece of SQL code as a string using + for concatenation. I haven't even waited till this finished (had been waiting for over an hour already), but just rewrote it using StringBuilder. This faster version finished within a minute.

In short, just do whatever is most readable / easiest to write and only think about this when you'll be creating a freaking huge string :-)

Tobi
A: 

Plenty of input already, but I've always felt that the best way to approach the issue of performance is to understand the performance differences of all viable solutions and for those that meet performance requirements, pick the one that is the most reliable and the most supportable.

There are many who use Big O notation to understand complexity, but I've found that in most cases (including understanding which string concatenation methods work best), a simple time trial will suffice. Just compare strA+strB to strA.Append(strB) in a loop of 100,000 iterations to see which works faster.

Mayo
A: 

The compiler optimizes string literal concatenation into one string literal. For example:

string s = "a" + "b" + "c";

is optimized to the following at compile time:

string s = "abc";

See this question and this MSDN article for more information.

John Berberich
A: 

I benchmarked this forever ago, and it hasn't really made a differences since .NET 1.0 or 1.1.

Back then if you had some process that was going to hit a line of code that was concatinating strings a few million times you could get a huge speed increase by using String.Concat, String.Format, or StringBuilder.

Now it doesn't matter at all. At least it hasn't mattered since .Net 2.0 came out anyhow. Put it out of your mind and code in whatever manner makes it easiest for you to read.

blesh