views:

1929

answers:

8

My question is this: Is string concatenation in C# safe? If string concatenation leads to unexpected errors, and replacing that string concatenation by using StringBuilder causes those errors to disappear, what might that indicate?

Background: I am developing a small command line C# application. It takes command line arguments, performs a slightly complicated SQL query, and outputs about 1300 rows of data into a formatted XML file.

My initial program would always run fine in debug mode. However, in release mode it would get to about the 750th SQL result, and then die with an error. The error was that a certain column of data could not be read, even through the Read() method of the SqlDataReader object had just returned true.

This problem was fixed by using StringBuilder for all operations in the code, where previously there had been "string1 + string2". I'm not talking about string concatenation inside the SQL query loop, where StringBuilder was already in use. I'm talking about simple concatenations between two or three short string variables earlier in the code.

I had the impression that C# was smart enough to handle the memory management for adding a few strings together. Am I wrong? Or does this indicate some other sort of code problem?

+11  A: 

String concatenation is safe though more memory intensive than using a StringBuilder if contatenating large numbers of strings in a loop. And in extreme cases you could be running out of memory.

It's almost certainly a bug in your code.

Maybe you're contatenating a very large number of strings. Or maybe it's something else completely different.

I'd go back to debugging without any preconceptions of the root cause - if you're still having problems try to reduce it to the minimum needed to repro the problem and post code.

Joe
A: 

When compounding strings together I always use StringBuilder. It's designed for it and is more efficient that simply using "string1 + string2".

patjbs
Bad advice. string1+string2 is faster than StringBuilder in many situations: StringBuilder wins when doing a large number of concatenations in a loop.
Joe
Thanks for letting me know. I had read that it was faster in "C# via CLR", but if that's not the case, it's good to know.
patjbs
I just ran a number of tests and even concatanating two short strings via "str1 + str2" IS indeed slower, or at the best no faster, than utilizing StringBuilder. And several excellent blog articles I've since read tend to agree with that conclusion. While utilzing str1+str2 may be a handy convenience and not a significant performance impact on small string concatanation, StringBuilder is certainly no worse, and will win out as the number of concats grows (as little as 4 the difference is sig)
patjbs
+7  A: 

Apart from what you're doing is probably best done with XML APIs instead of strings or StringBuilder I doubt that the error you see is due to string concatenation. Maybe switching to StringBuilder just masked the error or went over it gracefully, but I doubt using strings really was the cause.

Joey
Thanks everyone for your responses. I still ended up seeing errors for certain data sets after all of my changes. I think the real root of the problem was in the SQL connection, and that "switching to StringBuilder just masked the error" as Johannes said.I fixed the problem by making use of an SQL wrapper class from another project. This class converts the entire SQL result set to a Dictionary object, so that there is no need to keep the result set and the SQL connection open.
Bryan Roach
+3  A: 

How long would it take the concatenation version vs the string builder version? It's possible that your connection to the DB is being closed. If you are doing a lot of concatenation, i would go w/ StringBuilder as it is a bit more efficient.

Darren Kopp
+1  A: 

One cause may be that strings are immutable in .Net so when you do an operation on one such as concatenation you are actually creating a new string.

Another possible cause is that string length is an int so the maximum possible length is Int32.MaxValue or 2,147,483,647.

In either case a StringBuilder is better than "string1 + string2" for this type of operation. Although, using the built-in XML capabilities would be even better.

Jeremy
A: 

Here is my shot in the dark...

Strings in .NET (not stringbuilders) go into the String Intern Pool. This is basically an area managed by the CLR to share strings to improve performance. There has to be some limit here, although I have no idea what that limit is. I imagine all the concatenation you are doing is hitting the ceiling of the string intern pool. So SQL says yes I have a value for you, but it can't put it anywhere so you get an exception.

A quick and easy test would be to nGen your assembly and see if you still get the error. After nGen'ing, you application no longer will use the pool.

If that fails, I'd contact Microsoft to try and get some hard details. I think my idea sounds plausible, but I have no idea why it works in debug mode. Perhaps in debug mode strings aren't interned. I am also no expert.

Bob
+5  A: 

To answer your question: String contatenation in C# (and .NET in general) is "safe", but doing it in a tight loop as you describe is likely to cause severe memory pressure and put strain on the garbage collector.

I would hazard a guess that the errors you speak of were related to resource exhaustion of some sort, but it would be helpful if you could provide more detail — for example, did you receive an exception? Did the application terminate abnormally?

Background: .NET strings are immutable, so when you do a concatenation like this:

var stringList = new List<string> {"aaa", "bbb", "ccc", "ddd", //... };
string result = String.Empty;
foreach (var s in stringList)
{
    result = result + s;
}

This is roughly equivalent to the following:

string result = "";
result = "aaa"
string temp1 = result + "bbb";
result = temp1;
string temp2 = temp1 + "ccc";
result = temp2;
string temp3 = temp2 + "ddd";
result = temp3;
// ...
result = tempN + x;

The purpose of this example is to emphasise that each time around the loop results in the allocation of a new temporary string.

Since the strings are immutable, the runtime has no alternative options but to allocate a new string each time you add another string to the end of your result.

Although the result string is constantly updated to point to the latest and greatest intermediate result, you are producing a lot of these un-named temporary string that become eligible for garbage collection almost immediately.

At the end of this concatenation you will have the following strings stored in memory (assuming, for simplicity, that the garbage collector has not yet run).

string a = "aaa";
string b = "bbb";
string c = "ccc";
// ...
string temp1 = "aaabbb";
string temp2 = "aaabbbccc";
string temp3 = "aaabbbcccddd";
string temp4 = "aaabbbcccdddeee";
string temp5 = "aaabbbcccdddeeefff";
string temp6 = "aaabbbcccdddeeefffggg";
// ...

Although all of these implicit temporary variables are eligible for garbage collection almost immediately, they still have to be allocated. When performing concatenation in a tight loop this is going to put a lot of strain on the garbage collector and, if nothing else, will make your code run very slowly. I have seen the performance impact of this first hand, and it becomes truly dramatic as your concatenated string becomes larger.

The recommended approach is to always use a StringBuilder if you are doing more than a few string concatenations. StringBuilder uses a mutable buffer to reduce the number of allocations that are necessary in building up your string.

Daniel Fortunov
Excellent answer :)
patjbs
I'm fairly sure a + b + c + d ... in a single statement doesn't generate the intermediate strings. It works more like String.Concat.
Joe
Joe: You're quite right. I went too far in simplifying the answer. I've added a loop to make it more factually correct now. Is that better?
Daniel Fortunov
Absolutely. The key point is that many repeated concatenations (e.g. in a loop) are much more efficient with StringBuilder.
Joe
A: 

string.Concat(string[]) is by far the fastest way to concatenate strings. It litterly kills StringBuilder in performance when used in loops, especially if you create the StringBuilder in each iteration. There are loads of references if you Google "c# string format vs stringbuilder" or something like that. http://www.codeproject.com/KB/cs/StringBuilder_vs_String.aspx gives you an ideer about the times. Here string.Join wins the concatenation test but I belive this is because the string.Concat(string, string) is used instead of the overloaded version that takes an array. If you take a look at the MSIL code that is generated by the different methods you'll see what going on beneath the hood.

Paw Baltzersen