In .NET, strings are immutable reference types. This often comes as a surprise to newer .NET developers, who may mistake them for value types because of how they behave. However, other than the practice of using StringBuilder for long concatenations, especially in loops, is there any practical reason one needs to know this distinction?

What real-world scenarios are helped or avoided by understanding the value-reference distinction with regard to .NET strings vs. just pretending/misunderstanding them to be value types?

+1  A: 

String is a special breed. It is a reference type, yet most coders use it as a value type. By making it immutable and using the intern pool, it optimizes memory usage which will be huge if it's a pure value type.
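
As a rough illustration of the intern pool mentioned above, here is a minimal sketch. The class and method names (`InternDemo`, `BuildAtRunTime`, `SameObject`) are made up for this example. It only relies on the documented behaviour of `String.Intern`: equal strings built at run time are separate objects until they are interned, after which they share one pooled instance.

```csharp
using System;

public static class InternDemo
{
    // Hypothetical helper: builds a fresh string object at run time.
    public static string BuildAtRunTime() => new string(new[] { 'h', 'e', 'l', 'l', 'o' });

    // True only when both variables point at the very same object.
    public static bool SameObject(string x, string y) => ReferenceEquals(x, y);

    public static void Main()
    {
        string c1 = BuildAtRunTime();
        string c2 = BuildAtRunTime();

        Console.WriteLine(c1 == c2);              // True: == compares the characters
        Console.WriteLine(SameObject(c1, c2));    // False: two separate objects on the heap

        // Intern returns the single pooled instance for this character sequence.
        Console.WriteLine(SameObject(string.Intern(c1), string.Intern(c2)));  // True
    }
}
```

Interning is opt-in for strings created at run time, which is why it saves memory only in the occasional cases where it is applied deliberately.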

More readings here:
C# .NET String object is really by reference? on SO
String.Intern Method on MSDN
string (C# Reference) on MSDN

Update:
Please refer to Abel's comment on this post, which corrects my misleading statement.

o.k.w
*optimizes memory usage which will be huge if it's a pure value type.*? I'm afraid it is rather the opposite. Strings are not optimized, and certainly not for memory usage. String.Intern is only used occasionally, when the string is considered constant (not merely *immutable*, but never assigned another value). In all other cases, every operation that changes a string allocates a new string and memcopies the characters, which is bad for performance.
Abel
@Abel: You are right. I shall still leave my answer here so others can read your helpful comment. Unless of course you choose to answer :)
o.k.w
Can do both, just decided to elaborate a bit :)
Abel
+3  A: 

The only distinction that really matters for most code is the fact that null can be assigned to string variables.
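
A small sketch of what that means in practice. The names `NullStringDemo` and `SafeLength` are invented for this example, and it assumes a project without nullable reference type warnings treated as errors (with the nullable context enabled, the compiler warns on the `null` assignment but still compiles it).

```csharp
using System;

public static class NullStringDemo
{
    // Hypothetical helper: length of the text, treating null as "no text at all".
    public static int SafeLength(string text) => text?.Length ?? 0;

    public static void Main()
    {
        string s = null;      // legal: string is a reference type
        // int i = null;      // would not compile: int is a value type

        Console.WriteLine(default(string) == null);   // True
        Console.WriteLine(default(int));              // 0
        Console.WriteLine(SafeLength(s));             // 0
        Console.WriteLine(string.IsNullOrEmpty(s));   // True
        // Console.WriteLine(s.Length);               // would throw NullReferenceException
    }
}
```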

recursive
+17  A: 
Abel
I must say, very well worded/structured/formatted and concise answer :)
o.k.w
concise?? lol, but thanks ;). I'll leave out the *"real story behind strings actually being mutable, contrary to MS's claims"*, unless someone asks...
Abel
@Abel: How about I ask, you answer? That will be great to read. :P
o.k.w
Great answer! Thank you much
Dinah
Abel
The turning point in concatenation performance **shocks** me. I would have suspected it to be two orders of magnitude lower: at around 4, not 400 concatenations. Seeing these profiling results, I can’t help wondering how `StringBuilder` ended up so inefficient.
Konrad Rudolph
It shocks many people, and many people don't even believe it. The turning point can vary widely. When concatenating large strings with `+`, the turning point may come only after several thousand concatenations; with very small strings, it can come after 50 or so. Mixing in replace, regex, or `Insert` makes both perform about even. The structure of the code, the use of const or dynamic strings, etc. can have another large influence.
Abel
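
For anyone who wants to look for the turning point discussed in these comments on their own machine, a crude measurement sketch follows. The names `ConcatTiming`, `WithPlus` and `WithBuilder`, the piece `"abc"` and the counts are arbitrary choices for this example. It times a single unwarmed run per case, so JIT and GC noise will affect the numbers; treat the output as a hint, not as a benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ConcatTiming
{
    public static string WithPlus(string piece, int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
            result += piece;        // allocates a new string on every iteration
        return result;
    }

    public static string WithBuilder(string piece, int count)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
            sb.Append(piece);       // appends into the builder's internal buffer
        return sb.ToString();
    }

    public static void Main()
    {
        foreach (int count in new[] { 4, 50, 400, 5000 })
        {
            var sw = Stopwatch.StartNew();
            WithPlus("abc", count);
            long plusTicks = sw.ElapsedTicks;

            sw.Restart();
            WithBuilder("abc", count);
            long builderTicks = sw.ElapsedTicks;

            Console.WriteLine($"{count,5} pieces: + took {plusTicks} ticks, StringBuilder took {builderTicks} ticks");
        }
    }
}
```

Varying the piece length as well as the count should show the shift described above: longer pieces and more iterations favour `StringBuilder`.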
+3  A: 

An immutable class acts like a value type in all common situations, and you can do quite a lot of programming without caring much about the difference.

It's when you dig a little deeper and care about performance that you have real use for the distinction. For example, it helps to know that although passing a string as a parameter to a method acts as if a copy of the string were created, no copying actually takes place. This might surprise people used to languages where strings actually are value types (like VB6?), in which passing a lot of strings as parameters would not be good for performance.
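
A minimal sketch of that point, with invented names (`PassingDemo`, `Echo`): the method hands back the parameter it received, and the caller checks that it is still the very same object, however large the string is.

```csharp
using System;

public static class PassingDemo
{
    // Hypothetical helper: returns exactly the reference it was given.
    public static string Echo(string parameter) => parameter;

    public static void Main()
    {
        string big = new string('x', 1000000);   // one million characters

        // Only the reference crosses the call boundary; the characters are not copied.
        Console.WriteLine(ReferenceEquals(big, Echo(big)));   // True
    }
}
```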

Guffa
Actually, C# (or better: .NET) behaves as if it used *"copy-on-write"* for strings. If you pass a string as a parameter, a reference is indeed passed, but because strings are immutable, any attempt to change it creates a new string, which is assigned to the local parameter variable. The caller's string is left untouched. Use `ref` on the string parameter if you want the caller's variable to change as well.
Abel
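
The difference between the two cases in the comment above can be sketched like this. The names `RefDemo`, `Reassign` and `ReassignByRef` are made up for the example.

```csharp
using System;

public static class RefDemo
{
    public static void Reassign(string text)
    {
        text = text + " (changed)";   // builds a new string; only the local parameter is rebound
    }

    public static void ReassignByRef(ref string text)
    {
        text = text + " (changed)";   // rebinds the caller's variable as well
    }

    public static void Main()
    {
        string s = "original";

        Reassign(s);
        Console.WriteLine(s);         // original

        ReassignByRef(ref s);
        Console.WriteLine(s);         // original (changed)
    }
}
```

In both methods the original string object itself is never modified; `ref` only decides whether the caller's variable ends up pointing at the new string.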