ansaurus

Question

Why should I never use an unsafe block to modify a string?

Answer 1

+9 A:

The .NET Framework requires strings to be immutable, and because of this requirement it is able to optimise all sorts of operations.

String interning is one great example of how heavily this requirement is leveraged. To speed up some string comparisons (and reduce memory consumption), the .NET Framework maintains an intern table of string references. All string literals live in this table, as do any strings you pass to the String.Intern method. When the IL instruction ldstr executes, it checks the intern table and avoids a new allocation if the string is already there. Note that String.Concat does not check for interned strings.
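As a rough illustration of the interning behaviour described above, the sketch below compares object identity for literals, for a string built at run time, and for the result of String.Intern. The class name InternDemo and the helper BuildHello are hypothetical names made up for this example.

```csharp
using System;

static class InternDemo
{
    // Hypothetical helper: builds "hello" at run time, so the result is a
    // freshly allocated string rather than the interned literal.
    public static string BuildHello()
    {
        return new string(new[] { 'h', 'e', 'l', 'l', 'o' });
    }

    static void Main()
    {
        // Identical literals in the same assembly refer to one string object.
        string a = "hello";
        string b = "hello";
        Console.WriteLine(object.ReferenceEquals(a, b)); // True

        // Equal contents, but a separate object on the heap.
        string c = BuildHello();
        Console.WriteLine(a == c);                       // True
        Console.WriteLine(object.ReferenceEquals(a, c)); // False

        // String.Intern returns the pooled instance for these contents,
        // which is normally the same object the literal refers to.
        string d = string.Intern(c);
        Console.WriteLine(object.ReferenceEquals(a, d));
    }
}
```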

This property of the .NET Framework means that if you start mucking around directly with strings, you can corrupt the intern table and, in turn, corrupt every other reference to the same string.

For example:

        // these strings get interned
        string hello = "hello";
        string hello2 = "hello";

        string helloworld, helloworld2;

        helloworld = hello;
        helloworld += " world";

        helloworld2 = hello;
        helloworld2 += " world"; 

        unsafe
        {
            // very bad, this changes an interned string which affects 
            // all app domains.
            fixed (char* str = hello2)
            {
                *str = 'X';
            }

            fixed (char* str = helloworld2)
            {
                *str = 'X';
            }

        }

        Console.WriteLine("hello = {0} , hello2 = {1}", hello, hello2);
        // output: hello = Xello , hello2 = Xello  


        Console.WriteLine("helloworld = {0} , helloworld2 = {1}", helloworld, helloworld2);
        // output : helloworld = hello world , helloworld2 = Xello world

Sam Saffron 2008-10-23 11:15:05

Answer 2

+11 A:

Are there any reasons why I should never ever do this?

Yes, very simple: .NET relies on the fact that strings are immutable. Some operations (e.g. s.Substring(0, s.Length)) actually return a reference to the original string. If that string is then modified, every other reference to it sees the change as well.

It is better to use a StringBuilder to modify a string, since that is the standard way.
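A minimal sketch of the StringBuilder route, assuming the goal is the same single-character change as in the unsafe example. The class SafeEdit and the method ReplaceFirstChar are hypothetical names used only for illustration.

```csharp
using System;
using System.Text;

static class SafeEdit
{
    // Hypothetical helper: returns a copy of s with its first character replaced.
    // The input string itself is never modified.
    public static string ReplaceFirstChar(string s, char replacement)
    {
        if (string.IsNullOrEmpty(s))
        {
            return s;
        }

        var builder = new StringBuilder(s); // copies the characters into a mutable buffer
        builder[0] = replacement;           // edit the buffer, not the original string
        return builder.ToString();          // produces a new string
    }

    static void Main()
    {
        string hello = "hello";
        string changed = ReplaceFirstChar(hello, 'X');

        Console.WriteLine(changed); // Xello
        Console.WriteLine(hello);   // hello
    }
}
```

Because the edit happens in the builder's own buffer, other references to the original string are unaffected.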

Konrad Rudolph 2008-10-23 11:16:11

Answer 3

A:

Oh dear lord yes.

1) Because that class is not designed to be tampered with.

2) Because strings are designed and expected throughout the framework to be immutable. That means the code everyone else writes (including Microsoft) expects a string's underlying value never to change.

3) Because this is premature optimization and that is E V I L.

Dave Markle 2008-10-23 11:18:10

Answer 4

+1 A:

Put it this way: how would you feel if another programmer decided to replace 0 with 1 everywhere in your code, at execution time? It would play hell with all your assumptions. The same is true with strings. Everyone expects them to be immutable, and writes code with that assumption in mind. If you violate that, you are likely to introduce bugs, and they'll be really hard to trace.

Jon Skeet 2008-10-23 11:35:36

I can just imagine them being used in a Dictionary, oh man that would play havoc!

leppie 2008-10-23 11:46:12

Answer 5

A:

Agreed about StringBuilder, or just convert your string to an array of chars/bytes and work there. Also, you gave the example of "upcasing" -- the String class has a ToUpper method, and if that's not at least as fast as your unsafe "upcasing", I'll eat my hat.
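Both suggestions can be sketched roughly as follows. CharArrayDemo and CapitaliseFirst are hypothetical names, and ToUpperInvariant is used here as an assumed culture-independent stand-in for the ToUpper method mentioned above.

```csharp
using System;

static class CharArrayDemo
{
    // Hypothetical helper: upper-cases only the first character by working
    // on a char[] copy and building a new string from it.
    public static string CapitaliseFirst(string s)
    {
        if (string.IsNullOrEmpty(s))
        {
            return s;
        }

        char[] chars = s.ToCharArray();              // independent copy of the characters
        chars[0] = char.ToUpperInvariant(chars[0]);  // safe to mutate the copy
        return new string(chars);
    }

    static void Main()
    {
        string original = "hello";

        Console.WriteLine(CapitaliseFirst(original));   // Hello
        Console.WriteLine(original.ToUpperInvariant()); // HELLO
        Console.WriteLine(original);                    // hello
    }
}
```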

Coderer 2008-10-23 17:46:51
