views:

326

answers:

8

Summary: C#/.NET is supposed to be garbage collected. C# has a destructor, used to clean resources. What happen when an object A is garbage collected the same line I try to clone one of its variable members? Apparently, on multiprocessors, sometimes, the garbage collector wins...

The problem

Today, on a training session on C#, the teacher showed us some code which contained a bug only when run on multiprocessors.

I'll summarize to say that sometimes, the compiler or the JIT screws up by calling the finalizer of a C# class object before returning from its called method.

The full code, given in Visual C++ 2005 documentation, will be posted as an "answer" to avoid making a very very large questions, but the essential are below:

The following class has a "Hash" property which will return a cloned copy of an internal array. At is construction, the first item of the array has a value of 2. In the destructor, its value is set to zero.

The point is: If you try to get the "Hash" property of "Example", you'll get a clean copy of the array, whose first item is still 2, as the object is being used (and as such, not being garbage collected/finalized):

public class Example
{
    private int nValue;
    public int N { get { return nValue; } }

    // The Hash property is slower because it clones an array. When
    // KeepAlive is not used, the finalizer sometimes runs before 
    // the Hash property value is read.

    private byte[] hashValue;
    public byte[] Hash { get { return (byte[])hashValue.Clone(); } }

    public Example()
    {
        nValue = 2;
        hashValue = new byte[20];
        hashValue[0] = 2;
    }

    ~Example()
    {
        nValue = 0;

        if (hashValue != null)
        {
            Array.Clear(hashValue, 0, hashValue.Length);
        }
    }
}

But nothing is so simple... The code using this class is wokring inside a thread, and of course, for the test, the app is heavily multithreaded:

public static void Main(string[] args)
{
    Thread t = new Thread(new ThreadStart(ThreadProc));
    t.Start();
    t.Join();
}

private static void ThreadProc()
{
    // running is a boolean which is always true until
    // the user press ENTER
    while (running) DoWork();
}

The DoWork static method is the code where the problem happens:

private static void DoWork()
{
    Example ex = new Example();

    byte[] res = ex.Hash; // [1]

    // If the finalizer runs before the call to the Hash 
    // property completes, the hashValue array might be
    // cleared before the property value is read. The 
    // following test detects that.

    if (res[0] != 2)
    {
        // Oops... The finalizer of ex was launched before
        // the Hash method/property completed
    }
}

Once every 1,000,000 excutions of DoWork, apparently, the Garbage Collector does its magic, and tries to reclaim "ex", as it is not anymore referenced in the remaning code of the function, and this time, it is faster than the "Hash" get method. So what we have in the end is a clone of a zero-ed byte array, instead of having the right one (with the 1st item at 2).

My guess is that there is inlining of the code, which essentially replaces the line marked [1] in the DoWork function by something like:

    // Supposed inlined processing
    byte[] res2 = ex.Hash2;
    // note that after this line, "ex" could be garbage collected,
    // but not res2
    byte[] res = (byte[])res2.Clone();

If we supposed Hash2 is a simple accessor coded like:

// Hash2 code:
public byte[] Hash2 { get { return (byte[])hashValue; } }

So, the question is: Is this supposed to work that way in C#/.NET, or could this be considered as a bug of either the compiler of the JIT?

edit

See Chris Brumme's and Chris Lyons' blogs for an explanation.

http://blogs.msdn.com/cbrumme/archive/2003/04/19/51365.aspx
http://blogs.msdn.com/clyon/archive/2004/09/21/232445.aspx

Everyone's answer was interesting, but I couldn't choose one better than the other. So I gave you all a +1...

Sorry

:-)

Edit 2

I was unable to reproduce the problem on Linux/Ubuntu/Mono, despite using the same code on the same conditions (multiple same executable running simultaneously, release mode, etc.)

A: 

The Full Code

You'll find below the full code, copy/pasted from a Visual C++ 2008 .cs file. As I'm now on Linux, and without any Mono compiler or knowledge about its use, there's no way I can do tests now. Still, a couple of hours ago, I saw this code work and its bug:

using System;
using System.Threading;

public class Example
{
    private int nValue;
    public int N { get { return nValue; } }

    // The Hash property is slower because it clones an array. When
    // KeepAlive is not used, the finalizer sometimes runs before 
    // the Hash property value is read.

    private byte[] hashValue;
    public byte[] Hash { get { return (byte[])hashValue.Clone(); } }
    public byte[] Hash2 { get { return (byte[])hashValue; } }

    public int returnNothing() { return 25; }

    public Example()
    {
        nValue = 2;
        hashValue = new byte[20];
        hashValue[0] = 2;
    }

    ~Example()
    {
        nValue = 0;

        if (hashValue != null)
        {
            Array.Clear(hashValue, 0, hashValue.Length);
        }
    }
}

public class Test
{
    private static int totalCount = 0;
    private static int finalizerFirstCount = 0;

    // This variable controls the thread that runs the demo.
    private static bool running = true;

    // In order to demonstrate the finalizer running first, the
    // DoWork method must create an Example object and invoke its
    // Hash property. If there are no other calls to members of
    // the Example object in DoWork, garbage collection reclaims
    // the Example object aggressively. Sometimes this means that
    // the finalizer runs before the call to the Hash property
    // completes. 

    private static void DoWork()
    {
        totalCount++;

        // Create an Example object and save the value of the 
        // Hash property. There are no more calls to members of 
        // the object in the DoWork method, so it is available
        // for aggressive garbage collection.

        Example ex = new Example();

        // Normal processing
        byte[] res = ex.Hash;

        // Supposed inlined processing
        //byte[] res2 = ex.Hash2;
        //byte[] res = (byte[])res2.Clone();

        // successful try to keep reference alive
        //ex.returnNothing();

        // Failed try to keep reference alive
        //ex = null;

        // If the finalizer runs before the call to the Hash 
        // property completes, the hashValue array might be
        // cleared before the property value is read. The 
        // following test detects that.

        if (res[0] != 2)
        {
            finalizerFirstCount++;
            Console.WriteLine("The finalizer ran first at {0} iterations.", totalCount);
        }

        //GC.KeepAlive(ex);
    }

    public static void Main(string[] args)
    {
        Console.WriteLine("Test:");

        // Create a thread to run the test.
        Thread t = new Thread(new ThreadStart(ThreadProc));
        t.Start();

        // The thread runs until Enter is pressed.
        Console.WriteLine("Press Enter to stop the program.");
        Console.ReadLine();

        running = false;

        // Wait for the thread to end.
        t.Join();

        Console.WriteLine("{0} iterations total; the finalizer ran first {1} times.", totalCount, finalizerFirstCount);
    }

    private static void ThreadProc()
    {
        while (running) DoWork();
    }
}

For those interested, I can send the zipped project through email.

paercebal
Is the zipped project the same as what you posted here?
Scott Dorman
More or less. I had to add 4 spaces before each line begining, and as thsoe files were Windows', on my Linux, the \r\n carriage return/line feed added unnecessary empty lines I removed by end.
paercebal
+1  A: 

this looks like a race condition between your work thread and the GC thread(s); to avoid it, i think there are two options:

(1) change your if statement to use ex.Hash[0] instead of res, so that ex cannot be GC'd prematurely, or

(2) lock ex for the duration of the call to Hash

that's a pretty spiffy example - was the teacher's point that there may be a bug in the JIT compiler that only manifests on multicore systems, or that this kind of coding can have subtle race conditions with garbage collection?

Steven A. Lowe
Apparently, the teacher believed it is a bug. The reasoning is: The compiler should know we are still using the object, and thus, delay its destruction until the next line. The very same code in C++ was correctly working.
paercebal
That line of reasoning works in unmanaged C++ because it is up to the developer to know when to release the memory. In a garbage collected world, the runtime thinks that ex is no longer being used and is eligible for collection.
Scott Dorman
I guess technically it's a bug, but one that is inherit in a garbage collected system when running multiple (and/or long running) threads.
Scott Dorman
The C++ code which worked was managed. Not unmanaged. It's the reason the teacher believed something was wrong. Somehow, the code generated by the Manager C++ compiler protects the object while it's being used, and the one generated by the C# compiler does not.
paercebal
... My own take is that it is the kind of things that can happen when you have on one hand garbage collection, and the other, destructors.
paercebal
[@paercebal]: examine the MSIL generated by C# and C++ to see what the difference is for that function; it may be that the C++ compiler optimized away the intermediate variable as unnecessary...
Steven A. Lowe
Ok...the my earlier comments are invalid. Look at the IL generated, which will probably show what Steven suspsects. Also, make sure that both the C++ and C# builds were both "Release" builds as "Debug" builds add some extra goodness for the GC.
Scott Dorman
+1  A: 

I think what you are seeing is reasonable behavior due to the fact that things are running on multiple threads. This is the reason for the GC.KeepAlive() method, which should be used in this case to tell the GC that the object is still being used and that it isn't a candidate for cleanup.

Looking at the DoWork function in your "full code" response, the problem is that immediately after this line of code:

byte[] res = ex.Hash;

the function no longer makes any references to the ex object, so it becomes eligible for garbage collection at that point. Adding the call to GC.KeepAlive would prevent this from happening.

Scott Dorman
Yes, quite true. The point is: Can my "ex" object be finalized the very same line I use one of its get property?
paercebal
If I understand what you're asking...yes. In .NET an object is still "available" even after it has been disposed so you could still call it, although the results are not guaranteed. I think this is part of what you are seeing here.
Scott Dorman
Err... No: My problem is that the very same **code line** I'm using my "ex" object (so there is still a live a reference there), it is "stolen" in a very "carpet under my feet way" by the garbage collector. I would have believed the GC would wait at least the next line to collect the object.
paercebal
So you're saying that the first time the byte[] res = ex.Hash; line is run you hit the issue?
Scott Dorman
+1  A: 

That's perfectly nornal for the finalizer to be called in your do work method as after the ex.Hash call, the CLR knows that the ex instance won't be needed anymore...

Now, if you want to keep the instance alive do this:

private static void DoWork()
{
    Example ex = new Example();

    byte[] res = ex.Hash; // [1]

    // If the finalizer runs before the call to the Hash 
    // property completes, the hashValue array might be
    // cleared before the property value is read. The 
    // following test detects that.

    if (res[0] != 2) // NOTE
    {
        // Oops... The finalizer of ex was launched before
        // the Hash method/property completed
    }
  GC.KeepAlive(ex); // keep our instance alive in case we need it.. uh.. we don't
}

GC.KeepAlive does... nothing :) it's an empty not inlinable /jittable method whose only purpose is to trick the GC into thinking the object will be used after this.

WARNING: Your example is perfectly valid if the DoWork method were a managed C++ method... You DO have to manually keep the managed instances alive manually if you don't want the destructor to be called from within another thread. IE. you pass a reference to a managed object who is going to delete a blob of unmanaged memory when finalized, and the method is using this same blob. If you don't hold the instance alive, you're going to have a race condition between the GC and your method's thread.

And this will end up in tears. And managed heap corruption...

Palad1
You should probably call GC.KeepAlive(ex) rather than GC.SuppressFinalize(ex) as KeepAlive is designed specifically for these instances where SupressFinalize is meant to be used in a different context, namely to prevent finalization of an object that has already been Disposed.
Scott Dorman
Scott Dorman is right about KeepAlive.
paercebal
Now, my problem, no matter the thread context: In the body of one function, my object is finalized the very same line I'm using its reference to call its get property. Is this behaviour normal?
paercebal
it is, thanks, I corrected.I was thinking about giving a pointer to the ROTOR gc implementation and ADD took over :D
Palad1
+1  A: 

What you're seeing is perfectly natural.

You don't keep a reference to the object that owns the byte array, so that object (not the byte array) is actually free for the garbage collector to collect.

The garbage collector really can be that aggressive.

So if you call a method on your object, which returns a reference to an internal data structure, and the finalizer for your object mess up that data structure, you need to keep a live reference to the object as well.

The garbage collector sees that the ex variable isn't used in that method any more, so it can, and as you notice, will garbage collect it under the right circumstances (ie. timing and need).

The correct way to do this is to call GC.KeepAlive on ex, so add this line of code to the bottom of your method, and all should be well:

GC.KeepAlive(ex);

I learned about this aggressive behavior by reading the book Applied .NET Framework Programming by Jeffrey Richter.

Lasse V. Karlsen
"I learned about this aggressive behavior"... : So, you're confirming this behaviour, visible only when launching this code on a multicore processor, is a ***normal*** behaviour, and that this is described in the book whose title you're mentioning on your post?
paercebal
... [continue] ... because the same code, using the same class, created with the garbage collector facitilies (gcnew instead of new) works quite fine on Managed C++.
paercebal
It has nothing to do with multicore, it has to do with multithreaded, on a server .NET runtime, you can possibly see the same behavior, if you run on a busy single-core system.
Lasse V. Karlsen
If you read the other answer on this question, by paercebal, quoting some text from cbrumme, you'll find a far better explanation than mine. But yes, this is indeed natural, and you need to ensure the object is alive for the entire duration of your call.
Lasse V. Karlsen
GC.KeepAlive is the wrong way to deal with this. The right way is to get rid of the finalizer.
Joe
+1  A: 

Yes, this is an issue that has come up before.

Its even more fun in that you need to run release for this to happen and you end up stratching your head going 'huh, how can that be null?'.

Jack Bolding
Very good answer. The links above are interesting, and the secondth mentions the author of the text at following URL : http://blogs.msdn.com/cbrumme/archive/2004/02/20/77460.aspx
paercebal
Yes, the race condition issue has come up before but I think that condition will happen in any of the .NET languages. Based on comments on other posts, it appears that this only happens in the C# sample, the C++/CLI version of the sample doesn't exhibit this problem.
Scott Dorman
I'm wondering... Perhaps it is **normal** for .NET, and it is Managed C++ that adds extra code... Tomorrow, I'll ask for a VB.NET and a Managed C++ version of the code to see if the race condition appears. If I get those sources, I'll post them bellow and give your all the results for comments.
paercebal
A: 

Interesting comment from Chris Brumme's blog

http://blogs.msdn.com/cbrumme/archive/2003/04/19/51365.aspx

class C {<br>
   IntPtr _handle;
   Static void OperateOnHandle(IntPtr h) { ... }
   void m() {
      OperateOnHandle(_handle);
      ...
   }
   ...
}

class Other {
   void work() {
      if (something) {
         C aC = new C();
         aC.m();
         ...  // most guess here
      } else {
         ...
      }
   }
}

So we can’t say how long ‘aC’ might live in the above code. The JIT might report the reference until Other.work() completes. It might inline Other.work() into some other method, and report aC even longer. Even if you add “aC = null;” after your usage of it, the JIT is free to consider this assignment to be dead code and eliminate it. Regardless of when the JIT stops reporting the reference, the GC might not get around to collecting it for some time.

It’s more interesting to worry about the earliest point that aC could be collected. If you are like most people, you’ll guess that the soonest aC becomes eligible for collection is at the closing brace of Other.work()’s “if” clause, where I’ve added the comment. In fact, braces don’t exist in the IL. They are a syntactic contract between you and your language compiler. Other.work() is free to stop reporting aC as soon as it has initiated the call to aC.m().

paercebal
+5  A: 

It's simply a bug in your code: finalizers should not be accessing managed objects.

The only reason to implement a finalizer is to release unmanaged resources. And in this case, you should carefully implement the standard IDisposable pattern.

With this pattern, you implement a protected method "protected Dispose(bool disposing)". When this method is called from the finalizer, it cleans up unmanaged resources, but does not attempt to clean up managed resources.

In your example, you don't have any unmanaged resources, so should not be implementing a finalizer.

Joe
You just beat me to it! I don't know why everyone is harping on threads and JIT and "agressive behavior", when they're all missing the real problem, which is just as you describe.
Jeffrey L Whitledge
Interesting answer. I'll think about it. Thanks.
paercebal
Despite the other answers were true (AFAIK), yours have the greatest impact on my perception of this Dispose pattern. Thanks. +1
paercebal