views:

806

answers:

2

Edit for intro:
We know that a ref parameter in C# passes a reference to a variable, allowing the external variable itself to be changed within a called method. But is the reference handled much like a C pointer (reading the current contents of the original variable with every access to that parameter and changing the original variable with every modification to the parameter), or can the called method rely on a consistent reference for the duration of the call? The former brings up some thread safety issues. In particular:

I've written a static method in C# which passes an object by reference:

public static void Register(ref Definition newDefinition) { ... }

The caller provides a completed but not-yet-registered Definition object, and after some consistency checking we "register" the definition they provided. However, if there is already a definition with the same key, it can't register the new one and instead their reference is updated to the "official" Definition for that key.

We want this to be rigorously thread-safe, but a pathological scenario comes to mind. Suppose the client (using our library) shares the reference in an non-thread-safe manner, such as using a static member rather than a local variable:

private static Definition riskyReference = null;

If one thread sets riskyReference = new Definition("key 1");, fills out the definition, and calls our Definition.Register(ref riskyReference); while another thread also decides to set riskyReference = new Definition("key 2");, are we guaranteed that in our Register method the newDefinition reference we're handling will not be modified on us by other threads (because the reference to the object was copied in and will be copied out when we return?), or can that other thread replace the object on us in the middle of our execution (if we're referencing a pointer to the original storage location???) and thus break our sanity-checking?

Note that this is different from changes to the underlying object itself, which are of course possible for a reference type (class), but can be easily guarded against by appropriate locking within that class. We can't, however, guard changes to an external client's variable space itself! We would have to make our own copy of the parameter at the top of the method and overwrite the parameter at the bottom (for example), but that would seem to make more sense for the compiler to do for us given the insanity of handling an unsafe reference.

So, I'd tend to think that the reference may be copied in and copied out by the compiler so that the method is handling a consistent reference to the original object (until it changes its own reference when it wants to) regardless of what might be happening to the original location on other threads. But we're having trouble finding a definitive answer on that point in documenation and discussion of ref parameters.

Can anyone assuage my concern with a definitive citation?

Edit for conclusion:
Having confirmed it with a multi-threaded code example (Thanks Marc!) and thinking about it further, it makes sense that it is indeed the not-automatically-threadsafe behavior which I was worred about. One point of "ref" is to pass large structs by reference rather than copy them. Another reason is that you might want to set up a long-term monitoring of a variable and need to pass a reference to it which will see changes to the variable (eg. changing between null and a live object), which an automatic copy-in/copy-out would not allow for.

So, to make our Register method robust against client insanity, we could implement it like:

public static void Register(ref Definition newDefinition) {
    Definition theDefinition = newDefinition; // Copy in.
    //... Sanity checks, actual work...
    //...possibly changing theDefinition to a new Definition instance...
    newDefinition = theDefinition; // Copy out.
}

They'd still have their own threading issues as far as what they end up getting, but at least their insanity wouldn't break our own sanity-checking process and possibly slip a bad state past our checks.

+5  A: 

When you use ref, you are passing the address of the caller's field/variable. Therefore yes: two threads can compete over the field/variable - but only if they both are talking to that field/variable. If they have a different field/variable to the same instance, then things are sane (assuming it is immutable).

For example; in the code below, Register does see the changes that Mutate makes to the variable (each object instance is effectively immutable).

using System;
using System.Threading;
class Foo {
    public string Bar { get; private set; }
    public Foo(string bar) { Bar = bar; }
}
static class Program {
    static Foo foo = new Foo("abc");
    static void Main() {
        new Thread(() => {
            Register(ref foo);
        }).Start();
        for (int i = 0; i < 20; i++) {
            Mutate(ref foo);
            Thread.Sleep(100);
        }
        Console.ReadLine();
    }
    static void Mutate(ref Foo obj) {
        obj = new Foo(obj.Bar + ".");
    }
    static void Register(ref Foo obj) {
        while (obj.Bar.Length < 10) {
            Console.WriteLine(obj.Bar);
            Thread.Sleep(100);
        }
    }
}
Marc Gravell
Okay, that example *does* show that ref parameters in C# are *not* thread-safe copies, so the client doing something stupid and pathological could change what object we're handling in the middle of our method. We'd need to do our own copy-in/copy-out if we want to robustly guard against it. Thanks!
Rob Parker
+5  A: 

No, it's not "copy in, copy out". Instead, the variable itself is effectively passed in. Not the value, but the variable itself. Changes made during the method are visible to anything else looking at the same variable.

You can see this without any threading being involved:

using System;

public class Test
{
    static string foo;

    static void Main(string[] args)
    {
        foo = "First";
        ShowFoo();
        ChangeValue(ref foo);
        ShowFoo();
    }

    static void ShowFoo()
    {
        Console.WriteLine(foo);
    }

    static void ChangeValue(ref string x)
    {
        x = "Second";
        ShowFoo();
    }
}

The output of this is First, Second, Second - the call to ShowFoo() within ChangeValue shows that the value of foo has already changed, which is exactly the situation you're concerned about.

The solution

Make Definition immutable if it wasn't before, and change your method signature to:

public static Definition Register(Definition newDefinition)

Then the caller can replace their variable if they want to, but your cache can't be polluted by a sly other thread. The caller would do something like:

myDefinition = Register(myDefinition);
Jon Skeet
We specifically went to the ref parameter to avoid the requirement that the caller replace the variable with the results of the call, because they might fail to do it and it would usually work okay (uses same object), but sometimes could bite them. But you're right that it would avoid this problem.
Rob Parker
They might use a local variable instead of the right shared variable, which would bite them as well. You can't guarantee that the caller will be sensible, but you *can* guarantee that your own code behaves sensibly.
Jon Skeet
Yeah, but we want our API to be as foolproof as we can make it, not tend to make them screw it up. ;-) As long as they use a local variable for the new Definition (like they should), the use of ref makes it easier for them to get it right. My concern is paranoia about pathological usage breaking us.
Rob Parker