ansaurus

Question

Answer 1

+5 A:

Well, I don't think it's stupid to strive for greater type safety, but I do think you're on the wrong track. What about simply creating getters and setters instead of using public fields?

For example:

public object ValueA { get; private set; }

public void SetValueA( int value ) { ValueA = value; }
public void SetValueA( string value ) { ValueA = value; }
public void SetValueA( Foo value ) { ValueA = value; }

Peter Ruderman 2010-06-30 17:24:43

I like this idea. Usually, having a hard time doing something is indicative of the fact that you are not doing it in the way that's natural for the language. I think this is a more natural approach for C#.

jdmichal 2010-06-30 17:38:52

@Peter. I shied away from this solution since it would require 3 setters and 3 getters for ValueA and another 3 + 3 for ValueB (plus I would still need a method to determine which setter had been called in the case where 2 or more of the possible types were reference types).

Chris F 2010-06-30 20:14:41

@jdmichal. I agree that if something seems unduly difficult there is often an easier solution that has been overlooked (like the one proposed by Jaroslav), but I personally think this solution is a little too inelegant.

Chris F 2010-06-30 20:17:19

Answer 2

+1 A:

You could throw exceptions once there's an attempt to access variables that haven't been initialized, i.e. if it's created with an A parameter and later on there's an attempt to access B or C, it could throw, say, UnsupportedOperationException. You'd need a getter to make it work though.
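A minimal sketch of that idea, assuming a hypothetical two-case holder named IntOrString (the name and members are not from the question). .NET has no UnsupportedOperationException, so the sketch uses the built-in InvalidOperationException instead:

```csharp
using System;

// Hypothetical holder: stores either an int or a string and remembers which.
public sealed class IntOrString
{
    private readonly int number;
    private readonly string text;
    private readonly bool holdsNumber;

    public IntOrString(int value) { number = value; holdsNumber = true; }
    public IntOrString(string value) { text = value; holdsNumber = false; }

    // Each getter throws at runtime if the other case was stored.
    public int Number
    {
        get
        {
            if (!holdsNumber) throw new InvalidOperationException("Holds a string, not an int.");
            return number;
        }
    }

    public string Text
    {
        get
        {
            if (holdsNumber) throw new InvalidOperationException("Holds an int, not a string.");
            return text;
        }
    }
}
```

The misuse is only detected when the wrong getter actually runs, so this reports the problem at runtime rather than at compile time.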

mr popo 2010-06-30 17:38:07

Yes - the first version that I wrote did raise an exception in the As method - but whilst this certainly highlights the problem in the code, I would much prefer to be told about it at compile time rather than at runtime.

Chris F 2010-06-30 20:19:54

Answer 3

+2 A:

char foo = 'B';

bool bar = foo is int;

This results in a warning, not an error. If you're looking for your Is and As functions to be analogs for the C# operators, then you shouldn't be restricting them in that way anyhow.

Adam Robinson 2010-06-30 17:39:30

Answer 4

+3 A:

I am not sure I fully understand your goal. In C, a union is a structure that uses the same memory locations for more than one field. For example:

typedef union
{
    float real;
    int scalar;
} floatOrScalar;

The floatOrScalar union could be used as a float, or an int, but they both consume the same memory space. Changing one changes the other. You can achieve the same thing with a struct in C#:

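using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
struct FloatOrScalar
{
    // Both fields start at offset 0, so they share the same 4 bytes.
    [FieldOffset(0)]
    public float Real;
    [FieldOffset(0)]
    public int Scalar;
}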

The above structure uses 32 bits in total, rather than 64 bits. This is only possible with a struct. Your example above is a class and, given the nature of the CLR, makes no guarantee about memory efficiency. If you change a Union<A, B, C> from one type to another, you are not necessarily reusing memory... most likely, you are allocating a new object on the heap and dropping a different pointer in the backing object field. Contrary to a real union, your approach may actually cause more heap thrashing than you would otherwise get if you did not use your Union type.

jrista 2010-06-30 17:45:02

As I mentioned in my question, my motivation was not better memory efficiency. I have changed the question title to better reflect what my goal is - the original title of "C(ish) union" is, in hindsight, misleading.

Chris F 2010-06-30 20:33:14

A discriminated union makes a whole lot more sense for what you are trying to do. As for making it compile-time checked...I would look into .NET 4 and Code Contracts. With Code Contracts, it may be possible to enforce a compile-time Contract.Requires that enforces your requirements on the .Is<T> operator.

jrista 2010-06-30 20:52:54

I guess I still have to question the use of a Union, in general practice. Even in C/C++, unions are a risky thing, and must be used with extreme care. I am curious why you need to bring such a construct into C#...what value do you perceive getting out of it?

jrista 2010-06-30 20:55:31

Answer 5

+2 A:

If you allow multiple types, you cannot achieve type safety (unless the types are related).

You can't and won't achieve any kind of type safety; you could only achieve byte-value-safety using FieldOffset.

It would make much more sense to have a generic ValueWrapper<T1, T2> with T1 ValueA and T2 ValueB, ...

P.S.: when talking about type-safety I mean compile-time type-safety.
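A rough sketch of what such a generic wrapper might look like. The ValueWrapper name and the ValueA/ValueB members come from the suggestion above; the constructor and everything else are assumptions. Note that this holds both values at once, so it is a typed pair rather than a union:

```csharp
// Hypothetical generic wrapper: both values are always present and statically typed.
public sealed class ValueWrapper<T1, T2>
{
    public T1 ValueA { get; private set; }
    public T2 ValueB { get; private set; }

    public ValueWrapper(T1 valueA, T2 valueB)
    {
        ValueA = valueA;
        ValueB = valueB;
    }
}
```

With `var w = new ValueWrapper<int, string>(42, "abc");`, reading `int a = w.ValueA;` compiles, while `string s = w.ValueA;` is rejected by the compiler.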

If you need a code wrapper (performing business logic on modifications), you can use something along the lines of:

public class Wrapper
{
    public ValueHolder<int> v1 = 5;
    public ValueHolder<byte> v2 = 8;
}

public struct ValueHolder<T>
    where T : struct
{
    private T value;

    public ValueHolder(T value) { this.value = value; }

    public static implicit operator T(ValueHolder<T> valueHolder) { return valueHolder.value; }
    public static implicit operator ValueHolder<T>(T value) { return new ValueHolder<T>(value); }
}

For an easy way out you could use (it has performance issues, but it is very simple):

public class Wrapper
{
    private object v1;
    private object v2;

    public T GetValue1<T>() { if (v1.GetType() != typeof(T)) throw new InvalidCastException(); return (T)v1; }
    public void SetValue1<T>(T value) { v1 = value; }

    public T GetValue2<T>() { if (v2.GetType() != typeof(T)) throw new InvalidCastException(); return (T)v2; }
    public void SetValue2<T>(T value) { v2 = value; }
}

//usage:
Wrapper wrapper = new Wrapper();
wrapper.SetValue1("aaaa");
wrapper.SetValue2(456);

string s = wrapper.GetValue1<string>();
DateTime dt = wrapper.GetValue1<DateTime>();//InvalidCastException

Jaroslav Jandek 2010-06-30 17:52:36

Your suggestion of making ValueWrapper generic seems like the obvious answer, but it causes me problems in what I am doing. Essentially, my code is creating these wrapper objects by parsing some text line. So I have a method like ValueWrapper MakeValueWrapper(string text). If I make the wrapper generic then I need to change the signature of MakeValueWrapper to be generic, and this in turn means that the calling code needs to know what types are expected, and I just don't know this in advance before I parse the text...

Chris F 2010-06-30 21:22:38

...but even as I was writing the last comment, it felt like I had perhaps missed something (or messed something up), because what I am trying to do does not feel like it should be as difficult as I am making it. I think I will go back and spend a few minutes working on a generified wrapper and see if I can adapt the parsing code around it.

Chris F 2010-06-30 21:25:19

The code I have provided is supposed to be just for business logic. The problem with your approach is that you never know what value is stored in the Union at compile time. It means you will have to use if or switch statements whenever you access the Union object, since those objects do not share common functionality! How are you going to use the wrapper objects further in your code? Also, you can construct generic objects at runtime (slow, but possible). Another easy option is in my edited post.

Jaroslav Jandek 2010-06-30 23:43:45

You have basically no meaningful compile-time type checks in your code right now - you could also try dynamic objects (dynamic type checking at runtime).
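To illustrate the dynamic option, here is a small sketch assuming .NET 4 or later. DynamicDemo, Describe and DescribeAny are hypothetical names used only for this example:

```csharp
public static class DynamicDemo
{
    // Overloads for the types the value is expected to hold.
    private static string Describe(int n) { return "int " + n; }
    private static string Describe(string s) { return "string " + s; }

    public static string DescribeAny(object value)
    {
        dynamic d = value;   // defers overload resolution to runtime
        return Describe(d);  // throws RuntimeBinderException if no overload fits
    }
}
```

The overload is picked from the runtime type of the value, so a mismatch shows up as an exception when the call executes, not as a compiler error.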

Jaroslav Jandek 2010-07-01 00:07:03

Answer 6

A:

You can export a pseudo-pattern matching function, like I use for the Either type in my Sasa library. There's currently runtime overhead, but I eventually plan to add a CIL analysis to inline all the delegates into a true case statement.
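As a rough idea of what a pseudo-pattern-matching function can look like, here is a sketch of a two-case type. This is not the Sasa API; Either, Left, Right and Match are names chosen for this example only:

```csharp
using System;

// Hypothetical two-case union; not taken from the Sasa library.
public abstract class Either<L, R>
{
    // Exactly one of the two delegates is invoked, depending on the stored case.
    public abstract T Match<T>(Func<L, T> onLeft, Func<R, T> onRight);

    public static Either<L, R> Left(L value) { return new LeftCase(value); }
    public static Either<L, R> Right(R value) { return new RightCase(value); }

    private sealed class LeftCase : Either<L, R>
    {
        private readonly L value;
        public LeftCase(L value) { this.value = value; }
        public override T Match<T>(Func<L, T> onLeft, Func<R, T> onRight) { return onLeft(value); }
    }

    private sealed class RightCase : Either<L, R>
    {
        private readonly R value;
        public RightCase(R value) { this.value = value; }
        public override T Match<T>(Func<L, T> onLeft, Func<R, T> onRight) { return onRight(value); }
    }
}
```

A caller has to supply a handler for every case, for example `e.Match(n => n + 1, s => s.Length)`, which is where the delegate overhead mentioned above comes from.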

naasking 2010-06-30 18:42:16

Answer 7

A:

It's not possible to do with exactly the syntax you've used but with a bit more verbosity and copy/paste it's easy to make overload resolution do the job for you:


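// this code is ok
// (the type arguments shown here are illustrative)
var u = new Union<int, string, bool>("");
if (u.Value(Is.OfType<string>()))
{
    u.Value(Get.ForType<string>());
}

// and this one will not compile: double is not one of the union's types
if (u.Value(Is.OfType<double>()))
{
    u.Value(Get.ForType<double>());
}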

By now it should be pretty obvious how to implement it:


    public class Union<A, B, C>
    {
        private readonly Type type;
        public readonly A a;
        public readonly B b;
        public readonly C c;

        public Union(A a)
        {
            type = typeof(A);
            this.a = a;
        }

        public Union(B b)
        {
            type = typeof(B);
            this.b = b;
        }

        public Union(C c)
        {
            type = typeof(C);
            this.c = c;
        }

        public bool Value(TypeTestSelector<A> _)
        {
            return typeof(A) == type;
        }

        public bool Value(TypeTestSelector<B> _)
        {
            return typeof(B) == type;
        }

        public bool Value(TypeTestSelector<C> _)
        {
            return typeof(C) == type;
        }

        public A Value(GetValueTypeSelector<A> _)
        {
            return a;
        }

        public B Value(GetValueTypeSelector<B> _)
        {
            return b;
        }

        public C Value(GetValueTypeSelector<C> _)
        {
            return c;
        }
    }

    public static class Is
    {
        public static TypeTestSelector<T> OfType<T>()
        {
            return null;
        }
    }

    public class TypeTestSelector<T>
    {
    }

    public static class Get
    {
        public static GetValueTypeSelector<T> ForType<T>()
        {
            return null;
        }
    }

    public class GetValueTypeSelector<T>
    {
    }

There are no checks for extracting the value of the wrong type, e.g.:


var u = new Union<int, string, bool>(10);
string s = u.Value(Get.ForType<string>());

So you might consider adding necessary checks and throw exceptions in such cases.
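One way such a check might look, sketched on a cut-down two-type variant so that the example stands alone. CheckedUnion, Selector, Pick and Require are hypothetical names, and InvalidOperationException is just one reasonable choice of exception:

```csharp
using System;

// Hypothetical selector type, playing the role of GetValueTypeSelector above.
public sealed class Selector<T> { }

public static class Pick
{
    // Only the static type of the result matters for overload resolution.
    public static Selector<T> ForType<T>() { return null; }
}

public sealed class CheckedUnion<A, B>
{
    private readonly Type type;
    private readonly A a;
    private readonly B b;

    public CheckedUnion(A a) { type = typeof(A); this.a = a; }
    public CheckedUnion(B b) { type = typeof(B); this.b = b; }

    // Each getter verifies the stored type before returning the value.
    public A Value(Selector<A> _) { Require(typeof(A)); return a; }
    public B Value(Selector<B> _) { Require(typeof(B)); return b; }

    private void Require(Type requested)
    {
        if (requested != type)
            throw new InvalidOperationException(
                "Union holds " + type.Name + ", not " + requested.Name + ".");
    }
}
```

With this in place, asking for a string from a union built with 10 throws instead of silently returning null.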

Konstantin Oznobihin 2010-07-01 07:59:31

Answer 8

A:

Here is my attempt. It does compile time checking of types, using generic type constraints.

class Union {
    public interface AllowedType<T> { };

    internal object val;

    internal System.Type type;
}

static class UnionEx {
    public static T As<U,T>(this U x) where U : Union, Union.AllowedType<T> {
        return x.type == typeof(T) ?(T)x.val : default(T);
    }

    public static void Set<U,T>(this U x, T newval) where U : Union, Union.AllowedType<T> {
        x.val = newval;
        x.type = typeof(T);
    }

    public static bool Is<U,T>(this U x) where U : Union, Union.AllowedType<T> {
        return x.type == typeof(T);
    }
}

class MyType : Union, Union.AllowedType<int>, Union.AllowedType<string> {}

class TestIt
{
    static void Main()
    {
        MyType bla = new MyType();
        bla.Set(234);
        System.Console.WriteLine(bla.As<MyType,int>());
        System.Console.WriteLine(bla.Is<MyType,string>());
        System.Console.WriteLine(bla.Is<MyType,int>());

        bla.Set("test");
        System.Console.WriteLine(bla.As<MyType,string>());
        System.Console.WriteLine(bla.Is<MyType,string>());
        System.Console.WriteLine(bla.Is<MyType,int>());

        // compile time errors!
        // bla.Set('a'); 
        // bla.Is<MyType,char>()
    }
}

It could use some prettying-up. Especially, I couldn't figure out how to get rid of the type parameters to As/Is/Set (isn't there a way to specify one type parameter and let C# figure the other one?)

Amnon 2010-07-01 09:25:10

Answer 9

+4 A:

I don't really like the type-checking and type-casting solutions provided above, so here's a 100% type-safe union which will produce compilation errors if you attempt to use the wrong datatype:

using System;

namespace Juliet
{
    class Program
    {
        static void Main(string[] args)
        {
            Union3<int, char, string>[] unions = new Union3<int,char,string>[]
                {
                    new Union3<int, char, string>.Case1(5),
                    new Union3<int, char, string>.Case2('x'),
                    new Union3<int, char, string>.Case3("Juliet")
                };

            foreach (Union3<int, char, string> union in unions)
            {
                string value = union.Match(
                    num => num.ToString(),
                    character => new string(new char[] { character }),
                    word => word);
                Console.WriteLine("Matched union with value '{0}'", value);
            }

            Console.ReadLine();
        }
    }

    public abstract class Union3<A, B, C>
    {
        public abstract T Match<T>(Func<A, T> f, Func<B, T> g, Func<C, T> h);

        public sealed class Case1 : Union3<A, B, C>
        {
            public readonly A Item;
            public Case1(A item) : base() { this.Item = item; }
            public override T Match<T>(Func<A, T> f, Func<B, T> g, Func<C, T> h)
            {
                return f(Item);
            }
        }

        public sealed class Case2 : Union3<A, B, C>
        {
            public readonly B Item;
            public Case2(B item) { this.Item = item; }
            public override T Match<T>(Func<A, T> f, Func<B, T> g, Func<C, T> h)
            {
                return g(Item);
            }
        }

        public sealed class Case3 : Union3<A, B, C>
        {
            public readonly C Item;
            public Case3(C item) { this.Item = item; }
            public override T Match<T>(Func<A, T> f, Func<B, T> g, Func<C, T> h)
            {
                return h(Item);
            }
        }
    }
}

Juliet 2010-07-07 22:39:13

Yup, if you want typesafe discriminated unions, you'll need `match`, and that's as good a way to get it as any.

Pavel Minaev 2010-07-07 23:02:48

And if all that boilerplate code gets you down, you can try this implementation which explicitly tags cases instead: http://pastebin.com/EEdvVh2R . Incidentally this style is very similar to the way F# and OCaml represent unions internally.

Juliet 2010-07-07 23:22:01

Interesting. I much prefer the version in pastebin - having to explicitly use Case1, 2, 3 above seemed redundant since it can be inferred from the type of the constructor argument.

Chris F 2010-07-08 07:05:00

Even though the Match function provides a way of replacing the "if(Is<A>) then do something" code in a typesafe manner I would still want accessors to get at the underlying item (similar to the As methods in my example), but I suppose I could just settle for 3 getters; AsA, AsB and AsC (using Match to access the items just seems too long winded). I would still prefer the As method to be generic as in my example (I just like the syntax) but I cannot have my cake and eat it I suppose.
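For what it's worth, the three getters can be layered on top of Match. The following is only a sketch on a compact stand-in type: TaggedUnion3 and its members are hypothetical, it is not the pastebin code, and the static Case1/Case2/Case3 factories replace the nested classes used in the answer:

```csharp
using System;

// Compact stand-in union: stores one case plus a tag and dispatches in Match.
public sealed class TaggedUnion3<A, B, C>
{
    private readonly int tag;   // 1, 2 or 3: which case is stored
    private readonly A a;
    private readonly B b;
    private readonly C c;

    private TaggedUnion3(int tag, A a, B b, C c)
    {
        this.tag = tag; this.a = a; this.b = b; this.c = c;
    }

    public static TaggedUnion3<A, B, C> Case1(A value)
    {
        return new TaggedUnion3<A, B, C>(1, value, default(B), default(C));
    }

    public static TaggedUnion3<A, B, C> Case2(B value)
    {
        return new TaggedUnion3<A, B, C>(2, default(A), value, default(C));
    }

    public static TaggedUnion3<A, B, C> Case3(C value)
    {
        return new TaggedUnion3<A, B, C>(3, default(A), default(B), value);
    }

    public T Match<T>(Func<A, T> f, Func<B, T> g, Func<C, T> h)
    {
        switch (tag)
        {
            case 1: return f(a);
            case 2: return g(b);
            default: return h(c);
        }
    }

    // Accessors built on Match: return the item, or throw if another case is stored.
    public A AsA() { return Match(x => x, _ => Fail<A>(), _ => Fail<A>()); }
    public B AsB() { return Match(_ => Fail<B>(), x => x, _ => Fail<B>()); }
    public C AsC() { return Match(_ => Fail<C>(), _ => Fail<C>(), x => x); }

    private static T Fail<T>()
    {
        throw new InvalidOperationException("Union holds a different case.");
    }
}
```

The getters trade the compile-time guarantee of Match for a runtime exception, so they are a convenience rather than a replacement for it.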

Chris F 2010-07-08 07:06:05

I like Juliet's shorter code, but what if the types are <int, int, string>? How would you call the second constructor?

Robert Jeppesen 2010-09-27 13:08:17

@Robert Jeppesen: then you use the first code ;)

Juliet 2010-09-27 14:31:14


Discriminated union in C#
