How do you choose between implementing a value object (the canonical example being an address) as an immutable object or a struct?

Are there performance, semantic or any other benefits of choosing one over the other?

+6  A: 

I like to use a thought experiment:

Does this object make sense when only an empty constructor is called?

Edit at Richard E's request

A good use of struct is to wrap primitives and scope them to valid ranges.

For example, probability has a valid range of 0-1. Using a decimal to represent this everywhere is prone to error and requires validation at every point of usage.

Instead, you can wrap a primitive with validation and other useful operations. This passes the thought experiment because most primitives have a natural 0 state.

Here is an example usage of struct to represent probability:

public struct Probability : IEquatable<Probability>, IComparable<Probability>
{
    public static bool operator ==(Probability x, Probability y)
    {
        return x.Equals(y);
    }

    public static bool operator !=(Probability x, Probability y)
    {
        return !(x == y);
    }

    public static bool operator >(Probability x, Probability y)
    {
        return x.CompareTo(y) > 0;
    }

    public static bool operator <(Probability x, Probability y)
    {
        return x.CompareTo(y) < 0;
    }

    public static Probability operator +(Probability x, Probability y)
    {
        return new Probability(x._value + y._value);
    }

    public static Probability operator -(Probability x, Probability y)
    {
        return new Probability(x._value - y._value);
    }

    private decimal _value;

    public Probability(decimal value) : this()
    {
        if(value < 0 || value > 1)
        {
            throw new ArgumentOutOfRangeException("value");
        }

        _value = value;
    }

    public override bool Equals(object obj)
    {
        return obj is Probability && Equals((Probability) obj);
    }

    public override int GetHashCode()
    {
        return _value.GetHashCode();
    }

    public override string ToString()
    {
        return (_value * 100).ToString() + "%";
    }

    public bool Equals(Probability other)
    {
        return other._value.Equals(_value);
    }

    public int CompareTo(Probability other)
    {
        return _value.CompareTo(other._value);
    }

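    // Returns the underlying value as a double, matching the method name.
    public double ToDouble()
    {
        return (double) _value;
    }

    // Weights an outcome by this probability; decimal and double cannot be
    // mixed in arithmetic without an explicit conversion.
    public double WeightOutcome(double outcome)
    {
        return (double) _value * outcome;
    }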
}
Bryan Watts
I don't understand. You can create objects that don't have a default constructor.
Garry Shutler
Structs *always* have a default constructor, even if you don't define one. Therefore, a struct can always be instantiated to an "empty" instance (such as new Int32()). If the object doesn't make sense without a particular constructor, it should probably be an immutable class.
Bryan Watts
In C# classes also have a default constructor, even if none is declared.
Richard Ev
What I meant is that the default constructor can be hidden (made private for example) if required.
Garry Shutler
But if you declare a non-default constructor in a class, the default one goes away and you can *only* use the non-default. With structs, the default is always there and cannot be removed.
Bryan Watts
@Garry Shutler: you cannot declare a default constructor in a struct (and therefore can't set its visibility). There is *no way* to prevent someone from using a default constructor with a struct.
Bryan Watts
I was replying to Richard E with regard to the default constructors for classes but I didn't make that clear. I realise structs have to have a default constructor.
Garry Shutler
True, the default constructor for a class is removed if you declare a parameterised constructor. I'm not sure what that proves about the struct vs class discussion though.
Richard Ev
The point I made is that since structs always have default constructors, and that makes them different from classes, contemplating that difference and its connotations is a good way to think about the decision. If a "zeroed-out" instance makes sense, a struct is a possibility.
Bryan Watts
A: 

What is the cost of copying instances if they are passed by value?

If high, then immutable reference (class) type, otherwise value (struct) type.

Richard
I am not sure that the perceived performance benefits of a class vs struct are really relevant any more.
Richard Ev
Consider System.String. With a few bytes you are right, but when that starts crossing many cache lines the costs start adding up.
Richard
So a struct should never contain a property of type string?
Richard Ev
Mind you, a string is a reference type anyway...
Richard Ev
Rather consider /implementing/ a class like string that you want immutable, but directly contains significant data.
Richard
PS. recommendation from MS is value types shouldn't exceed 16 bytes.
Richard
+11  A: 

There are a few things to consider:

A struct is allocated on the stack (usually). It is a value type, so passing the data around across methods can be costly if it is too large.

A class is allocated on the heap. It is a reference type, so passing the object around through methods is not as costly.

Generally, I use structs for immutable objects that are not very large. I only use them when there is a limited amount of data being held in them or I want immutability. An example is the DateTime struct. I like to think that if my object is not as lightweight as something like a DateTime, it is probably not worth being used as a struct. Also, if my object makes no sense being passed around as a value type (also like DateTime), then it may not be useful to use as a struct. Immutability is key here though. Also, I want to stress that structs are not immutable by default. You have to make them immutable by design.
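
To illustrate "immutable by design", here is a minimal sketch. `MutablePoint` and `ImmutablePoint` are hypothetical types made up for this example, not anything from the framework. It shows that assigning a struct copies the whole value, and that an immutable struct "changes" only by producing a new value.

```csharp
using System;

// Hypothetical types, for illustration only.
public struct MutablePoint
{
    public int X;
    public int Y;
}

public struct ImmutablePoint
{
    private readonly int _x;
    private readonly int _y;

    public ImmutablePoint(int x, int y)
    {
        _x = x;
        _y = y;
    }

    public int X { get { return _x; } }
    public int Y { get { return _y; } }

    // "Changing" an immutable value means returning a new one.
    public ImmutablePoint WithX(int x)
    {
        return new ImmutablePoint(x, _y);
    }
}

public static class CopyDemo
{
    public static void Main()
    {
        MutablePoint a = new MutablePoint();
        MutablePoint b = a;      // copies the whole value
        b.X = 42;                // mutates only the copy
        Console.WriteLine(a.X);  // 0
        Console.WriteLine(b.X);  // 42

        ImmutablePoint p = new ImmutablePoint(1, 2);
        ImmutablePoint q = p.WithX(10);
        Console.WriteLine(p.X);  // 1
        Console.WriteLine(q.X);  // 10
    }
}
```

With the mutable version it is easy to modify a copy by accident and wonder why the original did not change, which is one reason to prefer the readonly-fields style.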

In 99% of situations I encounter, a class is the proper thing to use. I find myself not needing immutable classes very often. It's more natural for me to think of classes as mutable in most cases.

Dan Herbert
One of the benefits you missed is that structs enjoy pass-by-value semantics. (Objects are pass-by-reference, while object-references are pass-by-value, for all you finicky people out there.)
Justice
That was the first point I implied, when I said "[a struct] is a value type" and "[a class] is a reference type".
Dan Herbert
A: 

As a rule of thumb a struct size should not exceed 16 bytes, otherwise passing it between methods may become more expensive than passing object references, which are just 4 bytes (on a 32-bit machine) long.

Another concern is a default constructor. A struct always has a default (parameterless and public) constructor, otherwise statements like

T[] array = new T[10]; // array with 10 values

would not work.
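
A minimal sketch of what that implies, using a hypothetical `Percentage` struct (invented for this example): the array elements and `default(Percentage)` are all the zeroed-out value, and the validating constructor never runs for them.

```csharp
using System;

// Hypothetical struct, for illustration only.
public struct Percentage
{
    private readonly int _value;

    public Percentage(int value)
    {
        if (value < 0 || value > 100)
        {
            throw new ArgumentOutOfRangeException("value");
        }

        _value = value;
    }

    public int Value
    {
        get { return _value; }
    }
}

public static class DefaultValueDemo
{
    public static void Main()
    {
        // No user-defined constructor runs here: every element is all zeros.
        Percentage[] array = new Percentage[10];
        Console.WriteLine(array[0].Value);             // 0

        // default(T) and new T() give the same zeroed-out value
        // when the struct declares no parameterless constructor.
        Console.WriteLine(default(Percentage).Value);  // 0
        Console.WriteLine(new Percentage().Value);     // 0
    }
}
```

So whatever invariant the parameterized constructor enforces, the all-zero state has to be a valid instance too.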

Additionally it's courteous for structs to overload the == and the != operators and to implement the IEquatable<T> interface.

Michael Damatov
+2  A: 

In today's world (I'm thinking C# 3.0 on .NET 3.5) I do not see a need for structs (EDIT: Apart from in some niche scenarios).

The pro-struct arguments appear to be mostly based around perceived performance benefits. I would like to see some benchmarks (that replicate a real-world scenario) that illustrate this.

The notion of using a struct for "lightweight" data structures seems way too subjective for my liking. When does data cease to be lightweight? Also, when adding functionality to code that uses a struct, when would you decide to change that type to a class?

Personally, I cannot recall the last time I used a struct in C#.

Edit

I suggest that the use of a struct in C# for performance reasons is a clear case of Premature Optimization*

* unless the application has been performance profiled and the use of a class has been identified as a performance bottleneck

Edit 2

MSDN states:

The struct type is suitable for representing lightweight objects such as Point, Rectangle, and Color. Although it is possible to represent a point as a class, a struct is more efficient in some scenarios. For example, if you declare an array of 1000 Point objects, you will allocate additional memory for referencing each object. In this case, the struct is less expensive.

Unless you need reference type semantics, a class that is smaller than 16 bytes may be more efficiently handled by the system as a struct.

Richard Ev
Have you used an Int32 or DateTime lately? Those are pretty good reasons to have a struct :-) "Class vs struct" is the same concept as "entity vs value", expressed in language terms. The difference is around identity, *not* perceived performance benefits.
Bryan Watts
Never define your own struct, and you will live a longer and happier life. Always use classes (except for interop field layout).
Brian
Bryan - can you clarify which point above you are referring to in your comment?
Richard Ev
Bryan - my point was twofold: 1) Some of the respondents cited performance as a reason for using a struct. 2) Others suggested that it should be used for lightweight constructs. I feel that both of these viewpoints are very subjective and would benefit from clarification.
Richard Ev
@Richard E: I was referring to @Brian's comment. I decided to remove it since it was so easily misinterpreted. @Brian: while that makes life less complicated, it also removes a powerful tool from your toolbox.
Bryan Watts
I agree with both points, that performance should not be a primary factor. Structs have their place when used correctly; they simply require diligence. My favorite use is for numbers with valid ranges outside those of the language type. For example, probability is 0-1, a percentage is 0-100, etc.
Bryan Watts
premature optimization is evil only if you compromise anything else for it - e.g. readability, interface elegance, development time.
peterchen
Updated my answer.
Bryan Watts
A: 

In general, I would not recommend structs for business objects. While you MIGHT gain a small amount of performance by heading this direction, as you are running on the stack, you end up limiting yourself in some ways and the default constructor can be a problem in some instances.

I would state this is even more imperative when you have software that is released to the public.

Structs are fine for simple types, which is why you see Microsoft using structs for most of the data types. In like manner, structs are fine for objects that make sense on the stack. The Point struct, mentioned in one of the answers, is a fine example.

How do I decide? I generally default to object and if it seems to be something that would benefit from being a struct, which as a rule would be a rather simple object that only contains simple types implemented as structs, then I will think it through and determine if it makes sense.

You mention an address as your example. Let's examine one, as a class.

public class Address
{
    public string AddressLine1 { get; set; }
    public string AddressLine2 { get; set; }
    public string City { get; set; }
    public string State { get; set; }
    public string PostalCode { get; set; }
}

Think through this object as a struct. Consider the types this address "struct" would contain if you coded it that way. Do you see anything that might not work out the way you want? And is there actually a performance benefit to be had?
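
One way to run that exercise is the rough sketch below. `AddressStruct` is a hypothetical struct version of the class above, trimmed to two fields to keep it short. The zeroed-out default has every field null, and each assignment copies the whole value. The full five-field version would also hold five references (8 bytes each on a 64-bit runtime), which is already past the commonly quoted 16-byte guideline.

```csharp
using System;

// Hypothetical struct version of Address, for illustration only.
public struct AddressStruct
{
    public string City { get; set; }
    public string PostalCode { get; set; }
}

public static class AddressDemo
{
    public static void Main()
    {
        // Nothing stops a caller from creating an "empty" address.
        AddressStruct empty = default(AddressStruct);
        Console.WriteLine(empty.City == null);        // True
        Console.WriteLine(empty.PostalCode == null);  // True

        // Assignment copies the value, so the two variables diverge.
        AddressStruct home = new AddressStruct();
        home.City = "Springfield";
        AddressStruct copy = home;
        copy.City = "Shelbyville";
        Console.WriteLine(home.City);                 // Springfield
    }
}
```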

Gregory A Beamer
Applying the thought experiment from my answer to your example: does an address make sense if all its fields are null? I would say no in this case.
Bryan Watts
Realistically, with data objects, you do not create an object until at least one item is not null, but I would agree that an object with 100% nulls is invalid. The main point I was making is a struct is not the best method to code an address.
Gregory A Beamer
+1  A: 

Factors: construction, memory requirements, boxing.

Normally, the constructor restrictions for structs - no explicit parameterless constructors, no base construction - decide whether a struct should be used at all. E.g. if the parameterless constructor should not initialize members to default values, use an immutable object.

If you still have the choice between the two, decide on memory requirements. Small items should be stored in structs especially if you expect many instances.

That benefit is lost when the instances get boxed (e.g. captured for an anonymous function or stored in a non-generic container) - you even start to pay extra for the boxing.
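
A small sketch of the non-generic container case, assuming a hypothetical `Celsius` struct invented for this example: a generic `List<T>` stores the values inline, while `ArrayList` stores `object`, so every `Add` boxes a fresh copy on the heap.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical struct, for illustration only.
public struct Celsius
{
    private readonly double _degrees;

    public Celsius(double degrees)
    {
        _degrees = degrees;
    }

    public double Degrees
    {
        get { return _degrees; }
    }
}

public static class BoxingDemo
{
    public static void Main()
    {
        Celsius c = new Celsius(21.5);

        // Generic container: the struct is stored inline, no boxing.
        List<Celsius> typed = new List<Celsius>();
        typed.Add(c);

        // Non-generic container: each Add boxes a separate copy.
        ArrayList untyped = new ArrayList();
        untyped.Add(c);
        untyped.Add(c);

        // Same value, but two distinct heap objects.
        Console.WriteLine(object.ReferenceEquals(untyped[0], untyped[1])); // False

        // Reading the value back requires an unboxing cast.
        Celsius back = (Celsius) untyped[0];
        Console.WriteLine(back.Degrees == c.Degrees);                      // True
    }
}
```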


What is "small", what is "many"?

The overhead for an object is (IIRC) 8 bytes on a 32 bit system. Note that with a few hundred instances, this may already decide whether an inner loop runs fully in cache or triggers GCs. If you expect tens of thousands of instances, this may be the difference between run vs. crawl.

From that POV, using structs is NOT a premature optimization.


So, as rules of thumb:

If most instances would get boxed, use immutable objects.
Otherwise, for small objects, use an immutable object only if struct construction would lead to an awkward interface and you expect not more than thousands of instances.

peterchen
A: 

From an object modeling perspective, I appreciate structs because they let me use the compiler to declare certain parameters and fields as non-nullable. Of course, without special constructor semantics (like in Spec#), this is only applicable to types that have a natural 'zero' value. (Hence Bryan Watts's 'thought experiment' answer.)

Abraham Pinzur
+4  A: 

How do you choose between implementing a value object (the canonical example being an address) as an immutable object or a struct?

I think your options are wrong. Immutable object and struct are not opposites, nor are they the only options. Rather, you've got four options:

  • Class
    • mutable
    • immutable
  • Struct
    • mutable
    • immutable

I argue that in .NET, the default choice should be a mutable class to represent logic and an immutable class to represent an entity. I actually tend to choose immutable classes even for logic implementations, if at all feasible. Structs should be reserved for small types that emulate value semantics, e.g. a custom Date type, a Complex number type or similar entities. The emphasis here is on small since you don't want to copy large blobs of data, and indirection through references is actually cheap (so we don't gain much by using structs). I tend to make structs always immutable (I can't think of a single exception at the moment). Since this best fits the semantics of the intrinsic value types I find it a good rule to follow.
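
For the "immutable class" corner of that list, a minimal sketch might look like the following. `PostalAddress` is a hypothetical type invented for this example. Unlike a struct it has no zeroed-out default instance, so the constructor can reject nulls, and value equality is something you add explicitly.

```csharp
using System;

// Hypothetical immutable class, for illustration only.
public sealed class PostalAddress : IEquatable<PostalAddress>
{
    private readonly string _city;
    private readonly string _postalCode;

    public PostalAddress(string city, string postalCode)
    {
        if (city == null) throw new ArgumentNullException("city");
        if (postalCode == null) throw new ArgumentNullException("postalCode");

        _city = city;
        _postalCode = postalCode;
    }

    public string City { get { return _city; } }
    public string PostalCode { get { return _postalCode; } }

    // Value equality: two addresses are equal when all their fields are.
    public bool Equals(PostalAddress other)
    {
        return !object.ReferenceEquals(other, null)
            && _city == other._city
            && _postalCode == other._postalCode;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as PostalAddress);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return (_city.GetHashCode() * 397) ^ _postalCode.GetHashCode();
        }
    }
}

public static class ImmutableClassDemo
{
    public static void Main()
    {
        PostalAddress a = new PostalAddress("Oslo", "0150");
        PostalAddress b = new PostalAddress("Oslo", "0150");

        Console.WriteLine(a.Equals(b));                   // True
        Console.WriteLine(object.ReferenceEquals(a, b));  // False
    }
}
```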

Konrad Rudolph
What do you mean by "represent logic"? Could you give an example? Many entities (such as a person) are not defined by their properties, so why do you argue they should be immutable?
gkdm
@Avid: int (System.Int32), bool (System.Boolean), double (System.Double), etc are all immutable. Unlike in old versions of FORTRAN, you cannot change the value of `3` in C#. If you have `readonly int x = 3;`, `x` will always have the value 3, and you can't change that value. However, if you have `struct Point { int x; int y; }` and `readonly Point a = new Point();` you can change the value contained in a: `a.x = 42;`.
Martinho Fernandes
@Martinho, in that case the immutability is a result of the "readonly", not intrinsic to int. In fact, if it was 'int x = 3;' you can absolutely change the value of 'x'. '3' on the other hand, is a constant int, not a regular variable.
AviD
@Avid: Notice I used readonly on *both* cases. Yet they differ. I never reassigned the variable `a` in the second example. I simply *mutated* it. You acknowledged yourself that strings are immutable. So if I have `string s = "foo";` I can absolutely change the value of `s`, and that does not make strings mutable. 3 is immutable. Ints are immutable. Strings are immutable. Mutable *variables* are mutable. Immutable *variables* (aka readonly) are immutable. Somewhat related: http://devnonsense.blogspot.com/2009/11/immutable-data-is-thread-safe.html
Martinho Fernandes
@Martinho, I think we're talking at different levels of indirection: Wrt strings, if you change the value of s, the old string object is *thrown out* (and then GC'd), while a *completely new instance of System.String* is created for you with the new value. So, yes, you changed the "value" of s only insomuch as you're talking about the reference - i.e. you changed the address that s is now pointing to, but the original string instance *was not changed*. This is not the case with a System.Int32 instance, where `int i = 1; i = 3;` keeps the original instance, changing only *its value*.
AviD
@Avid: Why is the case with int different? You can see `int i = 1; i = 3;` as throwing the 1 out and replacing it with an instance of the number 3. Think about `int i = 1; i = new int();`. Is it changing the existing value, or replacing it with a new instance of System.Int32? The fact that you can assign values to variables doesn't make a type immutable or mutable. It's the (in)existence of mutable fields and/or mutator methods, which neither string nor int have. Also, MSDN states clearly that Int32 is an immutable type.
Martinho Fernandes
@AviD You are confusing immutability of the value with immutability of the variable containing the value. Yes, you can mutate variables of type integer. But you can't mutate the underlying integers. You can't increase the value of 3. Conversely, you can both mutate variables of type list and mutate the underlying list.
Strilanc
@Strilanc, it is you and @Martinho that are confusing it. Or it might just be we have different meanings for "immutability". For me, if an object is immutable, you cannot change its value in-place. If you try to change the value, copy-on-write semantics are used, to replace the object with a new object, and change the original variable to refer to the new object. This doesn't have to be through mutable fields or mutator methods, it can be just a change to the *value* of the object itself.
AviD
@AviD: This discussion is moot. You are confusing value with symbol. These have a very well-defined meaning in computer science, shame on Microsoft for being sloppy with terminology. But the fact is: a variable is a *symbol*, and unless you declare that `const` or `readonly` in .NET, a variable is always mutable. A value, on the other hand, is something completely different. And values in .NET are immutable for all basic types. You cannot change a value, end of story. All you can do is assign a new value to an existing symbol. (continued …)
Konrad Rudolph
@Avid: (cont’d) You admitted yourself that strings are immutable. Well, how does `s += "foo"` differ conceptually from `i += 1`? Not at all, that’s how. In both cases you read the value of the symbol/variable, throw it out and replace it with a new symbol. The symbol is mutable in both cases, the value is not.
Konrad Rudolph
@Konrad, it's not the value OR the symbol that is mutable or not, it's the *object*. `s += "foo"` is *very* different from `i += 1` when you realize that it's not s or i you want to change, but the *object* it's referring to. For s, you can't, so you have to assign the variable to refer to a new object. For i, you assign the same object a new value, and therefore you don't need to change i's reference. (continued...)
AviD
(contd) Where I think we are miscommunicating is close to what you describe as the confusion, but allow me to clarify: When we say "variable", are we talking about the symbol, or the object to which it refers? Well, truthfully, it usually depends on the context, which could lead to some confusion. For my part, in this discussion, I was referring to the symbol. I don't think MS were sloppy here... However, there are some situations, such as the int above (and I was purposefully obtuse in my previous comment), where they are one and the same. But my point still stands.
AviD
@AviD: The difference that you want to see between `s += "foo"` and `i += 1` doesn’t exist: in both cases it’s *not* the object that you are mutating, it’s the variable (= the symbol). For integers, there is no way of seeing the difference since they are value types and thus no two symbols can refer to the *same* object anyway (thus we cannot observe a different behaviour). But conceptually, Microsoft makes it very clear that the basic value types (comprising `int`) are immutable. Note that this is different from, say, C++ where you *can* mutate basic types (and you *can* see that difference).
Konrad Rudolph
@Konrad - right, that's what I meant that with int they are one and the same, and thus int (and other basic valuetypes) must be immutable. With strings, it should be different, since they are references - but that's why they are considered immutable, even though the mechanics underneath are very different. Truthfully, I don't even remember what we disagreed about, since we seem to be in agreement (if from a different PoV), though it probably started from some misunderstanding of mine... Thanks.
AviD
A: 

Structs are strictly for advanced users (along with out and ref).

Yes, structs can give great performance when used with ref, but you have to watch what memory they are using, who controls that memory, and so on.

If you're not using ref and out with structs, they are not worth it; if you are, expect some nasty bugs :-)

ben