How do you choose between implementing a value object (the canonical example being an address) as an immutable object or a struct?
Are there performance, semantic or any other benefits of choosing one over the other?
I like to use a thought experiment:
Does this object make sense when only an empty constructor is called?
Edit at Richard E's request
A good use of struct is to wrap primitives and scope them to valid ranges.
For example, probability has a valid range of 0-1. Using a decimal to represent this everywhere is prone to error and requires validation at every point of usage.
Instead, you can wrap a primitive with validation and other useful operations. This passes the thought experiment because most primitives have a natural 0 state.
Here is an example usage of struct to represent probability:
public struct Probability : IEquatable<Probability>, IComparable<Probability>
{
    // Equality and ordering operators delegate to Equals and CompareTo.
    public static bool operator ==(Probability x, Probability y)
    {
        return x.Equals(y);
    }

    public static bool operator !=(Probability x, Probability y)
    {
        return !(x == y);
    }

    public static bool operator >(Probability x, Probability y)
    {
        return x.CompareTo(y) > 0;
    }

    public static bool operator <(Probability x, Probability y)
    {
        return x.CompareTo(y) < 0;
    }

    // Arithmetic goes through the validating constructor, so a result
    // outside 0-1 throws instead of producing an invalid Probability.
    public static Probability operator +(Probability x, Probability y)
    {
        return new Probability(x._value + y._value);
    }

    public static Probability operator -(Probability x, Probability y)
    {
        return new Probability(x._value - y._value);
    }

    // readonly keeps the struct immutable after construction.
    private readonly decimal _value;

    public Probability(decimal value) : this()
    {
        if (value < 0 || value > 1)
        {
            throw new ArgumentOutOfRangeException("value");
        }
        _value = value;
    }

    public override bool Equals(object obj)
    {
        return obj is Probability && Equals((Probability) obj);
    }

    public override int GetHashCode()
    {
        return _value.GetHashCode();
    }

    public override string ToString()
    {
        return (_value * 100).ToString() + "%";
    }

    public bool Equals(Probability other)
    {
        return other._value.Equals(_value);
    }

    public int CompareTo(Probability other)
    {
        return _value.CompareTo(other._value);
    }

    // Renamed from ToDouble: the method returns the underlying decimal.
    public decimal ToDecimal()
    {
        return _value;
    }

    // Takes a decimal: C# has no implicit decimal * double multiplication.
    public decimal WeightOutcome(decimal outcome)
    {
        return _value * outcome;
    }
}
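To make the thought experiment concrete, here is a minimal sketch using a hypothetical, trimmed-down wrapper called UnitInterval (the name and the Value property are my own, not part of the Probability struct above). It assumes C# 7.2 or later for readonly struct. The point is that default(T) never runs your constructor, so the zeroed state has to be a valid value on its own.

```csharp
using System;

// Hypothetical wrapper, a trimmed-down cousin of the Probability struct above.
public readonly struct UnitInterval
{
    private readonly decimal _value;

    public UnitInterval(decimal value)
    {
        if (value < 0m || value > 1m)
        {
            throw new ArgumentOutOfRangeException(nameof(value));
        }
        _value = value;
    }

    public decimal Value => _value;
}

public static class ThoughtExperimentDemo
{
    public static void Main()
    {
        // No constructor runs here; the field is simply zeroed.
        UnitInterval zero = default(UnitInterval);
        Console.WriteLine(zero.Value == 0m); // True: zero is inside the valid range

        // Explicit construction still goes through validation.
        try
        {
            var bad = new UnitInterval(1.5m);
            Console.WriteLine(bad.Value);
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("rejected");
        }
    }
}
```

Because 0 is a legitimate value in the range, the "empty constructor" state is harmless here, which is what makes this kind of type a reasonable struct candidate.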
What is the cost of copying instances when they are passed by value?
If it is high, use an immutable reference type (class); otherwise, use a value type (struct).
There are a few things to consider:
A struct is usually allocated on the stack, or inline inside whatever contains it (an object or an array). It is a value type, so passing the data around across methods can be costly if it is too large.
A class is allocated on the heap. It is a reference type, so passing the object around through methods is not as costly.
Generally, I use structs for immutable objects that are not very large. I only use them when there is a limited amount of data being held in them or I want immutability. An example is the DateTime struct. I like to think that if my object is not as lightweight as something like a DateTime, it is probably not worth being used as a struct. Also, if my object makes no sense being passed around as a value type (also like DateTime), then it may not be useful to use as a struct. Immutability is key here though. Also, I want to stress that structs are not immutable by default. You have to make them immutable by design.
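As a rough illustration of "immutable by design", the sketch below contrasts two hypothetical types, MutablePoint and ImmutablePoint (both names are mine). The readonly struct form assumes C# 7.2 or later; on older compilers you would get the same effect with readonly fields and no setters.

```csharp
using System;

// Hypothetical mutable struct: legal, but easy to misuse.
public struct MutablePoint
{
    public int X;
    public int Y;
}

// Hypothetical immutable struct: the compiler rejects any field mutation.
public readonly struct ImmutablePoint
{
    public ImmutablePoint(int x, int y) { X = x; Y = y; }

    public int X { get; }
    public int Y { get; }

    // "Changing" a value means producing a new one.
    public ImmutablePoint WithX(int x) => new ImmutablePoint(x, Y);
}

public static class MutabilityDemo
{
    public static void Main()
    {
        var a = new MutablePoint { X = 1, Y = 2 };
        var b = a;  // copies the whole value
        b.X = 10;   // changes only the copy
        Console.WriteLine(a.X); // 1
        Console.WriteLine(b.X); // 10

        var p = new ImmutablePoint(1, 2);
        var q = p.WithX(10);
        Console.WriteLine(p.X); // 1
        Console.WriteLine(q.X); // 10
    }
}
```

The mutable version compiles fine, which is exactly the problem: an edit to a copy is silently lost, and nothing tells you that you were working on a copy.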
In 99% of situations I encounter, a class is the proper thing to use. I find myself not needing immutable classes very often. It's more natural for me to think of classes as mutable in most cases.
As a rule of thumb, a struct's size should not exceed 16 bytes; otherwise passing it between methods may become more expensive than passing object references, which are just 4 bytes long on a 32-bit machine (8 bytes on a 64-bit one).
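If you want to check a type against the 16-byte guideline rather than guess, something like the following sketch may help. It assumes a runtime where System.Runtime.CompilerServices.Unsafe.SizeOf<T>() is available (it ships with modern .NET; on older frameworks it comes from a separate package). SmallValue and LargeValue are hypothetical types made up for the comparison.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical structs used only to compare sizes.
public struct SmallValue
{
    public int A;
    public int B;
}

public struct LargeValue
{
    public double A;
    public double B;
    public double C;
    public double D;
}

public static class SizeDemo
{
    public static void Main()
    {
        // Bytes copied each time the struct is passed by value.
        Console.WriteLine(Unsafe.SizeOf<SmallValue>()); // 8
        Console.WriteLine(Unsafe.SizeOf<LargeValue>()); // 32

        // Bytes copied when a reference is passed instead:
        // 4 in a 32-bit process, 8 in a 64-bit process.
        Console.WriteLine(IntPtr.Size);
    }
}
```

The size only tells you how much is copied per call. Whether that copy actually matters is still something to measure in the real application.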
Another concern is a default constructor. A struct always has a default (parameterless and public) constructor, otherwise the statements like
T[] array = new T[10]; // array with 10 values
would not work.
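Here is a small sketch of why that default constructor can bite. PositiveCount is a hypothetical struct (not from any library) whose invariant has no natural zero state, so the array elements start out invalid even though the only explicit constructor validates its input.

```csharp
using System;

// Hypothetical struct whose invariant (Value >= 1) excludes the zeroed state.
public struct PositiveCount
{
    private readonly int _value;

    public PositiveCount(int value)
    {
        if (value < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(value));
        }
        _value = value;
    }

    public int Value
    {
        get { return _value; }
    }
}

public static class DefaultValueDemo
{
    public static void Main()
    {
        // Elements are zero-initialized; the validating constructor never runs.
        var counts = new PositiveCount[10];
        Console.WriteLine(counts[0].Value); // 0, which violates the invariant
    }
}
```

If a type cannot tolerate its all-zero state, that is a strong hint it should be a class.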
Additionally, it's courteous for structs to overload the == and != operators and to implement the IEquatable<T> interface.
In today's world (I'm thinking C# 3.0 on .NET 3.5) I do not see a need for structs (EDIT: Apart from in some niche scenarios).
The pro-struct arguments appear to be mostly based around perceived performance benefits. I would like to see some benchmarks (that replicate a real-world scenario) that illustrate this.
The notion of using a struct for "lightweight" data structures seems way too subjective for my liking. When does data cease to be lightweight? Also, when adding functionality to code that uses a struct, when would you decide to change that type to a class?
Personally, I cannot recall the last time I used a struct in C#.
I suggest that the use of a struct in C# for performance reasons is a clear case of Premature Optimization*
* unless the application has been performance profiled and the use of a class has been identified as a performance bottleneck
MSDN states:
The struct type is suitable for representing lightweight objects such as Point, Rectangle, and Color. Although it is possible to represent a point as a class, a struct is more efficient in some scenarios. For example, if you declare an array of 1000 Point objects, you will allocate additional memory for referencing each object. In this case, the struct is less expensive.
Unless you need reference type semantics, a class that is smaller than 16 bytes may be more efficiently handled by the system as a struct.
In general, I would not recommend structs for business objects. While you MIGHT gain a small amount of performance by heading this direction, as you are running on the stack, you end up limiting yourself in some ways and the default constructor can be a problem in some instances.
I would state this is even more imperative when you have software that is released to the public.
Structs are fine for simple types, which is why you see Microsoft using structs for most of the data types. In like manner, structs are fine for objects that make sense on the stack. The Point struct, mentioned in one of the answers, is a fine example.
How do I decide? I generally default to object and if it seems to be something that would benefit from being a struct, which as a rule would be a rather simple object that only contains simple types implemented as structs, then I will think it through and determine if it makes sense.
You mention an address as your example. Let's examine one, as a class.
public class Address
{
public string AddressLine1 { get; set; }
public string AddressLine2 { get; set; }
public string City { get; set; }
public string State { get; set; }
public string PostalCode { get; set; }
}
Think through this object as a struct. In particular, consider the types of the members inside this address "struct" if you coded it that way. Do you see anything that might not work out the way you want? And consider the potential performance benefit: is there one at all?
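For comparison, this is roughly what the immutable class alternative could look like. It is only a sketch: which fields are required, the use of ordinal string comparison, and the hash combination are all my assumptions rather than anything from the question. Get-only auto-properties assume C# 6 or later.

```csharp
using System;

// Sketch of an immutable Address with value equality.
// Assumption: AddressLine2 is optional, everything else is required.
public sealed class Address : IEquatable<Address>
{
    public Address(string addressLine1, string addressLine2,
                   string city, string state, string postalCode)
    {
        if (addressLine1 == null) throw new ArgumentNullException(nameof(addressLine1));
        if (city == null) throw new ArgumentNullException(nameof(city));
        if (state == null) throw new ArgumentNullException(nameof(state));
        if (postalCode == null) throw new ArgumentNullException(nameof(postalCode));

        AddressLine1 = addressLine1;
        AddressLine2 = addressLine2 ?? string.Empty;
        City = city;
        State = state;
        PostalCode = postalCode;
    }

    public string AddressLine1 { get; }
    public string AddressLine2 { get; }
    public string City { get; }
    public string State { get; }
    public string PostalCode { get; }

    // Two addresses are equal when all fields match (ordinal comparison).
    public bool Equals(Address other)
    {
        return other != null
            && AddressLine1 == other.AddressLine1
            && AddressLine2 == other.AddressLine2
            && City == other.City
            && State == other.State
            && PostalCode == other.PostalCode;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Address);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + AddressLine1.GetHashCode();
            hash = hash * 31 + AddressLine2.GetHashCode();
            hash = hash * 31 + City.GetHashCode();
            hash = hash * 31 + State.GetHashCode();
            hash = hash * 31 + PostalCode.GetHashCode();
            return hash;
        }
    }
}
```

Note what a struct version would have looked like: five string references (well over 16 bytes on a 64-bit runtime), with the character data on the heap either way, and a default value in which every field is null.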
Factors: construction, memory requirements, boxing.
Normally, the constructor restrictions for structs (no explicit parameterless constructors, no base construction) decide whether a struct should be used at all. E.g. if the parameterless constructor should not initialize members to default values, use an immutable object.
If you still have the choice between the two, decide on memory requirements. Small items should be stored in structs especially if you expect many instances.
That benefit is lost when the instances get boxed (e.g. captured for an anonymous function or stored in a non-generic container) - you even start to pay extra for the boxing.
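A quick sketch of where boxing sneaks in, using a hypothetical Celsius struct (my own example type). Converting to object and storing in a non-generic ArrayList both box; the generic List<T> stores the values inline.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical small value type.
public struct Celsius
{
    public readonly double Degrees;

    public Celsius(double degrees)
    {
        Degrees = degrees;
    }
}

public static class BoxingDemo
{
    public static void Main()
    {
        var t = new Celsius(21.5);

        // Each conversion to object allocates a separate box on the heap.
        object boxed1 = t;
        object boxed2 = t;
        Console.WriteLine(ReferenceEquals(boxed1, boxed2)); // False

        // Non-generic container: every Add boxes the value.
        var legacy = new ArrayList();
        legacy.Add(t);

        // Generic container: values are stored inline, no boxing.
        var modern = new List<Celsius>();
        modern.Add(t);

        // Unboxing copies the value back out of the box.
        Celsius back = (Celsius)legacy[0];
        Console.WriteLine(back.Degrees == modern[0].Degrees); // True
    }
}
```

So a struct that mostly lives behind object or interface references pays for a heap allocation per instance anyway, plus the extra copy.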
What is "small", what is "many"?
The overhead for an object is (IIRC) 8 bytes on a 32-bit system. Note that with a few hundred instances, this may already decide whether or not an inner loop runs fully in cache or triggers garbage collections. If you expect tens of thousands of instances, this may be the difference between run vs. crawl.
From that POV, using structs is NOT a premature optimization.
So, as rules of thumb:
If most instances would get boxed, use immutable objects.
Otherwise, for small objects, use an immutable object only if struct construction would lead to an awkward interface and you expect not more than thousands of instances.
From an object modeling perspective, I appreciate structs because they let me use the compiler to declare certain parameters and fields as non-nullable. Of course, without special constructor semantics (like in Spec#), this is only applicable to types that have a natural 'zero' value. (Hence Bryan Watt's 'thought experiment' answer.)
How do you choose between implementing a value object (the canonical example being an address) as an immutable object or a struct?
I think your options are wrong. Immutable object and struct are not opposites, nor are they the only options. Rather, you've got four options: a mutable class, an immutable class, a mutable struct, or an immutable struct.
I argue that in .NET, the default choice should be a mutable class to represent logic and an immutable class to represent an entity. I actually tend to choose immutable classes even for logic implementations, if at all feasible. Structs should be reserved for small types that emulate value semantics, e.g. a custom Date type, a Complex number type, or similar entities. The emphasis here is on small, since you don't want to copy large blobs of data, and indirection through references is actually cheap (so we don't gain much by using structs). I tend to make structs always immutable (I can't think of a single exception at the moment). Since this best fits the semantics of the intrinsic value types, I find it a good rule to follow.
Structs are strictly for advanced users (along with out and ref).
Yes, structs can give great performance when used with ref, but you have to understand what memory they are using, who controls that memory, and so on.
If you're not using ref and out with structs, they are not worth it; if you are, expect some nasty bugs :-)
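To show both halves of that (the benefit of ref and the kind of bug you get without it), here is a minimal sketch with a hypothetical Particle struct and two made-up helper methods. Passing by value copies all 48 bytes and throws the update away; passing by ref copies nothing and updates the caller's variable.

```csharp
using System;

// Hypothetical struct, large enough that copying it is not free.
public struct Particle
{
    public double X, Y, Z;
    public double VX, VY, VZ;
}

public static class RefDemo
{
    // Receives a copy: the caller never sees the update.
    public static void StepByValue(Particle p, double dt)
    {
        p.X += p.VX * dt;
    }

    // Receives an alias to the caller's variable: no copy, update is visible.
    public static void StepByRef(ref Particle p, double dt)
    {
        p.X += p.VX * dt;
    }

    public static void Main()
    {
        var p = new Particle { VX = 2.0 };

        StepByValue(p, 1.0);
        Console.WriteLine(p.X); // 0: the copy was updated, not p

        StepByRef(ref p, 1.0);
        Console.WriteLine(p.X); // 2
    }
}
```

The by-value version compiles without a warning, which is the sort of quiet bug that makes mutable structs risky.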