views:

233

answers:

6

Well, I guess it's clear the direction from which I am trying to reason about this. There is a lot of talk these days about the virtues of immutability (constness) whenever possible. The Concurrent programming in Java book talks a lot about this too.

However, all this is just what I've read about. I personally, haven't coded much in Functional languages. And it looks very surprising to me that it is possible to work comfortably with immutable objects. Theoretically, it is definitely possible. But, from a practical point of view, is a very comfortable experience. Or what is the new kind of reasoning (for FP) that I must develop so that I don't need mutability so much.

I would appreciate how to think about writing programs when you are forced to use immutable objects.

  • Already making this a Community Wiki ... for all those who are interested in closing this question or marking this as subjective ... etc. etc. ... *
+3  A: 

Immutability has, for example, benefits in muti-threaded programs. As immutable objects cannot change their state after construction, you can safe share them across any number of concurrently running threads, without the risk of one interfering (by altering the state of an object visible to other threads) with another.

Another advantage is, that it is easier to reason about the semantics of a program written in functional style (and thus, without side effects). Functional programming is by far more declarative in nature, emphasizing what the result should be and less, how it is to be achieved. Immutable data structures can help to make your program more functional in style.

Mark Chu-Carrol has a nice blog entry about that topic.

Dirk
A: 

Honestly, writing programs without mutable state is hard. The few times I have tried, the only way I've been able to think of it is that you want to copy rather than mutate. For example, instead of looping, you'd create a new stack frame with the values you want.

Imperative:

for(int num = 0; num < 10; num++) {
    doStuff(num);
}

Functional:

def loop(num) :
    doStuff(num)
    if(num < 10) :
        loop(num + 1)

In this case, you copy num on every iteration and change its value in the act of copying it.

For me it's a very uncomfortable experience. You can write programs without using mutable state in any high-level language. Functional languages take away a major option. That said, even when you're not in a concurrent environment, using immutability where you can do so without radically changing your programming style makes programs easier to reason about because you know what can and can't be modified.

dsimcha
The few times you have tried? It doesn't sound like you speak with much experience or authority on the matter.
Chuck
More idiomatic functional style uses pattern matching, which reads like this: `let rec loop = function 10 -> () | n -> doStuff(n); loop (n - 1)`. Very readable. Of course, most FP code doesn't require explicit looping as shown above. Most loops are abstracted away as follows: `[1 .. 10] |> List.iter (fun x -> doStuff(x))`
Juliet
Saying "functional languages take away a major option [mutable state]" sounds to me not much different than "Java takes away the option to use inline assembly". Yes, the option is gone, and if you only knew how to code by using that feature, then it feels awkward. But the limitation can be liberating -- especially to the poor schmuck who has to maintain your code.
Ken
Lack of inline assembly is one reason why I don't use Java. :) I don't want a language that believes in "one true way" of programming, no matter what that way is. I want a language that gives me choices so I can pick whatever style makes sense for the problem at hand.
dsimcha
BTW, you could more concisely write Juliet's example as `[1 .. 10] |> List.iter doStuff`, which is a lot more concise and clear than the imperative C loop IMO. And if the function is generating a permutation of the number, you would just change it to `[1 .. 10] |> List.map doStuff`, which compares even more favorably to the C version of that algorithm.
Chuck
+5  A: 

Immutability has several advantages, including (but not limited to):

  • Programs with immutable objects are less complicated to think about, since you don't need to worry about how an object may evolve over time.
  • You don't need to make defensive copies of immutable objects when returning or passing to other functions, since there is no possibility an immutable object will be modified behind your back.
  • One copy of an object is just as good as another, so you can cache objects or re-use the same object multiple times.
  • Immutable objects are good for sharing information between threads in a multi-threaded environment since they don't need to be synchronized.
  • Operations on immutable objects return new immutable objects while operations that cause side-effects on mutable objects usually return void. This means several operations can be chained together. For instance

("foo" + "bar" + "baz").length()

  • In languages where functions are first class values, operations like map, reduce, filter, etc. are basic operations on collections. These can be combined in many ways, and can replace most loops in a program.

There are of course some disadvantages:

  • Cyclic data structures such as graphs are difficult to build. If you have two objects which can't be modified after initialization, how can you get them to point to each other?
  • Allocating lots and lots of small objects rather than modifying ones you already have can have a performance impact. Usually the complexity of either the allocator or the garbage collector depends on the number of objects on the heap.
  • Naive implementations of immutable data structures can result in extremely poor performance. For instance, concatenating many immutable strings (like in Java) is O(n2) when the best algorithm is O(n). It is possible to write efficient immutable data structures, it just takes a little more thought.
Jay Conrod
I personally find immutable data structures easier to write than their mutable variants. Regarding strings, they're just a crappy data structure. Immutable Ropes ( http://en.wikipedia.org/wiki/Rope_(computer_science) ) provide O(1) concats and log(n) substrings.
Juliet
How is the last point on map/reduce/filter a pro for immutability? Seems that's just as true with mutable objects as immutable.
Roger Pate
it's showing that you don't, in many cases, actually need to write the loop yourself. That this only works well in languages that allow first class functions (and normally closures) means that few largely imperative languages allow it to work well but many imperative languages now support these constructs natively (mainly due to GC being more widespread as implementing non stack bound closures without this is painful)
ShuggyCoUk
@Roger Pate, map/reduce/filter replaces the mutable way of doing things: writing a loop that modifies a collection in place (map), that accumulates some value by mutating it (reduce), or that mutates a collection by removing some of its elements (filter).
Jay Conrod
+2  A: 

Many functional languages are non pure (allow mutation and side effects).

f# is for example, and if you look at some of the very low level constructs in the collections you'll find that several use iteration under the hood and quite a few use some mutable state (if you want to take the first n elements of a sequence it's so much easier to have a counter for example).

The trick is that this is something to generally:

  1. Use sparingly
  2. Draw attention to when you do
    • note how in f# you must declare something to be mutable

That it is possible to largely avoid mutating state is evidenced by the large amount of functional code out there. For people brought up on imperative languages this is somewhat hard to get your head round, especially writing code previously in loops as recursive functions. Even trickier is then writing them, where possible, as tail recursive. Knowing how to do this is beneficial and can lead to far more expressive solutions that focus on the logic rather than the implementation. Good examples are those that deal with collections where the 'base cases' of having no, one or many elements are cleanly expressed rather than being part of the loop logic.

It is really 2 though that things are better. And this is best done via an example:

Take your code base and change every instance variable to readonly[1][2]. Change back only those ones where you need them to be mutable for your code to function (if you only set them once outside the constructor consider trying to make them arguments to the constructor rather than mutable via something like a property.

There are some code bases this will not work well with, gui/widget heavy code and some libraries (notably mutable collections) for example but I would say that most reasonable code will allow over 50% of all instance fields to be made readonly.

At this point you must ask yourself, "why is mutable the default?". Mutable fields are in fact a complex aspect of your program as their interactions, even in a single threaded world, have far more scope for differing behaviour; as such they are best highlighted and drawn to the attention of the coder rather than left 'naked' to the ravages of the world.

It is notable that most functional languages have either no concept of null or make it very hard to use because they work, not with variables, but with named values whose value is defined at the same time (well scope) the name is.


  1. I find it unfortunate that c# did not copy java's concept of immutability with local variables too. Being able to assert emphatically that something doesn't change helps make intent clear whether a value is on the stack or in an object/struct.

  2. If you have NDepend then you can find these with WARN IF Count > 0 IN SELECT FIELDS WHERE IsImmutable AND !IsInitOnly

ShuggyCoUk
Instead of null, most strongly functional languages have something called an "option type". It's a type with two constructors — one that contains a value (e.g., `Some "Hello"` in OCaml and F#) and one that contains no value, which is equivalent to null. Then you can pattern match like `match someNullableValue with Some x -> x | None -> "<No Value Set>"`. This has the advantage of making it explicit where nulls can occur, eliminating those annoying errors we all know and love.
Chuck
indeed, they make it explicit, it's still not null though. Option<T> is strongly typed so you can still reflect on it as well as ToString it without worrying about NullReferenceException for example.
ShuggyCoUk
True, and in such statically typed languages, if you try to pass an Option<T> where a T is expected, that's an easily caught compile-time type error rather than a NullReferenceException that makes your app go boom at random.
Chuck
I really like how Option<T> works with anything, unlike Nullable that only allows reference types. I see why they did this but it's means putting the sort of null checking possible from spec# into c# proper will be that much more verbose.
ShuggyCoUk
+1  A: 

With immutable data structures, it is more feasible for operations on the data structure to share structure, allowing copies to be cheap. (Clojure does this)

Using recursion with immutable data structures works well, since you would be passing you data structure into the recursive call anyways. First class functions help to factor out the nitty-gritty details.

Brian
+3  A: 

But, from a practical point of view, is a very comfortable experience.

I like to use F# for most of my functional programming. For what its worth, you can write functional code in C#, its just really really nasty and ugly to read. Additionally, I find GUI development resists a functional programming style.

Fortunately, business code seems to adapt really well to functional style :) That, and web development too -- think about, each HTTP request is stateless. Each time you "modify" state, you pass the server some state, and it returns back an entirely new page.

I would appreciate how to think about writing programs when you are forced to use immutable objects.

Immutable objects should be small

For the most part, I find immutable data structures easiest to work with when the objects have fewer than 3 or 4 intrinsic properties. For example, each node in a red-black tree has 4 properties: a color, a value, a left child, and right-child. A stack has two properties, a value and a pointer to the next stack node.

Consider your company's database, you might have tables with 20, 30, 50 properties. If you need to modify those objects throughout your application, then I'd definitely resist the urge to make those immutable.

C# / Java / C++ aren't good functional languages. Use Haskell, OCaml, or F# instead

In my own experience, immutable objects are 1000x easier to read and write in ML-like languages than C-like languages. I'm sorry, but once you have pattern matching and union types, you can't give them up :) Additionally, some data structures can take advantage of tail-call optimization, a feature you just don't get in some C-like languages.

But just for fun, here's an unbalanced binary tree in C#:

class Tree<T> where T : IComparable<T>
{
    public static readonly ITree Empty = new Nil();

    public interface ITree
    {
        ITree Insert(T value);
        bool Exists(T value);
        T Value { get; }
        ITree Left { get; }
        ITree Right { get; }
    }

    public sealed class Node : ITree
    {
        public Node(T value, ITree left, ITree right)
        {
            this.Value = value;
            this.Left = left;
            this.Right = right;
        }

        public ITree Insert(T value)
        {
            switch(value.CompareTo(this.Value))
            {
                case 0 : return this;
                case -1: return new Node(this.Value, this.Left.Insert(value), this.Right);
                case 1: return new Node(this.Value, this.Left, this.Right.Insert(value));
                default: throw new Exception("Invalid comparison");
            }
        }

        public bool Exists(T value)
        {
            switch (value.CompareTo(this.Value))
            {
                case 0: return true;
                case -1: return this.Left.Exists(value);
                case 1: return this.Right.Exists(value);
                default: throw new Exception("Invalid comparison");
            }
        }

        public T Value { get; private set; }
        public ITree Left { get; private set; }
        public ITree Right { get; private set; }
    }

    public sealed class Nil : ITree
    {
        public ITree Insert(T value)
        {
            return new Node(value, new Nil(), new Nil());
        }

        public bool Exists(T value) { return false; }

        public T Value { get { throw new Exception("Empty tree"); } }
        public ITree Left { get { throw new Exception("Empty tree"); } }
        public ITree Right { get { throw new Exception("Empty tree"); } }
    }
}

The Nil class represents an empty tree. I prefer this representation over the null representation because null checks are the devil incarnate :)

Whenever we add a node, we create a brand new tree with the node inserted. This is more efficient than it sounds, because we don't need to copy all the nodes in the tree; we only need to copy nodes "on the way down" and reuse any nodes which haven't changed.

Let's say we have a tree like this:

       e
     /   \
    c     s
   / \   / \
  a   b f   y

Alright, now we want to insert w into the list. We're going to start at the root e, move to s, then to y, then replace y's left child with a w. We need to create copy of the nodes on the way down:

       e                     e[1]
     /   \                 /   \
    c     s      --->     c     s[1]
   / \   / \             / \    /\  
  a   b f   y           a   b  f  y[1]
                                  /
                                 w

Alright, now we insert a g:

       e                     e[1]                   e[2]
     /   \                 /   \                   /    \
    c     s      --->     c     s[1]      -->     c      s[2]
   / \   / \             / \    /\               / \     /   \
  a   b f   y           a   b  f  y[1]          a   b  f[1]   y[1]
                                  /                      \    /
                                 w                        g  w

We reuse all of the old nodes in the tree, so there's no reason to rebuild the entire tree from scratch. This tree has the same computational complexity as its mutable counterpart.

Its also pretty easy to write immutable versions of red-black trees, AVL trees, tree-based heaps, and many other data structures.

Juliet
+1 I edited the grammar, hope you don't mind
ShuggyCoUk