views:

878

answers:

3

I'm trying to learn about catamorphisms and I've read the Wikipedia article and the first couple posts in the series of the topic for F# on the Inside F# blog.

I understand that it's a generalization of folds (i.e., mapping a structure of many values to one value, including a list of values to another list). And I gather that the fold-list and fold-tree is a canonical example.

Can this be shown to be done in C#, using LINQ's Aggregate operator or some other higher-order method?

+4  A: 

I've been doing more reading, including a Micorosft Research paper on functional programming with catamorphisms ("bananas"), and it seems that catamorphism just refers to any function that takes a list and typically breaks it down to a single value (IEnumerable<A> => B), like Max(), Min(), and in the general case, Aggregate(), would all be a catamorphisms for lists.

I was previously under the impression that it refered to a way of creating a function that can generalize different folds, so that it can fold a tree and a list. There may actually still be such a thing, some kind of functor or arrow maybe but right now that's beyond my level of understanding.

Mark Cidade
Why did you ask a question and then just answer it immediately yourself? Obviously you already know the answer.
Phil Wright
I didn't know the answer when I posted the question and didn't anticipate that I'd come across something short after that that would give me the answer. I still don't know if that's all there is to it. StackOverflow users are encouraged to answer their own question.
Mark Cidade
+7  A: 

LINQ's Aggregate() is just for IEnumerables. Catamorphisms in general refer to the pattern of folding for an arbitrary data type. So Aggregate() is to IEnumerables what FoldTree (below) is to Trees (below); both are catamorphisms for their respective data types.

I translated some of the code in part 4 of the series into C#. The code is below. Note that the equivalent F# used three less-than characters (for generic type parameter annotations), whereas this C# code uses more than 60. This is evidence why no one writes such code in C# - there are too many type annotations. I present the code in case it helps people who know C# but not F# play with this. But the code is so dense in C#, it's very hard to make sense of.

using System;
using System.Collections.Generic;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Input;
using System.Windows.Media;
using System.Windows.Shapes;

class Tree<T>   // use null for Leaf
{
    public T Data { get; private set; }
    public Tree<T> Left { get; private set; }
    public Tree<T> Right { get; private set; }
    public Tree(T x, Tree<T> l, Tree<T> r)
    {
        this.Data = x;
        this.Left = l;
        this.Right = r;
    }
}
class Tree
{
    public static Tree<T> Node<T>(T x, Tree<T> l, Tree<T> r) { return new Tree<T>(x, l, r); }
    public static Tree<int> Tree7 = Node(4, Node(2, Node(1, null, null), Node(3, null, null)),
                                            Node(6, Node(5, null, null), Node(7, null, null)));
    public static R XFoldTree<A, R>(Func<A, R, R, Tree<A>, R> nodeF, Func<Tree<A>, R> leafV, Tree<A> tree)
    {
        return Loop(nodeF, leafV, tree, x => x);
    }
    public static R Loop<A, R>(Func<A, R, R, Tree<A>, R> nodeF, Func<Tree<A>, R> leafV, Tree<A> t, Func<R, R> cont)
    {
        if (t == null)
            return cont(leafV(t));
        else
            return Loop(nodeF, leafV, t.Left, lacc =>
                   Loop(nodeF, leafV, t.Right, racc =>
                   cont(nodeF(t.Data, lacc, racc, t))));
    }
    public static R FoldTree<A, R>(Func<A, R, R, R> nodeF, R leafV, Tree<A> tree)
    {
        return XFoldTree((x, l, r, _) => nodeF(x, l, r), _ => leafV, tree);
    }
    public static Func<Tree<A>, Tree<A>> XNode<A>(A x, Tree<A> l, Tree<A> r)
    {
        return (Tree<A> t) => x.Equals(t.Data) && l == t.Left && r == t.Right ? t : Node(x, l, r);
    }
    // DiffTree: Tree<'a> * Tree<'a> -> Tree<'a * bool> 
    // return second tree with extra bool 
    // the bool signifies whether the Node "ReferenceEquals" the first tree 
    public static Tree<KeyValuePair<A, bool>> DiffTree<A>(Tree<A> tree, Tree<A> tree2)
    {
        return XFoldTree((A x, Func<Tree<A>, Tree<KeyValuePair<A, bool>>> l, Func<Tree<A>, Tree<KeyValuePair<A, bool>>> r, Tree<A> t) => (Tree<A> t2) =>
            Node(new KeyValuePair<A, bool>(t2.Data, object.ReferenceEquals(t, t2)),
                 l(t2.Left), r(t2.Right)),
            x => y => null, tree)(tree2);
    }
}
class Example
{
    // original version recreates entire tree, yuck 
    public static Tree<int> Change5to0(Tree<int> tree)
    {
        return Tree.FoldTree((int x, Tree<int> l, Tree<int> r) => Tree.Node(x == 5 ? 0 : x, l, r), null, tree);
    }
    // here it is with XFold - same as original, only with Xs 
    public static Tree<int> XChange5to0(Tree<int> tree)
    {
        return Tree.XFoldTree((int x, Tree<int> l, Tree<int> r, Tree<int> orig) =>
            Tree.XNode(x == 5 ? 0 : x, l, r)(orig), _ => null, tree);
    }
}
class MyWPFWindow : Window 
{
    void Draw(Canvas canvas, Tree<KeyValuePair<int, bool>> tree)
    {
        // assumes canvas is normalized to 1.0 x 1.0 
        Tree.FoldTree((KeyValuePair<int, bool> kvp, Func<Transform, Transform> l, Func<Transform, Transform> r) => trans =>
        {
            // current node in top half, centered left-to-right 
            var tb = new TextBox();
            tb.Width = 100.0; 
            tb.Height = 100.0;
            tb.FontSize = 70.0;
                // the tree is a "diff tree" where the bool represents 
                // "ReferenceEquals" differences, so color diffs Red 
            tb.Foreground = (kvp.Value ? Brushes.Black : Brushes.Red);
            tb.HorizontalContentAlignment = HorizontalAlignment.Center;
            tb.VerticalContentAlignment = VerticalAlignment.Center;
            tb.RenderTransform = AddT(trans, TranslateT(0.25, 0.0, ScaleT(0.005, 0.005, new TransformGroup())));
            tb.Text = kvp.Key.ToString();
            canvas.Children.Add(tb);
            // left child in bottom-left quadrant 
            l(AddT(trans, TranslateT(0.0, 0.5, ScaleT(0.5, 0.5, new TransformGroup()))));
            // right child in bottom-right quadrant 
            r(AddT(trans, TranslateT(0.5, 0.5, ScaleT(0.5, 0.5, new TransformGroup()))));
            return null;
        }, _ => null, tree)(new TransformGroup());
    }

    public MyWPFWindow(Tree<KeyValuePair<int, bool>> tree)
    {
        var canvas = new Canvas();
        canvas.Width=1.0;
        canvas.Height=1.0;
        canvas.Background = Brushes.Blue;
        canvas.LayoutTransform=new ScaleTransform(200.0, 200.0);
        Draw(canvas, tree);
        this.Content = canvas;
        this.Title = "MyWPFWindow";
        this.SizeToContent = SizeToContent.WidthAndHeight;
    }
    TransformGroup AddT(Transform t, TransformGroup tg) { tg.Children.Add(t); return tg; }
    TransformGroup ScaleT(double x, double y, TransformGroup tg) { tg.Children.Add(new ScaleTransform(x,y)); return tg; }
    TransformGroup TranslateT(double x, double y, TransformGroup tg) { tg.Children.Add(new TranslateTransform(x,y)); return tg; }

    [STAThread]
    static void Main(string[] args)
    {
        var app = new Application();
        //app.Run(new MyWPFWindow(Tree.DiffTree(Tree.Tree7,Example.Change5to0(Tree.Tree7))));
        app.Run(new MyWPFWindow(Tree.DiffTree(Tree.Tree7, Example.XChange5to0(Tree.Tree7))));
    }
}
Brian
My eyes! But seriously I've tried using an example like this while comparing it to some F# for my coworkers and their response was why would I write this in the first place.
gradbot
Exactly. When all you know is C#, the very notion that computing the height of a tree and transforming a tree and drawing a tree on a WPF Canvas--that these disparate tasks--share a common core, feels crazy. And the code for ``Tree`` just strengthens that case, being so impenetrable. But then you go code them in F#, and individual one-off implementations have the stink of DRY (don't repeat yourself), and the commonality is there, manifest, and the language lets you capture the essense of the commonality. And it's like crack, and you wonder what other abstractions you've been missing all along.
Brian
+2  A: 

I understand that it's a generalization of folds (i.e., mapping a structure of many values to one value, including a list of values to another list).

I wouldn't say one value.It maps it into another structure.

Maybe an example would clarify.let's say summation over a list.

foldr (\x -> \y -> x + y) 0 [1,2,3,4,5]

Now this would reduce to 15. But actually,it can be viewed mapping to a purely syntactic structure 1 + 2 + 3 + 4 + 5 + 0. It is just that the programming language(in the above case,haskell) knows how to reduce the above syntactic structure to 15.

Basically,a catamorphism replaces one data constructor with another one.In case of above list,

[1,2,3,4,5] = 1:2:3:4:5:[] (: is the cons operator,[] is the nil element) the catamorphism above replaced : with + and [] with 0.

It can be generalized to any recursive datatypes.

Thiagarajan