views:

236

answers:

4

I was just doing some research on RedBlack Tree. I knew that SortedSet class in .Net 4.0 uses RedBlack tree. So I took that part out as is using Reflector and created a RedBlackTree class. Now I am running some perf test on this RedBlackTree and SortedSet inserting 40000 sequential integral values (starting from 0 to 39999), I am astonished to see that there is huge perf difference as follows:

 RBTree    took 9.27208   sec to insert 40000 values 
 SortedSet took 0.0253097 sec to insert 40000 values

What is the reason behind it? BTW I ran the test in Release configuration only and here is the small test code

            var stopWatch = new Stopwatch();
            var rbT = new RedBlackTree<int>();      
        stopWatch = new Stopwatch();
        stopWatch.Start();
        for (int i = 0; i < 40000; i++) {
            rbT.Add(i);
        }
        stopWatch.Stop();
        Console.WriteLine(stopWatch.Elapsed);

        var ss = new SortedSet<int>();
        stopWatch = new Stopwatch();
        stopWatch.Start();
        for (int i = 0; i < 40000; i++) {
            ss.Add(i);
        }
        stopWatch.Stop();
        Console.WriteLine(stopWatch.Elapsed);

Edit

I would like to share the code also for RBTree what I've extracted so that you also can run the diagnostics

public class Node<T>
    {
        public Node(){}

        public Node(T value)
        {
            Item = value;
        }       

        public Node(T value, bool isRed)
        {
            Item = value;
            IsRed = isRed;
        }

        public T Item;
        public Node<T> Left;
        public Node<T> Right;
        public Node<T> Parent;
        public bool IsRed;
    }

    public class RedBlackTree<T>
    {
        public RedBlackTree(){} 

        public Node<T> root;
        int count, version; 
        Comparer<T> comparer = Comparer<T>.Default;     

        public void Add(T item)
        {
            if (this.root == null)
            {
                this.root = new Node<T>(item, false);
                this.count = 1;
                this.version++;
                return;
            }

            Node<T> root = this.root;
            Node<T> node = null;
            Node<T> grandParent = null;
            Node<T> greatGrandParent = null;
            this.version++;

            int num = 0;
            while (root != null)
            {
                num = this.comparer.Compare(item, root.Item);
                if (num == 0)
                {
                    this.root.IsRed = false;
                    return;
                }
                if (Is4Node(root))
                {
                    Split4Node(root);
                    if (IsRed(node))
                    {
                        this.InsertionBalance(root, ref node, grandParent, greatGrandParent);
                    }
                }
                greatGrandParent = grandParent;
                grandParent = node;
                node = root;
                root = (num < 0) ? root.Left : root.Right;
            }
            Node<T> current = new Node<T>(item);
            if (num > 0)
            {
                node.Right = current;
            }
            else
            {
                node.Left = current;
            }
            if (node.IsRed)
            {
                this.InsertionBalance(current, ref node, grandParent, greatGrandParent);
            }
            this.root.IsRed = false;
            this.count++;
        }


        private static bool IsRed(Node<T> node)
        {
            return ((node != null) && node.IsRed);
        }

        private static bool Is4Node(Node<T> node)
        {
            return (IsRed(node.Left) && IsRed(node.Right));
        }

        private static void Split4Node(Node<T> node)
        {
            node.IsRed = true;
            node.Left.IsRed = false;
            node.Right.IsRed = false;
        }

        private void InsertionBalance(Node<T> current, ref Node<T> parent, Node<T> grandParent, Node<T> greatGrandParent)
        {
            Node<T> node;
            bool flag = grandParent.Right == parent;
            bool flag2 = parent.Right == current;
            if (flag == flag2)
            {
                node = flag2 ? RotateLeft(grandParent) : RotateRight(grandParent);
            }
            else
            {
                node = flag2 ? RotateLeftRight(grandParent) : RotateRightLeft(grandParent);
                parent = greatGrandParent;
            }
            grandParent.IsRed = true;
            node.IsRed = false;
            ReplaceChildOfNodeOrRoot(greatGrandParent, grandParent, node);
        }

        private static Node<T> RotateLeft(Node<T> node)
        {
            Node<T> right = node.Right;
            node.Right = right.Left;
            right.Left = node;
            return right;
        }

        private static Node<T> RotateRight(Node<T> node)
        {
            Node<T> left = node.Left;
            node.Left = left.Right;
            left.Right = node;
            return left;
        }

        private static Node<T> RotateLeftRight(Node<T> node)
        {
            Node<T> left = node.Left;
            Node<T> right = left.Right;
            node.Left = right.Right;
            right.Right = node;
            left.Right = right.Left;
            right.Left = left;
            return right;
        }

        private static Node<T> RotateRightLeft(Node<T> node)
        {
            Node<T> right = node.Right;
            Node<T> left = right.Left;
            node.Right = left.Left;
            left.Left = node;
            right.Left = left.Right;
            left.Right = right;
            return left;
        }

        private void ReplaceChildOfNodeOrRoot(Node<T> parent, Node<T> child, Node<T> newChild)
        {
            if (parent != null)
            {
                if (parent.Left == child)
                {
                    parent.Left = newChild;
                }
                else
                {
                    parent.Right = newChild;
                }
            }
            else
            {
                this.root = newChild;
            }
        }
    }

Edit


I ran the same diagnostic on some other data structure (some created by me*, some from .net framework**) and here is the interesting results

*AATree                 00:00:00.0309294
*AVLTree                00:00:00.0129743
**SortedDictionary      00:00:00.0313571
*RBTree                 00:00:09.2414156
**SortedSet             00:00:00.0241973

RBTree is the same as above (stripped out from SortedSet class). I tried with 400000 values also, but RBTree seems taking FOREVER, I really don't know why.

A: 

If the difference wasn't that big I would suggest that the cause is that the .NET assemblies are NGen-ed and so they are already translated to native code. In the case of your class the time to compile the IL code into native code is amortized over the time of your test. How does increasing the number of loop iterations affect the times?

Filip Navara
+1  A: 

SortedSet includes a TargetedPatchingOptOut attribute, did your copied version include that?

[TargetedPatchingOptOut("Performance critical to inline this type of method across NGen image boundaries")]
public bool Add(T item)
{
    return this.AddIfNotPresent(item);
}
Hightechrider
It has nothing to do with performance. As of MSDN itindicates that the .NET Framework class library method to which this attribute is applied is unlikely to be affected by servicing releases, and therefore is eligible to be inlined across Native Image Generator (NGen) images.
Anindya Chatterjee
I doubt the performance difference is due to method inlining across Native Image Generator (NGen) images.
dtb
@Anindya Chatterjee - The methods are small, so inlining them can play a significant role in the performance.
Filip Navara
+3  A: 
  1. reverse the order of the tests and repeat the measurement.
  2. randomize your data. Sorted sets behave strangely when you insert pre-sorted data.
Daniel Mošmondor
You are right! Randomize data give comparable results. The perf difference is negligible. But I am talking about a worst case scenario here. Why in this particular situation those two behave differently where SortedList is implemented by RBTree itself?
Anindya Chatterjee
So they were 2 totally different algorithms. Another goose chase.
Henk Holterman
+5  A: 

You have a bug in your Node<T> class. When you call the constructor that only takes a single value argument you should be setting IsRed to true.

I suppose that the fixed Node<T> class should look something like this:

public sealed class Node<T>
{
    public T Item { get; private set; }
    public bool IsRed { get; set; }
    public Node<T> Left { get; set; }
    public Node<T> Right { get; set; }

    public Node(T value)
    {
        Item = value;
        IsRed = true;
    }

    public Node(T value, bool isRed)
    {
        Item = value;
        IsRed = isRed;
    }
}

Another option -- my preference -- would be to omit that constructor altogether and always require IsRed to be set explicitly when you instantiate a new node:

public sealed class Node<T>
{
    public T Item { get; private set; }
    public bool IsRed { get; set; }
    public Node<T> Left { get; set; }
    public Node<T> Right { get; set; }

    public Node(T value, bool isRed)
    {
        Item = value;
        IsRed = isRed;
    }
}

And then replace this line in your Add method...

Node<T> current = new Node<T>(item);

...with this...

Node<T> current = new Node<T>(item, true);
LukeH
Oh! you the man. I just changed it and ran the diagnostic and guess what the respones time now is - 00.0204438 sec for the same set of data. Thanks a lot man, you save the day.
Anindya Chatterjee
Brilliant. A lot including me were speculating about NGen, Reverse Order, compilation issues... and it appears it came from the code :)
controlbreak