views:

319

answers:

3

For my Algorithms & Data Structures class, I've been tasked with implementing a splay tree in Haskell. My algorithm for the splay operation is as follows:

  1. If the node to be splayed is the root, the unaltered tree is returned.
  2. If the node to be splayed is one level from the root, a zig operation is performed and the resulting tree is returned.
  3. If the node to be splayed is two or more levels from the root, a zig-zig or zig-zag operation is performed on the result of splaying the subtree starting at that node, and the resulting tree is returned.

This is valid according to my teacher. However, the Wikipedia description of a splay tree says the zig step "will be done only as the last step in a splay operation" whereas in my algorithm it is the first step in a splay operation.

I want to implement a splay tree that performs the zig operation last instead of first, but I'm not sure how it would best be done. It seems to me that such an algorithm would become more complex, seeing as how one needs to find the node to be splayed before it can be determined whether a zig operation should be performed or not.

How can I implement this in Haskell (or some other functional language)?

Example

In this example we search for the value 4, prompting us to splay it to the top of the tree.

My algorithm (zig as the first step)

1             1                   4
 \             \                 /
  2      zig    2    zig-zig    2
   \     -->     \   ------>   / \
    3             4           1   3
     \           /
      4         3

Wikipedia algorithm (zig as the last step)

1                   1           4
 \                   \         /
  2      zig-zig      4  zig  1
   \     ------>     /   -->   \
    3               3           3
     \             /           /
      4           2           2

Both trees are valid, but they have different structures. I want to implement the second one in a functional language, preferably Haskell.

A: 

Are you sure you're reading the Wikipedia description correctly? There are three kinds of steps: "zig", "zig-zig", and "zig-zag". The "zig" step is defined to be something that happens only when x is a child of the root. Despite the names, the "zig-zig" and "zig-zag" steps don't have "zig" steps as a first component.

It sounds to me like your implementation follows the Wikipedia description in this respect.

Travis Brown
Like you say, zig is defined to be something that happens only when `x` is a child of the root. In my algorithm, however, it happens anywhere in the tree, as the last step of the entire splay operation. My question doesn't concern the implementation of the zig-zig and zig-zag steps.
Jakob
Wait, doesn't it match the Wikipedia description then? In what sense is it the "first step" as well? Maybe it would be be clearer if you could post some code?
Travis Brown
I don't think any code I could post would make my question any clearer, since it would only follow the algorithm I've already posted. I can put together an example of splaying a sample tree in the two different ways, though.
Jakob
Check out my updated question for an example of the difference between the two algorithms.
Jakob
The original paper discusses "top-down" splaying, methods in §4, though only in the context of specific tree and pointer representation schemes, which make it hard to see abstractly. The paper states without proof that the efficiency Lemmas hold for these methods too. Hence, Jakob, your original approach was a valid splay tree implementation. None the less, trying to match the "bottom-up" method is an interesting exercise. I'd be curious to see who the Haskell code of the two approaches compare.
MtnViewMark
A: 

You can ref this course, which has a very good lecture note with code in OCaml for Splay tree.

Yin Zhu
That code uses the same algorithm, however. The lecture notes point out that "the case of a simple rotation occurs at the bottom of the tree rather than the top."
Jakob
+1  A: 

The key is to build a path to the value to be splayed, then rebuild the tree from the bottom, two levels at a time if possible (so that the zig-zip vs. zig-zag determination can be made):

data Tree a = Empty | Node a (Tree a) (Tree a)
    deriving (Eq, Show)

data Direction = LH | RH
    deriving (Eq, Show)


splay :: (Ord a) => a -> Tree a -> Tree a
splay a t = rebuild $ path a t [(undefined,t)]
    where path a Empty ps = ps
          path a n@(Node b l r) ps =
              case compare a b of
                  EQ -> ps
                  LT -> path a l $ (LH, l) : ps
                  GT -> path a r $ (RH, r) : ps

          rebuild :: (Ord a) => [(Direction,Tree a)] -> Tree a
          rebuild ((_,n):[]) = n
          rebuild ((LH,x):(_,p):[]) = zigL x p
          rebuild ((RH,x):(_,p):[]) = zigR x p
          rebuild ((LH,x):(LH,p):(z,g):ps) = rebuild $ (z, zigzigL x p g):ps
          rebuild ((RH,x):(RH,p):(z,g):ps) = rebuild $ (z, zigzigR x p g):ps
          rebuild ((RH,x):(LH,p):(z,g):ps) = rebuild $ (z, zigzagL x p g):ps
          rebuild ((LH,x):(RH,p):(z,g):ps) = rebuild $ (z, zigzagR x p g):ps

          zigL (Node x a b) (Node p _ c) = Node x a (Node p b c)
          zigR (Node x a b) (Node p c _) = Node x (Node p c a) b

          zigzigL (Node x a b) (Node p _ c) (Node g _ d) =
              Node x a (Node p b (Node g c d))

          zigzigR (Node x a b) (Node p c _) (Node g d _) =
              Node x (Node p (Node g d c) a) b

          zigzagL (Node x b c) (Node p a _) (Node g _ d) =
              Node x (Node p a b) (Node g c d)

          zigzagR (Node x b c) (Node p _ a) (Node g d _) =
              Node x (Node g d b) (Node p c a)

You can find this code, along with runnable unit tests and quick checks in my repo.

MtnViewMark
The `reverse` is a small sacrifice, which might be avoidable at the expense of a complicated fold. This code doesn't handle trying to splay a value that isn't in the tree. And I can think of several small clean-ups... but it's late and I wanted to get you the basic idea. I'll see if I can clean it up somemore tomorrow.
MtnViewMark
Refactored the code to make it clear, and added link to runnable version w/tests. Enjoy!
MtnViewMark
Removed the reverse, no odd fold needed. (Amazing what thinking in the shower can do!)
MtnViewMark
This seems to be working out great! As an aside, I still don't know a lot about calculating algorithmic complexity; does this have the same complexity as the original algorithm?
Jakob
The complexity is the same: `path` does a scan down to the element to splay. This is of course O(log n), and it is doing constant work per step (building a list). `rebuild` then scans that list, reducing it by one or two elements each step. The work it is doing is O(1) per step (no step involves a loop or traversal), so `rebuild` is then O(log n). Finally, `splay` is just two O(log n) processes strung together, so it is O(log n) as whole.
MtnViewMark