views:

612

answers:

1

I have a large AVL Tree which I build some times during program from an unsorted collection (it will be used to insert/remove items later).

Is there any better algorithm than using the simple insert on each item? Will it be more efficient to sort the collection first and then try to build it differently?

Profiling of my application tells me that this AVL building is a hotspot location.

+1  A: 

If the data conveniently fit into memory, I would indeed expect that doing a quicksort first, and building the tree from that would be faster than doing all the regular insertions.

To build the tree from an array, operate in a recursive manner, splitting the tree into three parts: a middle element, the left part, and the right part; both parts must have the same size (+-1), then form trees out of these parts. That guarantees that the resulting tree is nearly balanced (it will be perfectly balanced if the number of elements is 2^n-1). Tree creation should return the height of the tree so that you can put the balance conveniently into each node.

Edit: Assuming Ian Piumarta's tree.h, I believe this algorithm should do the trick:

Node* tree_build(int key[], int value[], int L, int R) // L and R inclusive
{

  int M;
  Node *middle;
  int lh, rh;

  if(L == R)
    return Node_new(key[L], value[L]);

  if(L+1 == R) {
    Node *left = Node_new(key[L], value[L]);
    Node *right = Node_new(key[R], value[R]);
    left->tree.avl_right = right;
    left->tree.avl_height = 1;
    return left;
  }

  // more than two nodes
  M = L + (R-L)/2;
  middle = Node_new(key[M], value[M]);
  middle->tree.avl_left = tree_build(key, value, L, M-1);
  middle->tree.avl_right = tree_build(key, value, M+1, R);
  lh = middle->tree.avl_left->tree.avl_height;
  rh = middle->tree.avl_right->tree.avl_height;
  middle->tree.avl_height = 1 + (lh > rh ? lh:rh);
  return middle;
}
Martin v. Löwis
If the key of the data is something amenable to bin sorting, the sort could be even faster. The complexity of building a AVL tree as you describe is O(n).
Kathy Van Stone
Yes i would also expect this to be faster especially because the key comparisons might be expensive. But i hoped to see some psyeudo code or a link to a AVL library which contains a feature like this as i have to say i find the whole balancing stuff and rotations quite complicated.
Lothar
@Lothar: see my edit. If you want code that really helps you, at a minimum, you should have posted your definition of a node.
Martin v. Löwis