tags:

views:

91

answers:

2

Do b trees and b+ trees only store data at their leafs? I am assuming that they use their internal nodes to search the required data.

Is that the case or do they store data in every node?

+1  A: 

All data is in the leaves.

Wiki on B+.

Stefan Kendall
+5  A: 

Non-leaf nodes "records" contain

  • a pointer (a node "address" of sorts) to a node in the next level down the tree
  • the value of the key of the first (or the last, depending on implementation) record in that node

Such non-leaf "records" are listed in key order so that by scanning (or binary searching within) a non-leaf node, one knows which node in the next level down may contain the searched value.

Leaf nodes records contain complete data records: the key value and whatever else.

Therefore "real" data is only contained in the leaf nodes, the non-leaf nodes only contain [a copy of] the key values. for a very small proportion of the data (this proportion depends on the average number of data records founds in a leaf node).

This is illustrated in the following image from the Wikipedia article on B+ Trees simple btree

The non-leaf node, at the top, (the only one in this simplistic tree) only contains two non-leaf node records, each with a copy of a key value (bluish color) and a pointer to the corresponding node (gray color). This tree happens to only have two levels, therefore the "records" in root node point to leaf nodes. One can imagine that there are additional levels (above the topmost tree shown below, call it the "3-5 node"); if that were the case the node above would contain (along with other similar records), a record with key value 3 with a pointer to the "3-5" node.
Also note that only the key values 3 and 5 are contained in non-leaf nodes (i.e. not even all key values are reproduced in the non leaf-nodes).
BTW in this example the non-leaf nodes contain the key of the last record in the next node (would also work if the first record were used instead, slight difference in the way the search logic is then implemented).

The leaf nodes contain the key value (in bluish color too) and the corresponding data record (d1, d2... shown in grey). The red-ish pointer shown at the end of each leaf node point to the next leaf node, i.e. the one containing the very next data record in key order; these pointers are useful to "scan" a range of data records.

mjv
cleared every concept of b+tree...! thanks a lot
rohit