views:

697

answers:

4

I want to save a Btree(not sure a binary one) in a disk file. and then read it to the memory. some Level-order traversal may be a good way for a binary Btree. but if it is not a binary one. I build up the Btree from the leafnode to the rootnode in memory. I believe that I have to define some structures in the disk file and output the tree nodes. using some extra tag to identify a node in the file? how to traversal may be the key problem here. I coudn't figure out a good way to save the nodes and the pointers. and then read it. rconstruct the tree in memory. any good ideas?. thanks a lot.

+3  A: 

If you really want to do something similar, you can just assign at each node an id and save the nodes in that format:

[node-id value left-node-id right-node-id]

and then visit the tree with a breadth-first search.

When you want to reconstruct the tree, create a map id->node and then read backward the file: so, when you read a record, create the node, register it to the map and assign the left and right node fetching the nodes from the map.

akappa
+1  A: 

For each node define some data structure which will keep for you the same information the node has and add to that structure additional field which will mark for you the offset in the file for next son. And make the top field of that structure it's actual size, since you don't know what kind of tree you are looking now. Now by jumping over file you will be able to reconstruct your tree. I'm sure my solution is not final, but I hope it could be the good star point for you.

Artem Barger
A: 

You might want to check out Protocol Buffers. They're compact, binary, extensible, easy to read and write, and available in C++, Java and Python (as well as third party implementations in other languages).

You can define a protocol buffer message for a BTree node, with file offsets for child nodes, and simply serialize it to disk in the obvious manner.

Nick Johnson
A: 

The usual technique for B-Trees is to ensure that the size of a node is equal to the block size of the disk, and mmap the disk file. You don't specify what programming language you're working in, so it might be as simple as a cast in C, or something more complicated such as creating flyweight objects to wrap up a java.nio.IntBuffer. Either way, much of the advantage of the B-tree is that you don't have to load it all at once, but can jump around it fairly efficiently.

Pete Kirkham
He's talking about binary tree, not about B-Tree, despite the question title..
akappa