views: 315 | answers: 3

I am writing a B+ tree for a variety of reasons, and I have come here to ask a question about the implementation of its nodes. My nodes currently look like this:

struct BPlusNode
{
public:
    //holds the list of keys
    keyType **keys;
    //stores the number of slots used
    size_t size;
    //holds the array of pointers to lower nodes; NULL if this is a leaf node
    BPlusNode **children;
    //holds the pointer to the next node to the 'left'
    BPlusNode *next;
    //data page pointers; NULL if this is a branch node
    Bucket **pages;
};

As you can see, my current implementation uses `**` in the places where I am wondering whether I should use `**` or `*`.

I am well aware that `**` requires two dereference operations and is thus slower than simply using `*`. However, this class uses a great deal of recursion, and it is far more convenient to pass pointers to sub-calls of recursive functions. To do this with `*` I would need to do pointer arithmetic and pass the resulting pointer.

With `**`:

void someFunction(BPlusNode* currNode)
{
    // ...
    someFunction(currNode->children[ChildIndex]);
}

With `*`:

void someFunction(BPlusNode* currNode)
{
    // ...
    someFunction((currNode->children) + ChildIndex);
}

I can see that there is an additional memory read to produce the desired pointer in the `**` version, but the `**` version is also easier for me to think about (it conforms more closely to the way the diagrams are drawn in "The Art of Computer Programming" and on Wikipedia).

Does anyone have any thoughts one way or the other? Suggestions for a third option? Proof of why one is superior to the other? etc?

Edit:
I might post this as an answer below, but I just realized that with the `**` scheme I do not need to copy the entire contents of each subnode or bucket should I want to insert one into the middle of the array (i.e. resize the array). If there are 20 subnodes, then with the `*` scheme reallocating the array requires copying 20*sizeof(BPlusNode) bytes, as opposed to 20*sizeof(BPlusNode*) bytes for the `**` scheme.

On the other hand, it has occurred to me that since all the inserts and page splits are performed up front, this increased efficiency in performing them may be unnecessary, and the benefit of `*` over `**` in searches may outweigh it.

+1  A: 

Using **, you need an extra allocation step to hold each BPlusNode* child pointer. Or you could allocate a block of them and just have each pointer in children point to sequential BPlusNode* elements inside this block -- but it's still one extra dynamic memory allocation per node creation (and a corresponding extra deallocation step on destruction). So I would absolutely recommend using a single *. If writing

someFunction((currNode->children) + ChildIndex);

hurts you, you can rewrite it as

someFunction(&currNode->children[ChildIndex]);

which I find clearer.

j_random_hacker
But isn't doing it that way the same as doing the pointer arithmetic, dereferencing the result of the arithmetic, and then retrieving the `this` pointer of the node? Also, I only need to do insertions to the tree, all in one rush; in fact there will be about 2.557 billion insert calls with the current data set (though, due to page structure, only about 255,700 buckets). Since the tree is only of order 256 there is going to be a lot of splitting in that initial rush, so wouldn't we benefit from the increased ease of resizing the arrays?
James Matta
j_random_hacker
James Matta
Insertions in the middle is one reason to prefer `**`. But I thought that the idea of B-trees was that they only grew to a (fairly small) fixed number of child nodes before splitting -- is that right? If so, and if that threshold is less than a few hundred nodes, I'd bet that it is still faster to just copy the node pointers around when inserting in the middle. `memcpy()` is fast; per-child-insertion dynamic memory allocation is not (and also will more than double your memory requirements, since each allocation requires storage of maybe 4-16 bytes of housekeeping info).
j_random_hacker
Well, as I said before the current data set is about 2.557 billion events, which means that there will be that many insert calls. Fortunately a bucket is just a set of indexes into a file and each bucket holds about 10000 events. So I would wind up with about 255,700 buckets and of course a commensurately smaller number of leaf nodes.
James Matta
In another comment you indicate that `keyType` is a "big, ugly thing" -- in that case, it probably is better to use a 2nd level of indirection (i.e. `**`) to enable efficient insertion into the middle of the array. That is unless `keyType`'s constructor itself does dynamic memory allocation, in which case the actual object size will be small and the insertion logic can do a shallow copy (copy pointers inside of the `keyType` objects instead of the objects themselves).
j_random_hacker
BTW it occurs to me that you appear to be implementing a very simple database here -- have you thought about using a preexisting DB instead of writing your own? They are basically all B+-tree based, performant, thoroughly tested, and will give you much more flexibility for querying.
j_random_hacker
@j_random_hacker: I have found that too many DBs like to duplicate the indexing data. With your own BTree there is only one copy. Of course, some DBs do offer index-only "tables" without the table, but I remember all the ones I found were expensive.
Zan Lynx
+2  A: 

I would define another struct for the key and pointer data, and I would commit to using fixed-size nodes that match your on-disk structure. This makes memory-mapping the tree a lot easier.

Your BPlusNode struct then becomes a handle class that points to these mapped data nodes and synthesizes things like the prev and next pointers by reading the siblings as it descends the tree.

It could look something like the following:

enum BPlusNodeType {
    LEAF, BRANCH
};

struct BPlusNodeData {
    static const size_t max_size = 511; // Try to fit into 4K? 8K?
    uint16_t size;
    uint16_t type;
    keyType key[max_size];
    union {
        Bucket* data[max_size];
        BPlusNodeData* children[max_size];
    };
};
Zan Lynx
+1. A fixed-size, in-place array of node pointers will be faster and cleaner (no dynamic memory (de/re)allocations to forget) than either a `*`- or `**`-based solution.
j_random_hacker
I rather like the elegance here but I am a bit fuzzy on the syntax. I know how a union works in the second context but how does the first union declaration fit in with anything?
James Matta
Also, what impact would it have on how you do this if I told you that keyType is a big ugly thing itself? Unfortunately, keyType is really something I call numericalArray; it is essentially a dynamic array for storing 1- to n-dimensional coordinates and then converting those coordinates to a 1D index with a Hilbert space-filling curve (using some bitwise magic). Does that impact how you would do it?
James Matta
Perhaps you meant enum instead of union when you are defining BPlusNodeType?
James Matta
@James: Yes, I'm sure he meant `enum` there. I suggest -1ing him to get his attention.
j_random_hacker
Waited too long; my upvote cannot be changed. I will ride with it meaning enum for now.
James Matta
I edited it myself. Don't think he'll mind :)
j_random_hacker
Yes thanks, I did mean enum there. That union was screwy.
Zan Lynx
@James Matta: If you have converted to a 1D index key value and that index isn't too long, then I would choose a fixed size for each key and stay with a fixed size array.
Zan Lynx
A: 

Would you be better off using STL `vector<keyType *> keys` and `vector<BPlusNode *> children`, and so on?

It might be too simplistic, but my impression is that double-indirection is not often needed in C++ (and not all that often in C, though more often than in C++).

Jonathan Leffler
Using the STL slows things down yet more. This program has to handle a massive number of reads from the index (and an even more massive number from the disk). I want queries to this tree to be as fast as possible.
James Matta
Finally, using `std::vector<keyType *>` etc. makes nodes larger. Vector has some space overhead associated with it. That is a bit less of an issue when I only have a couple billion events, and thus a couple hundred thousand pages and only a thousand nodes or so (despite keyType taking up substantial room on its own). But when one looks at a situation where there are several hundred billion or, worse, a couple trillion events (amounts that are sometimes necessary), one finds that as little memory overhead as possible is best.
James Matta
@James: Reading/writing `vector<keyType*>` will be no slower than reading/writing a dynamically-allocated array of `keyType*` -- any decent compiler will generate identical code for both. OTOH the `vector` space overhead is real (though I doubt significant). Normally I'd advocate replacing any fixed-size or dynamically-allocated array with a `vector`, but in this case I think it's appropriate to go for Zan Lynx's fixed-size array approach, which will be both smaller and faster.
j_random_hacker