views:

27

answers:

2

I want to use b-tree for index, but I can't think out an solution for OR query.

For OR query, I mean something like select * from table where id between 1 and 5 OR id between 10 and 15;

if I use id as the key in the b-tree, than how can I do query like above on the b-tree?

when search through the b-tree, assume that the key that are smaller than 6 and bigger than 6 is on different sub-trees, than when the search path go through the sub-tree that contaions the key that are smaller than 6, id that between 1 and 5 can be retrived, but what about id that between 10 and 15?

Do I have to use the b+tree,and when I found the key which points to id 1 , I just keep scan through the leaf nodes one by one until I found the key which points to id 15? Is it bad solution for this kind of query: select * from table where id between 1 and 5 OR id between 10000000 and 10000005???

Or is there any other solutions?

Thank you very much!

+1  A: 

An OR operation implies that two searches need to be done, and the results combined.

Anthony -GISCOE-
Do you mean I have to do the search on the b-tree twice to get the result?So what about an AND operation?Just change the query to:select * from table where id between 1 and 5 AND id between 10000000 and 10000005How about this query? Also do the search twice? Thank you very much!
Basically, yes, do two searches for both OR and AND. You can of course get much fancier as Markus suggests below, but that is more a matter of how complex you want to make your query engine.
Anthony -GISCOE-
+1  A: 

hi!

The OR keyword is a common problem. From index perspective it is usually best to do two lookups (e.g., like a UNION).

However, exceptions exist. Your first example (id between 1 and 5 OR id between 10 and 15) might be best done in one index lookup from 1 to 15, discarding the values 6-9. However, that depends on the data volume! You second example (between 1 and 5 OR id between 10000000 and 10000005) doesn't look to be a good candidate for that approach. However, it depends on the number of rows, not on the number of the id's.

Regarding AND: your example is a contradiction (id between 1 and 5 AND id between 10000000 and 10000005), the query will not return any rows. Some optimizers are able to "see" that.

AND conditions on different columns are to be solved with concatenated indexes.

Have a look at my Web-Book Use The Index, Luke! for further details.

Markus Winand