Update:
See this article in my blog for an efficient indexing strategy for your query using computed columns:
The main idea is that we just compute a rounded length and startDate for your ranges and then search for them using equality conditions (which are good for B-Tree indexes).
In MySQL and in SQL Server 2008 you could use SPATIAL indexes (R-Tree).
They are particularly good for conditions like "select all records with a given point inside the record's range", which is exactly your case.
You store the start_date and end_date as the beginning and the end of a LineString (converting them to UNIX timestamps or another numeric value), index them with a SPATIAL index and search for all such LineStrings whose minimum bounding box (MBR) contains the date value in question, using MBRContains.
See this entry in my blog on how to do this in MySQL:
and a brief performance overview for SQL Server:
The same solution can be applied to searching for a given IP against network ranges stored in the database.
That task, along with your query, is another frequently used example of such a condition.
Plain B-Tree indexes are not good if the ranges can overlap.
If they cannot (and you know it), you can use the brilliant solution proposed by @AlexKuznetsov.
Also note that this query performance totally depends on your data distribution.
If you have lots of records in B and few records in A, you could just build an index on B.dates and let the table scan / clustered index scan (TS/CIS) on A go.
This query will always read all rows from A and will use an Index Seek on B.dates in a nested loop.
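As a rough illustration of this case, here is a minimal Python sketch using the standard sqlite3 module. The table layout (a with start_date and end_date, b with dates), the integer day numbers and the index name ix_b_dates are assumptions made for the example, not something taken from your schema. SQLite's planner is not SQL Server's, so this shows only the shape of the index and the query, not the actual execution plan.

```python
import sqlite3

# In-memory database; dates are plain integers (e.g. day numbers) for simplicity.
conn_few_ranges = sqlite3.connect(":memory:")
conn_few_ranges.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, start_date INTEGER, end_date INTEGER);
    CREATE TABLE b (id INTEGER PRIMARY KEY, dates INTEGER);
    -- Hypothetical index name: the only index needed in this scenario.
    CREATE INDEX ix_b_dates ON b (dates);
""")

# Few ranges in a, many dates in b.
conn_few_ranges.executemany(
    "INSERT INTO a VALUES (?, ?, ?)", [(1, 10, 20), (2, 15, 40)]
)
conn_few_ranges.executemany(
    "INSERT INTO b VALUES (?, ?)", [(i, i) for i in range(1, 101)]
)

# Every row of a is read once; for each one, b.dates is searched by range.
matches_per_range = conn_few_ranges.execute("""
    SELECT a.id, COUNT(*)
    FROM a
    JOIN b ON b.dates BETWEEN a.start_date AND a.end_date
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

print(matches_per_range)
```

With few rows in a, the full read of a is cheap, and each range lookup on b.dates touches only the matching part of the index.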
If your data are distributed the other way round, i.e. you have lots of rows in A but few in B, and the ranges are generally short, then you could redesign your tables a little so that A has the columns start_date and interval_length, create a composite index on A (interval_length, start_date) and use this query:
    SELECT  *
    FROM    (
            SELECT  DISTINCT interval_length
            FROM    a
            ) ai
    CROSS JOIN
            b
    JOIN    a
    ON      a.interval_length = ai.interval_length
            AND a.start_date BETWEEN b.date - ai.interval_length AND b.date
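To see what this query returns, here is a small Python sketch using the standard sqlite3 module. The sample rows, the integer dates and the index name ix_a_length_start are assumptions for the example; only the column names and the query come from the text above. A range in a is taken to cover start_date through start_date + interval_length, inclusive.

```python
import sqlite3

conn_many_ranges = sqlite3.connect(":memory:")
conn_many_ranges.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, start_date INTEGER, interval_length INTEGER);
    CREATE TABLE b (id INTEGER PRIMARY KEY, date INTEGER);
    -- Composite index: equality on interval_length, then a range on start_date.
    CREATE INDEX ix_a_length_start ON a (interval_length, start_date);
""")

# Ranges: 1 -> [10, 15], 2 -> [12, 17], 3 -> [14, 16], 4 -> [30, 32]
sample_ranges = [(1, 10, 5), (2, 12, 5), (3, 14, 2), (4, 30, 2)]
sample_dates = [(1, 15), (2, 31)]
conn_many_ranges.executemany("INSERT INTO a VALUES (?, ?, ?)", sample_ranges)
conn_many_ranges.executemany("INSERT INTO b VALUES (?, ?)", sample_dates)

# For each distinct length and each date, look up the ranges of that length
# whose start falls within [date - length, date].
found = conn_many_ranges.execute("""
    SELECT b.date, a.id
    FROM (SELECT DISTINCT interval_length FROM a) ai
    CROSS JOIN b
    JOIN a
      ON a.interval_length = ai.interval_length
     AND a.start_date BETWEEN b.date - ai.interval_length AND b.date
    ORDER BY b.date, a.id
""").fetchall()

# Brute-force check of the same containment condition in plain Python.
expected = sorted(
    (d, rid)
    for _, d in sample_dates
    for rid, start, length in sample_ranges
    if start <= d <= start + length
)

print(found)
```

The number of index lookups grows with the number of distinct interval_length values times the number of rows in b, which is why this layout only pays off when the ranges are short and the lengths repeat often.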