ansaurus

Question

How to find a gap in a time ordered table where a given column does not have a certain value for a specified interval.

Answer 1

+2 A:

I think this will help you:

SELECT * FROM (
    SELECT t1.[time],
           t1.time - (SELECT MAX(time) FROM my_table t2 WHERE t2.time < t1.time and t2.base = 'a') AS timeout
    FROM my_table t1
    WHERE t1.base = 'a') d
WHERE timeout > 5

And don't forget to create index for this query to be more effective:

CREATE INDEX idx_my_table_time_base ON my_table (time, base)

alygin 2009-12-06 19:21:27

+1 Works, remove `WHERE t1.base = 'a'` and replace `t2.base = 'a'` with `t2.base = t1.base` to make it search for all base gaps

Andomar 2009-12-06 20:03:43

Yea. Use t2.base = t1.base in any case.

alygin 2009-12-06 20:35:19

Also interesting, the time is similar to the above, 6 seconds for a base with 3000 rows. Your suggested index isn't being picked by the optimizer either. Still way too slow.

naturalethic 2009-12-09 03:11:39

Try to update statistics with full scan on this index. May be server doesn't think this index is selective enought. If it doesn't help try to use hint in the inner select: "from my_tabe t2 with(index=my_index)". Does it help?

alygin 2009-12-09 10:09:50

Answer 2

A:

One way to approach this is to check for "stretch heads", that is, occurrences of a base with more than 5 seconds since it's last occurrence. This example query joins the table on itself to filter out non-heads:

select    head.* 
from      @t head
left join @t nohead 
on        head.base = nohead.base 
and       head.time - 5 < nohead.time and nohead.time < head.time
where     nohead.base is null
order by  head.[time]

For each row, the left join searches for the same base within the last 5 seconds. The where nohead.base is null clause says such a row may not exist. The effect is a list of when a 5+ second span without a base happens.

It won't list the last gap: you could explicitly add "end time" rows for each base:

<end time>     a
<end time>     b
...

to make the query check end-gaps.

Andomar 2009-12-06 19:59:37

Interesting solution, but it still takes way too much time -- 6 seconds on query against a base with 3000 rows.

naturalethic 2009-12-09 03:10:46

@naturalethic: I'd say add an index on (base,time)... but 6 seconds for 3000 rows is way too slow even if you didn't define any indexes.

Andomar 2009-12-09 09:19:24

Answer 3

A:

One possibility, if you are using a database that supports windowing/analytic functions is something like this:

select * from (
    select time,
           base,
           time - lag(time) over(partition by base order by time) as interval
    from example) w
where w.interval > 5

This should be able to work from a single scan of a (base,time) index. It works on PostgreSQL 8.4 and I think should also work on SQL Server 2008 and Oracle 10.

araqnid 2009-12-06 23:11:24

ansaurus

tags:

views:

answers:

How to find a gap in a time ordered table where a given column does not have a certain value for a specified interval.

related questions