views:

108

answers:

2

Hi there,

I'm grouping some records by their proximity of time. What I do right now (timestamps in unixtime),

First off I do a sub select to grab records that are of interest of me,

(SELECT timestamp AS target_time FROM table WHERE something = cool) AS subselect

Then I want to look at the records that are close in time to those,

SELECT id FROM table, subselect WHERE ABS(target_time - timestamp) < 1800 

But here is where I hit my problem. I want to only want the records where the time diffrance between the records around the target_time is > 20 mins. So to do this, I group by the target_time and add a HAVING section.

SELECT id FROM table, first WHERE ABS(target_time - timestamp) < 3600 
GROUP BY target_time HAVING MAX(timestamp) - MIN(timestamp) > 1200

This is great, and all the records I don't like are gone, but now I only have the first id of the group, when I really want all of the ids. I can use GROUP_CONCAT but that gives me a be mess I can't do anymore queries on. What I really would like it to get all of the ids returned from all of these groups that are created. Do I need another SELECT statement? Or is there just a better way to structure what I got?

Thank you,

A SQL nub.

A: 

My assumption is you want an output that looks like:

   id1, timestamp1, fieldA, fieldB
   id1, timestamp2, fieldA, fieldB
   id2, timestamp3, fieldA, fieldB
   id2, timestamp4, fieldA, fieldB
   id3, timestamp5, fieldA, fieldB
   id3, timestamp6, fieldA, fieldB

but the timestamp for these records is BETWEEN 1200 and 1800 seconds of a "target_time" where something = cool?

   SELECT data.id, data.timestamp, data.fieldA, data.fieldB, ..., data.fieldX
     FROM events 
     JOIN data 
    WHERE events.something = cool_event    -- Gives the 'target_time' of cool_event
      AND ABS(event.timestamp - data.timestamp) BETWEEN 1200 and 1800   -- gives data records 'near' target time, but at least 20 minutes away.

IF the 'data' and 'events' table are the SAME table, then just use table alias names, but you can join a table to itself, aka 'SELF-JOIN'.

   SELECT data.id, data.timestamp, data.fieldA, data.fieldB, ..., data.fieldX
     FROM events AS target,  events AS data
    WHERE target.something = cool_event    -- gives the 'target_time' of cool_event
      AND ABS(target.timestamp - data.timestamp) BETWEEN 1200 and 1800   -- gives data records 'near' target time, but at least 20 minutes away.

This sounds about right, and is without any group-by or aggregates needed.

You can order the resulting data if necessary.

-- J Jorgenson --

J Jorgenson
Thats not quite what I need. Lets say the target time is midnight. I want all the records from 11:30-12:30. But lets say when I get those records the only ones in that time are from 11:56-12:05. The time they stretch isn't long enough. So I want the ids of all of the records around the target times where the group returned stretches more then 20mins. Confusing I know. (Hence the bounty). Thanks though!
Eric Koslow
+1  A: 

See if I have your problem correct:

For a given row in a table, you want to know the set of rows for similar records if the range of timestamps for those records is greater than 20 minutes. You want to to this for all ids in the table.

If you simply want a list of ids which fulfil this criteria, it is fairly straightforward:

given a table like:

create table foo (id bigint(4), section VARCHAR(2), modification datetime);

you can do:

select id, foo.section, min_max.min_modification, min_max.max_modification, abs(min_max.min_modification  - min_max.max_modification) as diff
from foo,
(select section, max(modification) max_modification, min(modification) min_modification from foo as inner_foo group by section) as min_max
where foo.section = min_max.section
and abs(min_max.min_modification  - min_max.max_modification) > 1800;

You're doing a subselect based on the 'similar rows' criteria (in this case the column section) to get the minimum and maximum timestamps for that section. This min and max applies to all ids in that section. Hence, for section 'A', you will have a list of ids, same for section 'B'.

MatthieuF