It can be done in a single SQL statement. Here's the proof.
Setup
CREATE TEMP TABLE eventtable
(
name CHAR(3) NOT NULL,
time DATETIME HOUR TO MINUTE NOT NULL,
state CHAR(8) NOT NULL
);
INSERT INTO eventtable(name, time, state) VALUES('a', '10:00', 'login');
INSERT INTO eventtable(name, time, state) VALUES('b', '10:05', 'login');
INSERT INTO eventtable(name, time, state) VALUES('a', '10:06', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('a', '10:08', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('b', '10:11', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('a', '10:10', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('b', '10:12', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:01', 'login');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:02', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:03', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:04', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:05', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:06', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:07', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:08', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:09', 'login');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:11', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:12', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:13', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:14', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:15', 'Logout');
Correct query
Note the conditions. The result table must exclude the periods between 'login' and the first other event; further, it must exclude the period between 'Logout' and the next event (presumably a 'login'). The self-join between the table on the name column and then the asymmetric join on the time column (using '<
') ensures that events are in time order. The NOT EXISTS sub-select ensures that only adjacent events are considered. Using BETWEEN AND in the sub-query is a mistake because it includes its end points and it is crucial that r1.time
and r2.time
are excluded from the range; it took me a few minutes to spot that bug (the query ran but returned no rows, but why?)!
SELECT r1.name, r1.state, r2.TIME - r1.TIME AS duration
FROM eventtable r1, eventtable r2
WHERE r1.name = r2.name
AND r1.time < r2.time
AND r1.state != 'login'
AND r1.state != 'Logout'
AND r1.time BETWEEN DATETIME(10:00) HOUR TO MINUTE
AND DATETIME(10:15) HOUR TO MINUTE
AND r2.time BETWEEN DATETIME(10:00) HOUR TO MINUTE
AND DATETIME(10:15) HOUR TO MINUTE
AND NOT EXISTS (SELECT 1 FROM eventtable r3
WHERE r3.time > r1.time AND r3.time < r2.time
AND r3.name = r1.name
AND r3.name = r2.name);
This produces the answer:
name state duration
a chatting 0:02
a Idle 0:02
b chatting 0:01
c chatting 0:01
c Idle 0:01
c Idle 0:01
c Idle 0:01
c chatting 0:01
c Idle 0:01
c chatting 0:01
c Idle 0:01
The 'duration' value is an INTERVAL HOUR TO MINUTE; if you want a value in just minutes, you have to convert it with a cast (using 4 for the precision to allow for intervals up to 1440 minutes, or 1 day; the data is ambiguous for longer time frames):
(r2.time - r1.time)::INTERVAL MINUTE(4) TO MINUTE
Or:
CAST (r2.time - r1.time AS INTERVAL MINUTE(4) TO MINUTE)
IBM Informix Dynamic Server (IDS) has very verbose notations for time constants. In Standard SQL, you could use TIME as the type and TIME '10:00:00' as a value, but the seconds would be necessary in strict standard SQL. IDS does provide exact types that people want - such as DATETIME HOUR TO MINUTE. You'd also write INTERVAL MINUTE(4) in standard SQL; the 'TO MINUTE' should be optional.
Incorrect query
In my comment to Ray Hidayat's answer, I pointed out that the EXISTS sub-query is necessary to ensure that the events under consideration are contiguous - there are no intervening events. Here's the same query with start and end times added to the output, and the EXISTS clause missing (and 'duration' renamed to 'lapse'):
SELECT r1.name, r1.state, r2.TIME - r1.TIME AS lapse,
r1.time AS start, r2.time AS end
FROM eventtable r1, eventtable r2
WHERE r1.name = r2.name
AND r1.time < r2.time
AND r1.state != 'login'
AND r1.state != 'Logout'
AND r1.time BETWEEN DATETIME(10:00) HOUR TO MINUTE
AND DATETIME(10:15) HOUR TO MINUTE
AND r2.time BETWEEN DATETIME(10:00) HOUR TO MINUTE
AND DATETIME(10:15) HOUR TO MINUTE;
This produces the answer:
name state lapse start end
a chatting 0:04 10:06 10:10
a chatting 0:02 10:06 10:08
a Idle 0:02 10:08 10:10
b chatting 0:01 10:11 10:12
c chatting 0:13 10:02 10:15
c chatting 0:12 10:02 10:14
c chatting 0:11 10:02 10:13
c chatting 0:10 10:02 10:12
c chatting 0:09 10:02 10:11
c chatting 0:07 10:02 10:09
c chatting 0:06 10:02 10:08
c chatting 0:05 10:02 10:07
c chatting 0:04 10:02 10:06
c chatting 0:03 10:02 10:05
c chatting 0:02 10:02 10:04
c chatting 0:01 10:02 10:03
c Idle 0:12 10:03 10:15
c Idle 0:11 10:03 10:14
c Idle 0:10 10:03 10:13
c Idle 0:09 10:03 10:12
c Idle 0:08 10:03 10:11
c Idle 0:06 10:03 10:09
c Idle 0:05 10:03 10:08
c Idle 0:04 10:03 10:07
c Idle 0:03 10:03 10:06
c Idle 0:02 10:03 10:05
c Idle 0:01 10:03 10:04
c Idle 0:10 10:05 10:15
c Idle 0:09 10:05 10:14
c Idle 0:08 10:05 10:13
c Idle 0:07 10:05 10:12
c Idle 0:06 10:05 10:11
c Idle 0:04 10:05 10:09
c Idle 0:03 10:05 10:08
c Idle 0:02 10:05 10:07
c Idle 0:01 10:05 10:06
c Idle 0:08 10:07 10:15
c Idle 0:07 10:07 10:14
c Idle 0:06 10:07 10:13
c Idle 0:05 10:07 10:12
c Idle 0:04 10:07 10:11
c Idle 0:02 10:07 10:09
c Idle 0:01 10:07 10:08
c chatting 0:04 10:11 10:15
c chatting 0:03 10:11 10:14
c chatting 0:02 10:11 10:13
c chatting 0:01 10:11 10:12
c Idle 0:03 10:12 10:15
c Idle 0:02 10:12 10:14
c Idle 0:01 10:12 10:13
c chatting 0:02 10:13 10:15
c chatting 0:01 10:13 10:14
c Idle 0:01 10:14 10:15
This shows how each eligible start row for user 'c' is matched with each eligible end row, giving many spurious rows of data. The NOT EXISTS sub-query is a common theme when dealing with time-based queries. You can find information about these operations in Snodgrass's "Developing Time-Oriented Applications in SQL" (PDF available online at URL), and in Date, Darwen and Lorentzos "Temporal Data and the Relational Model".