no simple solution
As far as I can see, there is no simple way to do this in a single SQL statement.
Below are two ideas: one uses a loop to count visits, the other changes the way the visiting
table is populated.
loop solution
However, it can be done without too much trouble with a loop.
(I have tried to get the PostgreSQL syntax correct, but I'm no expert.)
/* find entries where there is no previous entry for */
/* the same visitor within the previous hour: */
create temp table temp_table as
select v1.* ,
       v1.visit_time as last_time ,  /* latest visit found so far in this session */
       1 as visits                   /* the first visit itself counts */
from visiting v1
where not exists ( select 1
                   from visiting v2
                   where v2.visitor_id = v1.visitor_id
                   and v2.visit_time < v1.visit_time
                   and v1.visit_time - interval '1 hour' < v2.visit_time
                 );
/* extend each session by its next visit until no session grows: */
do $$
declare
    n_rows integer := 1;
begin
    while n_rows > 0 loop
        update temp_table t
        set visits = t.visits + 1 ,
            last_time = v.visit_time
        from visiting v
        where v.visitor_id = t.visitor_id
        and v.visit_time > t.last_time
        and v.visit_time - interval '1 hour' < t.last_time
        and not exists ( select 1
                         from visiting v2
                         where v2.visitor_id = t.visitor_id
                         and v2.visit_time > t.last_time
                         and v2.visit_time < v.visit_time
                       );
        get diagnostics n_rows = row_count;
    end loop;
end $$;
/* get the result: */
select visitor_id,
visits
from temp_table
The idea here is to do this:
- get all visits where there is no prior visit by the same visitor within the previous hour; each of these starts a session.
- loop, finding the next visit for each of these "first visits", until there are no more "next visits".
- now you can just read off the number of visits in each session.
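To make the first step concrete, here is a small self-contained query over made-up data. The `sample` CTE, its timestamps, and the `starts_session` column name are invented for illustration; the `not exists` test is the same one used in the code above.

```sql
-- Hypothetical sample data: one visitor, four visits.
with sample(visitor_id, visit_time) as (
    values (1, timestamp '2024-01-01 10:00'),
           (1, timestamp '2024-01-01 10:30'),
           (1, timestamp '2024-01-01 11:15'),
           (1, timestamp '2024-01-01 13:00')
)
select s1.visitor_id,
       s1.visit_time,
       -- true when no earlier visit by this visitor falls within the previous hour
       not exists ( select 1
                    from sample s2
                    where s2.visitor_id = s1.visitor_id
                    and s2.visit_time < s1.visit_time
                    and s1.visit_time - interval '1 hour' < s2.visit_time
                  ) as starts_session
from sample s1
order by s1.visit_time;
-- starts_session is true for 10:00 and 13:00, false for 10:30 and 11:15,
-- so this visitor has two sessions: one of three visits and one of one visit.
```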
best solution?
I suggest:
- add a column to the visiting table: session_id int not null
- change the process which makes the entries so that it checks to see if the previous visit by the current visitor was less than an hour ago. If so, it sets session_id to the same as the session id for that earlier visit. If not, it generates a new session_id.
- you could put this logic in a trigger.
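A rough sketch of what such a trigger might look like in PL/pgSQL follows. It is untested against your schema and makes several assumptions: the table has `visitor_id` and `visit_time` columns, rows are inserted in time order, and the names `visiting_session_seq`, `set_session_id` and `visiting_set_session` are made up for this example. The column is added as nullable here because existing rows would need a backfill before `not null` can be enforced.

```sql
-- Hypothetical sequence used to generate new session ids.
create sequence visiting_session_seq;

-- Nullable for now; backfill existing rows, then add NOT NULL.
alter table visiting add column session_id int;

create function set_session_id() returns trigger as $$
declare
    prev_session visiting.session_id%type;
    prev_time    visiting.visit_time%type;
begin
    -- Most recent earlier visit by the same visitor, if any.
    select v.session_id, v.visit_time
    into prev_session, prev_time
    from visiting v
    where v.visitor_id = new.visitor_id
      and v.visit_time < new.visit_time
    order by v.visit_time desc
    limit 1;

    if prev_time is not null
       and new.visit_time - interval '1 hour' < prev_time then
        -- Previous visit was less than an hour ago: same session.
        new.session_id := prev_session;
    else
        -- No recent visit: start a new session.
        new.session_id := nextval('visiting_session_seq');
    end if;

    return new;
end;
$$ language plpgsql;

create trigger visiting_set_session
before insert on visiting
for each row execute procedure set_session_id();
```

Two caveats with this sketch: concurrent inserts for the same visitor may not see each other's rows, and a row inserted with an earlier `visit_time` than existing rows will not cause later rows to be renumbered.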
Then your original query can be solved by:
SELECT session_id, visitor_id, count(*)
FROM visiting
GROUP BY session_id, visitor_id
Hope this helps. If I've made mistakes (I'm sure I have), leave a comment and I'll correct them.