views:

56

answers:

1

My custom web servers report session information in the form of a wall clock timestamp (start of session) and the number of wall click seconds the session lasted.

I want to store this information in a data warehouse (MySQL + start schema) in a way that will allow me to for example query the number of sessions for a particular entity at a specific time. A requirement is that we must be able to extract time series data that will be feed to a graph.

Is there any other way to store the data than to insert a row for every absolute second of the session?

A: 

I would say that the simplest way is to store one row per session with StartTime, EndTime in the fact table.

Let's say we have factSession with:

  ( 
   ...
  ,SessionID int
  ,StartTime datetime
  ,EndTime datetime
  ,EntityID int
   ...
  ) ;

and

TimeSeries TABLE ( TimePoint datetime ) ;

You can:

SELECT  t.TimePoint
       ,f.EntityID
       ,COUNT(f.SessionID) AS cnt
FROM    TimeSeries AS t
        LEFT JOIN factSessions AS f ON ( f.StartTime <= t.TimePoint
                                          AND t.TimePoint <= f.EndTime
                                        )
GROUP BY t.TimePoint ,f.EntityID
ORDER BY t.TimePoint
Damir Sudarevic
Works like charm. Did a test with a timeseries table with ~700 rows and a fact table with ~350,000 rows. Quires take aprox ~45 seconds, even if I try to limit the number of rows in the session table. Is there a way to optimize this somehow? I added an index to the start and stop times, but without gaining anything. Here's an explanation of the query; http://pastebin.ca/1735275
jrydberg