views:

31

answers:

1

Hi: I am trying to build a web log analyzer based to the tomcat log.

And I first push the log to the database, then do some Statistics.

Now I meet a problem:

For a given data range for example (2010-09-20 to 2010-09-25), I have to calculate the visitors of each day,so I first split the data range day by day:


Split:(2010-09-20,2010-09-25) to 

(2010-09-20 00:00:00, 2010-09-21 00:00:00),
(2010-09-21 00:00:00, 2010-09-22 00:00:00),
(2010-09-22 00:00:00, 2010-09-23 00:00:00),
(2010-09-23 00:00:00, 2010-09-24 00:00:00),
(2010-09-24 00:00:00, 2010-09-25 00:00:00),


Then use the sql to query the num of the visitors of each range.

For example:

select count(distinct ip) from log where time between 201009200000 and 201009210000 group by ip.

This sql is used to calculate the visitors of 2010-09-20.

So if the data range from the user request covers more than one days,I should connect the database more than one times, is it Low efficiency?

Any solution?

BWT, I use MySQL.

A: 

You start by making a table (days) with all days in a year.

E.g. select count(distinct ip),day.starttime from log inner join days on log.time between day.starttime and day.endtime where log.time between 201009200000 and 201009210000 group by ip,day.starttime

or something like that

MikeG