For finding trending topics, I use the Standard score in combination with a moving average:
z-score = ([current trend] - [average historic trends]) / [standard deviation of historic trends]
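To make the formula concrete, here is a minimal Python sketch of the calculation. The function name z_score, the use of the population standard deviation, and returning 0 for a completely flat history are illustrative assumptions, not something fixed by the formula itself.

```python
from statistics import mean, pstdev


def z_score(current, history):
    """Standard score of `current` relative to the values in `history`.

    `history` is a list of hit counts, one per past 24h window.
    """
    mu = mean(history)
    # Assumption: population standard deviation; statistics.stdev
    # (sample standard deviation) would also be a defensible choice.
    sigma = pstdev(history)
    if sigma == 0:
        # Assumption: a flat history carries no signal, so report 0
        # instead of dividing by zero.
        return 0.0
    return (current - mu) / sigma


# Example with made-up hit counts: 30 hits today vs. about 10 on past days.
print(z_score(30, [10, 12, 8, 10]))
```

A large positive value means the current window has far more hits than the historic windows usually had.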
So far, I have been doing it as follows:
Whatever the current time is, I simply go back in 24-hour steps for the historic trends. Assuming it is January 12, 3:45pm now:
current_trend = hits [Jan 11, 3:45pm - Jan 12, 3:45pm]
historic_trends = hits [Jan 10, 3:45pm - Jan 11, 3:45pm] + hits [Jan 9, 3:45pm - Jan 10, 3:45pm] + hits [Jan 8, 3:45pm - Jan 9, 3:45pm] + ...
But is this really adequate? Wouldn't it be better if the windows always started at 00:00 (midnight)? For example, like this for the same point in time (January 12, 3:45pm):
current_trend = hits [Jan 11, 0:00 - Jan 12, 0:00]
historic_trends = hits [Jan 10, 0:00 - Jan 11, 0:00] + hits [Jan 9, 0:00 - Jan 10, 0:00] + hits [Jan 8, 0:00 - Jan 9, 0:00] + ...
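The two windowing schemes can be sketched in Python as follows. The helper names rolling_windows, midnight_windows and count_hits, the half-open interval convention, and the year 2012 in the example are assumptions made only for illustration.

```python
from datetime import datetime, timedelta

DAY = timedelta(days=1)


def rolling_windows(now, n):
    """n + 1 consecutive 24h windows ending at `now`.

    Index 0 is the current window, indices 1..n are the historic ones.
    """
    return [(now - (i + 1) * DAY, now - i * DAY) for i in range(n + 1)]


def midnight_windows(now, n):
    """Same layout, but every window runs from 00:00 to 00:00.

    The newest window ends at the most recent midnight before `now`.
    """
    end = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return [(end - (i + 1) * DAY, end - i * DAY) for i in range(n + 1)]


def count_hits(timestamps, start, end):
    """Number of hit timestamps in the half-open interval [start, end)."""
    return sum(1 for t in timestamps if start <= t < end)


# Hypothetical example: January 12, 3:45pm (the year is arbitrary).
now = datetime(2012, 1, 12, 15, 45)
for start, end in rolling_windows(now, 2):
    print("rolling ", start, "-", end)
for start, end in midnight_windows(now, 2):
    print("midnight", start, "-", end)
```

With the midnight-aligned variant, hits that arrive between 00:00 and 3:45pm on January 12 do not fall into any window until the next midnight, whereas the rolling variant always includes the most recent 24 hours.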
I'm sure the results would differ between the two. But which approach gives better results?
I hope you've understood my question and you can help me. :) Thanks in advance!