views:

46

answers:

1

Hello,

Let's imagine that we have high traffic project (a tube site) which should provide sorting using this options (NOT IN REAL TIME). Number of videos is about 200K and all information about videos is stored in MySQL. Number of daily video views is about 1.5KK. As instruments we have Hard Disk Drive (text files), MySQL, Redis.

Views
 top viewed
 top viewed last 24 hours
 top viewed last 7 days
 top viewed last 30 days
 top rated last 365 days

How should I store such information?

The first idea is to log all visits to text files (single file per hour, for example visits_20080101_00.log). At the beginning of each hour calculate views per video for previous hour and insert this information into MySQL. Then recalculate totals (for last 24 hours) and update statistics in tables. At the beginning of every day we have to do the same but recalculate for last 7 days, last 30 days, last 365 days. This method seems to be very poor for me because we have to store information about last 365 days for each video to make correct calculations.

Is there any other good methods? Probably, we have to choose another instruments for this?

Thank you.

A: 

If absolute precision is not important, you could summarize the information that is longer than 2 units back.

You would store the individual views for the last 1-2 hours, the hourly views (one value per hour) for the last 1-2 days, and the daily views (one value per day) further.

"1-2" means that you store until you have two units full, then summarize the earlier of them.

Svante