I'm working on a project where I need to sort a list of user-submitted articles by their popularity (last week, last month and last year).
I've been mulling this over for a while, but I'm not a great statistician, so I figured I could maybe get some input here.
Here are the variables available:
- Time [date] the article was originally published
- Time [date] the article was recommended by editors (if it has been)
- Number of votes the article has received from users (total, in the last week, in the last month, in the last year)
- Number of times the article has been viewed (total, in the last week, in the last month, in the last year)
- Number of times the article has been downloaded by users (total, in the last week, in the last month, in the last year)
- Comments on the article (total, in the last week, in the last month, in the last year)
- Number of times users have saved the article to their reading list (total, in the last week, in the last month, in the last year)
- Number of times the article has been featured on a kind of "best we've got to offer" (editorial) list (total, in the last week, in the last month, in the last year)
- Time [date] the article was dubbed 'article of the week' (if it has been)
Right now I'm applying a weight to each variable, summing the results, and dividing by the number of times the article has been read. That's pretty much all I could come up with after reading up on weighted means. My biggest problem is that some user articles are always at the top of the popular list, probably because the author is "cheating".
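To make that concrete, here is a minimal sketch of roughly what the current scoring looks like. The weight values, the `popularity` function name, and the sample articles are made up for illustration; they are not my real numbers.

```python
# Hypothetical weights, for illustration only; the real ones are hand-tuned.
WEIGHTS = {
    "votes": 3.0,
    "downloads": 2.0,
    "comments": 1.5,
    "saves": 2.5,
    "featured": 5.0,
}


def popularity(counts, views):
    """Weighted sum of the counts for one period, divided by views in that period."""
    if views <= 0:
        return 0.0  # avoid dividing by zero for articles nobody has read
    weighted = sum(WEIGHTS[name] * counts.get(name, 0) for name in WEIGHTS)
    return weighted / views


# Made-up sample data: A is widely read, B has few views but many votes.
articles = [
    {"title": "A", "views": 1000,
     "counts": {"votes": 40, "downloads": 10, "comments": 5, "saves": 8, "featured": 0}},
    {"title": "B", "views": 20,
     "counts": {"votes": 15, "downloads": 0, "comments": 0, "saves": 0, "featured": 0}},
]

ranked = sorted(
    articles,
    key=lambda a: popularity(a["counts"], a["views"]),
    reverse=True,
)
print([a["title"] for a in ranked])
```

With these made-up numbers, article B (20 views, 15 votes) ranks above article A (1000 views and far more activity overall), because dividing by views rewards a high vote-to-view ratio. That might be part of why a handful of votes can push an article to the top.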
I'm thinking of emphasizing the importance of the article being relatively new, but I don't want to "punish" articles that are genuinely popular just because they're a bit old.
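As a rough sketch of what I mean by emphasizing recency, one option would be to multiply whatever score I already have by a factor that halves every so often. The exponential decay, the 30-day half-life, and the function names below are all assumptions on my part, not something I've settled on.

```python
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0  # hypothetical: the factor halves every 30 days


def recency_factor(published, now, half_life_days=HALF_LIFE_DAYS):
    """Return a multiplier in (0, 1] that halves every `half_life_days`."""
    age_days = max((now - published).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def decayed_score(base_score, published, now):
    """Scale an existing popularity score by the article's recency."""
    return base_score * recency_factor(published, now)


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
published = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(decayed_score(10.0, published, now))  # 30 days old, so the score is halved
```

A longer half-life would be gentler on older articles that are still genuinely popular, but I don't know how to pick that value in a principled way.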
Anyone with a more statistically adept mind than mine willing to help me out?
Thanks!