Consider a question/answer site with a 'browse' slideshow that shows one question/answer page at a time. Each time the user clicks the 'next' button, a new question/answer is presented.

I need to decide which pages should be returned each time the user clicks 'next'. Some things I don't want and reasons why:

  • Showing 'newest' questions in descending order:

    Say 100 questions get entered: no user is going to click through to the 100th item, so it'll never get any responses. It also means that if no new questions were asked recently, the user will see the same stale data every time he visits the site.

  • Showing 'most active' questions, judged by a lot of suggested answers/comments:

    This won't return questions that have low activity, which are exactly the ones that need more visibility.

  • Showing 'low activity' questions, judged by not a lot of answers/comments:

    Once a question starts getting activity, it'll stop being shown. This will stymie the activity on a question, when I'd really like to encourage discussion.

I feel that a mix of these would work well, but I'm unsure of how to judge which pages should be returned. I'll stress that I don't want the user to have to choose which category of items to view (like how SO has the unanswered/active/newest filters).

Are there any common practices for doing this, or any ideas for how it might be done?

Thanks!

Edit:

Here's what I'm leaning towards so far, with much thanks to Tim's comment: ranking pages by Activity Count / View Count. The activity count is incremented each time a user performs an action on a page (a vote, comment, answer, etc.), and the view count is incremented each time a person views the page.

I'll then rank all pages by their activity/view ratio and show pages with a high ratio more often. This way pages with low activity and high views will be shown the least, while ones with high activity and low views will be shown most frequently. Low activity/low views and high activity/high views will be somewhere in the middle I imagine, but I'll have to keep a close eye on this in the beta release. I also plan on storing which pages the user has viewed in the past 24 hours so they won't see any repeats in the slideshow in a given day.
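As a rough illustration of the ranking described above, here is a minimal Python sketch. The `Page` class, its field names, and the `seen_ids` set are hypothetical, and adding 1 to the view count is my own assumption so that never-viewed pages don't cause a division by zero; it is not part of the plan as stated.

```python
from dataclasses import dataclass


@dataclass
class Page:
    # Hypothetical record; field names are illustrative only.
    page_id: int
    activity_count: int  # votes + comments + answers, etc.
    view_count: int


def activity_ratio(page: Page) -> float:
    # Assumption: +1 in the denominator avoids dividing by zero
    # for pages that have never been viewed.
    return page.activity_count / (page.view_count + 1)


def rank_pages(pages, seen_ids):
    """Return pages the user has not seen recently, highest ratio first."""
    candidates = [p for p in pages if p.page_id not in seen_ids]
    return sorted(candidates, key=activity_ratio, reverse=True)
```

Here `seen_ids` stands in for whatever store holds the pages a user has viewed in the past 24 hours. Note that a strict sort like this always shows the same top page first, so it would probably need to be combined with some randomness.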

Some ideas for preventing 'stale' data (if all the above doesn't seem to prevent it): Perhaps run a cron job which will periodically check for pages that haven't been viewed recently and boost their ratio to put them at the top.
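The periodic boost could look something like the sketch below, which a cron job might call. Everything here is an assumption for illustration: the function name, the dictionaries mapping page ids to ratios and last-viewed timestamps, the 7-day threshold, and the choice to lift stale pages just above the current best ratio.

```python
from datetime import datetime, timedelta


def boost_stale(ratios, last_viewed, now, max_age=timedelta(days=7), boost=1.0):
    """Return a copy of `ratios` where stale pages are lifted to the top.

    ratios:      {page_id: activity/view ratio}
    last_viewed: {page_id: datetime of the most recent view}
    """
    top = max(ratios.values(), default=0.0)
    boosted = dict(ratios)
    for page_id, viewed_at in last_viewed.items():
        # A page is "stale" if nobody has viewed it within max_age.
        if page_id in boosted and now - viewed_at > max_age:
            boosted[page_id] = top + boost
    return boosted
```

Since the boosted value is only a ranking aid, it may be safer to store it separately from the real ratio so the underlying counts stay meaningful.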

+1  A: 

As I see it, you are touching upon two interesting questions:

  1. How to define that a post is interesting to a user: Here you could take a weighted combination of the various factors that contribute to how interesting a post is: the amount of activity, how fresh the entry is, whether the item matches the user's interests (if you have a way of knowing that), and so on. You could pick the weights based on intuition and see how well the results match your expectations. If you have the time and inclination, you could collect data on how your users respond to the entries and try to learn the optimal weight for each factor using machine learning techniques.

  2. How to give new posts a chance, otherwise known as the exploration-exploitation tradeoff. Basically, if you only ever show entries already known to be interesting, you maximize instantaneous user happiness, but you never learn about new interesting posts, so in the long run your users end up less happy.
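To make the weighted combination in point 1 concrete, here is a minimal Python sketch. The three factors come from the list above, but the weights, the exponential scaling, and the constants (10 actions, 48 hours) are guesses for illustration, not recommended values.

```python
import math

# Hypothetical weights picked by intuition; tune them against real data.
WEIGHTS = {"activity": 0.5, "freshness": 0.3, "interest": 0.2}


def interestingness(activity_count, age_hours, interest_match, weights=WEIGHTS):
    """Combine several factors, each scaled to [0, 1], into one score.

    interest_match is assumed to be a 0..1 estimate of how well the
    post matches the user's interests (0 if unknown).
    """
    # Saturating curve: the first few actions matter more than the 100th.
    activity = 1.0 - math.exp(-activity_count / 10.0)
    # Exponential decay: a post loses freshness as it ages.
    freshness = math.exp(-age_hours / 48.0)
    return (weights["activity"] * activity
            + weights["freshness"] * freshness
            + weights["interest"] * interest_match)
```

Keeping every factor in the same 0 to 1 range makes the weights directly comparable, which helps when adjusting them by hand.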

This is a very well-studied problem, and depending on how deep you want to go, you can read up on the literature on things like k-armed bandit problems (also called multi-armed bandits).

But a simple solution would be not to always pick the entry with the highest score, but to pick an entry at random from a probability distribution in which higher-scoring entries have a higher probability of showing up. This way you show interesting stuff most of the time, but every post has a chance to show up occasionally.
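A minimal Python sketch of that idea, using the standard library's `random.choices` for weighted sampling. The `pick_entry` name, the `scores` dictionary, and the small `floor` added to every weight are assumptions; the floor is just one way to guarantee that zero-score posts still get shown now and then.

```python
import random


def pick_entry(scores, floor=0.01, rng=random):
    """Pick one entry id at random, proportionally to its score.

    scores: {entry_id: score}; higher scores are more likely to be picked.
    floor:  small constant added to every weight so that even a
            zero-score entry keeps a nonzero chance of appearing.
    """
    ids = list(scores)
    weights = [max(scores[i], 0.0) + floor for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]
```

A larger `floor` means more exploration (new or low-score posts appear more often); a smaller one means the slideshow sticks more closely to the known interesting entries.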

Amit Prakash
Thanks for the advice to look into the k-armed bandit problem. This is a very well-explained answer.
Cuga