If you don't know what "meme" means, you can read this article on ReadWriteWeb.

My question is how to create a meme algorithm. I have a website that aggregates thousands of blog posts, and I want to figure out which stories are the most talked about.

See this quotation from the article above:

"Meme aggregation attempts to cut down on the signal to noise ratio by figuring out what is the most talked about news (and thus, hopefully, the most important)."

Does anyone know how to do this? Are there any easy tutorials? I am not that good at maths.

Thanks

A: 

Assuming you want to find the most popular subject, the actual calculation could be quite simple; however, the amount of data that needs to be processed will be large.

(Number of blog posts with the specific tag / total number of blog posts) = the popularity of a tag

Obviously you would need a list of common tags/words to ignore.
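
A rough sketch of that ratio in Python, in case it helps. The ignore list, the sample posts, and the name tag_popularity are all made up for illustration; it also assumes each post is already reduced to a list of tags, and that a tag counts once per post no matter how often it repeats.

```python
from collections import Counter

# Hypothetical ignore list; a real one would be much longer.
IGNORED_TAGS = {"the", "a", "news", "blog", "misc"}

def tag_popularity(posts):
    """Return {tag: fraction of posts carrying that tag}.

    `posts` is a list of tag lists, one per blog post.
    """
    total = len(posts)
    if total == 0:
        return {}
    counts = Counter()
    for tags in posts:
        # Use a set so a tag repeated within one post counts once.
        unique = {t.lower() for t in tags} - IGNORED_TAGS
        counts.update(unique)
    return {tag: n / total for tag, n in counts.items()}

# Made-up sample data.
posts = [
    ["iPhone", "apple", "news"],
    ["apple", "mac"],
    ["election", "news"],
    ["iphone", "review"],
]
popularity = tag_popularity(posts)
# Highest ratio first; ties are broken alphabetically.
top = sorted(popularity.items(), key=lambda kv: (-kv[1], kv[0]))
print(top[:2])
```

With thousands of posts you would probably keep the counts in a database and update them as posts arrive, rather than recomputing everything in memory.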

Then the most popular post related to that tag = the blog post most commonly linked to from the other posts that contain that tag.
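
Picking that post could look something like the following. This is only a sketch: it assumes you have already extracted each post's URL, tags, and outgoing links, and the dictionary layout, the example URLs, and the function name most_linked_post are invented here.

```python
from collections import Counter

def most_linked_post(posts, tag):
    """Return the URL most often linked to by posts carrying `tag`, or None."""
    inbound = Counter()
    for post in posts:
        if tag in post["tags"]:
            # Count each target once per linking post and skip self-links.
            for target in set(post["links"]) - {post["url"]}:
                inbound[target] += 1
    if not inbound:
        return None
    # Break ties alphabetically so the result is deterministic.
    return min(inbound, key=lambda url: (-inbound[url], url))

# Made-up sample data.
posts = [
    {"url": "a.example/1", "tags": {"apple"}, "links": ["b.example/1"]},
    {"url": "c.example/1", "tags": {"apple"}, "links": ["b.example/1", "a.example/1"]},
    {"url": "b.example/1", "tags": {"apple"}, "links": []},
    {"url": "d.example/1", "tags": {"election"}, "links": ["a.example/1"]},
]
winner = most_linked_post(posts, "apple")
print(winner)
```

Note that this version counts links to any URL, including pages outside your aggregated set; filter the targets if you only want posts you have indexed.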

For something more sophisticated, you can calculate the weight of a link using a PageRank-style calculation (http://www.webworkshop.net/pagerank.html), which is effectively the probability that, when browsing randomly, you will land on a specific page, i.e. the most popular one.
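
For the curious, here is a small power-iteration sketch of a PageRank-style score. It is a simplified textbook variant, not necessarily the exact formula on the linked page: the damping factor of 0.85 and the fixed 50 iterations are assumptions, pages without outgoing links spread their rank evenly, and the link graph is made up.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]}. Returns {page: rank}; ranks sum to 1."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    if n == 0:
        return {}
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                # Each page passes an equal share of its rank to every target.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Made-up link graph: three posts link to "c", and "c" links back to "a".
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
best = max(ranks, key=ranks.get)
print(best)
```

The ranks can be read as probabilities because they add up to 1, which matches the "randomly browsing" interpretation above.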

/My 2cents

Ben Reeves
+1  A: 

There is no single "correct" way of doing this. There are several approaches, and you need to choose one that you can implement and run and that behaves the way you like. Start with something simple that you understand and go from there.

For example:

Ben Reeves suggested a tag-frequency ratio (the number of blog posts with a specific tag divided by the total number of blog posts) and a PageRank approach. If these select topics in a way that works for you, go with them.

Here are a couple of other suggestions:

You could add weights for posts that depend on how popular the hosting web site is. For example, something posted on the New York Times should probably be considered more popular than something on Joe Shmoe's blog and should receive more weight. This is similar to a PageRank approach, and in practice may make little difference.
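
As a sketch of the site-weight idea: instead of adding 1 per post, add the weight of the site it came from. The weights, the site names, and the (site, topic) input format below are all made up; where real weights would come from (traffic, inbound links, a hand-made list) is up to you.

```python
from collections import defaultdict

# Hypothetical weights; sites not listed fall back to DEFAULT_WEIGHT.
SITE_WEIGHT = {"nytimes.com": 5.0, "joeshmoe.example": 1.0}
DEFAULT_WEIGHT = 1.0

def weighted_topic_scores(posts):
    """posts: list of (site, topic) pairs. Returns {topic: summed site weight}."""
    scores = defaultdict(float)
    for site, topic in posts:
        scores[topic] += SITE_WEIGHT.get(site, DEFAULT_WEIGHT)
    return dict(scores)

# Made-up sample data: one post on a heavy site, three on light ones.
posts = [
    ("nytimes.com", "election"),
    ("joeshmoe.example", "gadgets"),
    ("other.example", "gadgets"),
    ("joeshmoe.example", "gadgets"),
]
scores = weighted_topic_scores(posts)
print(scores)
```

Here a single post on the heavily weighted site outscores three posts on small blogs, which may or may not be what you want; tune the weights accordingly.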

You could add a time factor, so that how quickly posts arrive for a topic matters. E.g. if topic B has 30 posts from last week, and topic C has 10 posts from today, you might want to consider topic C to be more popular. What if topic D has 2 posts a week over the last year? What about topic E, which has 5 posts in the last hour?
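
One simple way to try out a time factor is exponential decay: every post starts with a score of 1 and loses half of it every so many hours. This is just one option among many; the 24-hour half-life and the post ages below (my reading of topics B to E) are assumptions.

```python
def decayed_score(post_ages_hours, half_life_hours=24.0):
    """Sum of per-post scores, where a post's score halves every half-life."""
    return sum(0.5 ** (age / half_life_hours) for age in post_ages_hours)

# Made-up ages (in hours) for the topics mentioned above.
topic_ages = {
    "B": [168.0] * 30,                      # 30 posts, all about a week old
    "C": [6.0] * 10,                        # 10 posts from today
    "D": [84.0 * k for k in range(1, 105)], # about 2 posts a week for a year
    "E": [0.5] * 5,                         # 5 posts in the last hour
}
scores = {topic: decayed_score(ages) for topic, ages in topic_ages.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

A shorter half-life favours bursts like topic E, and a longer one favours steady topics like D, so that one parameter is where you decide what "popular" means.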

leif
+1  A: 

Variables:

  • Count
  • Time
  • Content

Count the number of times the content occurs. If it occurs sufficiently often then it qualifies. It also needs to have occurred recently otherwise the count is not relevant. The content needs to be well related to avoid false positives.
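
Putting the three variables together might look like this. It is a minimal sketch with invented thresholds: "well related" is approximated by word overlap (Jaccard similarity) between titles, which is my assumption and is much cruder than a real keyword extractor.

```python
def jaccard(a, b):
    """Word-overlap similarity between two titles, from 0.0 to 1.0."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

def qualifies(seed_title, posts, min_count=3, max_age_hours=48, min_similarity=0.3):
    """posts: list of (title, age_hours). True if enough recent, related posts exist."""
    matches = [
        title
        for title, age in posts
        if age <= max_age_hours                           # Time
        and jaccard(seed_title, title) >= min_similarity  # Content
    ]
    return len(matches) >= min_count                      # Count

# Made-up sample data.
seed = "apple announces new iphone"
posts = [
    ("Apple announces new iPhone today", 2),
    ("New iPhone announced by Apple", 5),
    ("Apple announces iPhone pricing", 10),
    ("Apple pie recipe", 1),             # recent but unrelated
    ("Apple announces new iPhone", 500), # related but too old
]
print(qualifies(seed, posts))
```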

Have a look at the Yahoo contextual search and keywords API for starters.

aleemb