Hi

I've been trying to learn text mining and related topics in the collective intelligence field. I'd like to build an app that scans through a document and shows related posts/articles on the page.

Which algorithm(s) would help retrieve the required information?

Thanks

/A

+1  A: 

A simple method is to count the non-common words (everything except filler words such as "the" or "and") and how often each appears on the page. The more often a word shows up, the better it tends to describe the content of the post. You can then use the most frequent words to look up other articles/posts.
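
A minimal sketch of that idea in Python, in case it helps. The stop-word list, the function names `top_keywords` and `related`, and the "rank by shared keywords" step are all assumptions made for illustration; a real app would want a much longer stop-word list and probably some weighting.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
              "on", "for", "with", "that", "this"}

def top_keywords(text, n=5):
    """Return the n most frequent non-common words in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(n)]

def related(text, corpus, n=5):
    """Rank the titles in corpus (a dict of title -> body) by shared keywords."""
    keywords = set(top_keywords(text, n))
    scored = []
    for title, body in corpus.items():
        overlap = len(keywords & set(top_keywords(body, n)))
        if overlap:
            scored.append((overlap, title))
    # Most shared keywords first; ties broken alphabetically by title.
    return [title for _, title in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

You would call `related(current_post_text, all_posts)` and show the first few titles it returns.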

Jonathan Sampson
+1  A: 

You can use the Resource Description Framework (RDF). RDF stores contain structured knowledge and the connections between the things they describe. So you can fetch the RDF records for every word in the text and connect them into a graph. The nodes with the most edges, and the root nodes (if the graph is tree-like), will point to the theme of the document.
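
A rough sketch of the "node with the most edges" part, assuming the records have already been fetched. The `TRIPLES` list is hand-written toy data standing in for a real RDF store, and `theme_nodes` is a hypothetical helper name; no actual RDF library or query is involved here.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples standing in for
# records fetched from an RDF store.
TRIPLES = [
    ("python", "is_a", "programming_language"),
    ("regex", "used_in", "programming_language"),
    ("parser", "used_in", "programming_language"),
    ("pasta", "is_a", "food"),
]

def theme_nodes(words, triples, n=1):
    """Return the n nodes with the most distinct neighbours."""
    wanted = set(words)
    edges = defaultdict(set)
    for subj, _pred, obj in triples:
        # Keep only triples that mention a word from the document.
        if subj in wanted or obj in wanted:
            edges[subj].add(obj)
            edges[obj].add(subj)
    # Highest degree first; ties broken alphabetically.
    ranked = sorted(edges, key=lambda node: (-len(edges[node]), node))
    return ranked[:n]
```

With the toy data above, a text mentioning "python", "regex" and "parser" connects three times to `programming_language`, so that node comes out as the likely theme.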

Tiendil
Do you have an example of what you described, in the context of extracting related content?
Volatil3