I'm trying to build an RSS news-fetching server that collects all the news about a topic from a few sites. These sites often carry similar stories with nearly the same information. How could I group such stories, for example by displaying the first one and then a summary of links to the others?

Does anybody have experience with this?

+2  A: 

Look for keywords (e.g., split the description into words and remove any of the 100 or so most common words), then clump the items by co-occurrence of these keywords. Often just looking at the longest word will give you a good quick approximation.
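A rough sketch of the keyword step, in case it helps. The stop-word list here is a tiny stand-in for the "100 or so most common words", and the names STOP_WORDS, extract_keywords and longest_word are made up for illustration:

```python
import re

# Assumption: a short stand-in list; a real one would hold ~100 common words.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "on", "for", "is",
    "are", "was", "with", "at", "by", "from", "that", "this", "it", "as",
}

def extract_keywords(text):
    """Split text into lowercase words and drop the common ones."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {w for w in words if w not in STOP_WORDS}

def longest_word(text):
    """Quick approximation: the longest keyword (ties broken alphabetically)."""
    keywords = extract_keywords(text)
    if not keywords:
        return None
    return max(keywords, key=lambda w: (len(w), w))
```

Two descriptions whose keyword sets overlap heavily (or that share the same longest word) are candidates for the same clump.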

In other words, if you have a table of "topic groups", you can assign each item to a new or existing topic group as it comes in. First, see if any of the existing topic groups shares enough keywords with the new item; if one does, put the item there. If none does, create a new topic group with the item's keywords and add the item as its first member.
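The assignment loop described above could look something like this. It is only a sketch: an in-memory list stands in for the "topic groups" table, "enough keywords" is assumed to mean at least min_shared shared words (2 here, an arbitrary choice), and all names are hypothetical:

```python
import re

# Assumption: a short stand-in list; a real one would hold ~100 common words.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "on", "for", "is",
    "are", "was", "with", "at", "by", "from", "that", "this", "it", "as",
}

def extract_keywords(text):
    """Split text into lowercase words and drop the common ones."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {w for w in words if w not in STOP_WORDS}

def assign_to_group(groups, item, min_shared=2):
    """Put item into the first group sharing enough keywords, else start a new one.

    groups is a list of dicts: {"keywords": set of str, "items": list of str}.
    Returns the index of the group the item ended up in.
    """
    keywords = extract_keywords(item)
    for index, group in enumerate(groups):
        # Count keywords the new item has in common with this group.
        if len(group["keywords"] & keywords) >= min_shared:
            group["items"].append(item)
            return index
    # No group matched: the item founds a new group with its own keywords.
    groups.append({"keywords": keywords, "items": [item]})
    return len(groups) - 1

groups = []
for headline in [
    "Central bank raises interest rates",
    "Interest rates raised by the central bank again",
    "Local team wins the football championship",
]:
    assign_to_group(groups, headline)
```

For display, the first entry of each group's "items" can be shown as the main story and the rest as related links. In this sketch a group keeps only its founding item's keywords; merging in each new member's keywords is another option, but it lets groups drift over time.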

-- MarkusQ

MarkusQ
A: 

Hi, I am also facing the same problem. Please share more info if you have solved your issue.

Gourav