I have to build a tag cloud from a webpage/feed. Once you have the word frequency table of tags, building the tag cloud is easy. My question is: how do I retrieve the tags/keywords from the webpage/feed in the first place?
This is what I'm doing now:
Get the content -> strip HTML -> split the text on whitespace (space, newline, tab) -> keyword list
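A minimal sketch of that pipeline, assuming Python and only the standard library. The names `TextExtractor`, `strip_html`, and `keyword_frequencies` are hypothetical and chosen just for illustration, and fetching the content is left out, so the functions take the HTML as a string:

```python
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text nodes of a document and drops the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags. In this sketch, text inside
        # <script> and <style> elements is collected as well.
        self.chunks.append(data)


def strip_html(html):
    """Return the text content of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return " ".join(parser.chunks)


def keyword_frequencies(html):
    """Split the stripped text on whitespace and count each token."""
    # str.split() with no argument splits on any run of whitespace,
    # which covers spaces, newlines, and tabs.
    words = strip_html(html).split()
    return Counter(words)


if __name__ == "__main__":
    page = "<p>Tag cloud, tag\ncloud</p>"
    print(keyword_frequencies(page))
```

In this sketch the tokens keep their punctuation and case, so `cloud,` and `cloud`, or `Tag` and `tag`, are counted as different keywords.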
But this does not work very well.
Is there a better way?