I started working on a project in which I must tag documents with keywords. Doing it manually is hard and time-consuming, especially if you have thousands of documents, so I am planning to automate the process. I know the result would not be perfect, but it would at least give you some suggested tags. The latest Firefox version implements a system like this: when you bookmark a page, it suggests some tags.

The Yahoo Term Extraction service is also a great example.

If anybody can help me with this problem, I would really appreciate it. And if someone knows how the Firefox tagging system works, a little help there would be great too.

+1  A: 

Would a statistical algorithm work? Something Bayesian perhaps? I know they're used in spam filtering, maybe you can adapt a Bayes filter to suit your needs.
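
To make the Bayes idea a bit more concrete, here is a minimal sketch of a naive Bayes tag suggester in plain Python. It assumes you already have some documents that were tagged by hand to train on. The class name `NaiveBayesTagger`, its methods, and the sample texts are made up for illustration, and treating each tag as its own class is a simplification, not how any particular spam filter or Firefox does it.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTagger:
    """Hypothetical helper: ranks known tags for a new document."""

    def __init__(self):
        self.tag_doc_counts = Counter()          # tag -> number of training docs
        self.word_counts = defaultdict(Counter)  # tag -> word -> occurrences
        self.vocabulary = set()                  # every word seen in training

    def train(self, text, tags):
        """Record one manually tagged document."""
        words = tokenize(text)
        self.vocabulary.update(words)
        for tag in tags:
            self.tag_doc_counts[tag] += 1
            self.word_counts[tag].update(words)

    def suggest(self, text, max_tags=3):
        """Return up to max_tags known tags, most probable first."""
        # Words never seen in training carry no information, so skip them.
        words = [w for w in tokenize(text) if w in self.vocabulary]
        # Simplification: the prior is based on tag assignments, not documents.
        total = sum(self.tag_doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for tag, doc_count in self.tag_doc_counts.items():
            tag_words = sum(self.word_counts[tag].values())
            score = math.log(doc_count / total)
            for w in words:
                # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
                count = self.word_counts[tag][w]
                score += math.log((count + 1) / (tag_words + vocab_size))
            scores[tag] = score
        return sorted(scores, key=scores.get, reverse=True)[:max_tags]


if __name__ == "__main__":
    tagger = NaiveBayesTagger()
    tagger.train("python list comprehension and generators", ["python"])
    tagger.train("python decorators and generators explained", ["python"])
    tagger.train("baking bread with sourdough starter", ["cooking"])
    tagger.train("sourdough bread recipe with flour", ["cooking"])
    print(tagger.suggest("a new recipe for sourdough bread", max_tags=1))
```

Note that this can only suggest tags it has already seen in the training documents, so it works best once a fair number of documents have been tagged manually.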

At the very least, you could suggest words that are used frequently in the document but are not common English words (he, she, I, and, it, then, or, etc.).
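
A rough sketch of that frequency idea, assuming plain-text input: count the words, drop the common ones, and offer the most frequent of what is left. The function name `suggest_tags`, its parameters, and the tiny stop-word list are only illustrative; a real list of common words would be much longer.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, far from complete).
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "he", "i",
    "in", "is", "it", "of", "on", "or", "she", "that", "the", "then", "this",
    "to", "was", "with", "you",
}


def suggest_tags(text, max_tags=5, min_length=3):
    """Return the most frequent words that are not stop words."""
    # Lowercase first so "Spam" and "spam" are counted together.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(
        w for w in words if w not in STOP_WORDS and len(w) >= min_length
    )
    return [word for word, _ in counts.most_common(max_tags)]


if __name__ == "__main__":
    sample = ("The spam filter marks the message as spam, and then the "
              "filter moves it to the spam folder.")
    print(suggest_tags(sample, max_tags=2))  # prints ['spam', 'filter']
```

This does no stemming, so "filter" and "filters" would be counted separately, and it knows nothing about multi-word phrases.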

Charlie Salts