How does one automatically find categories for text based on content?
+1
A:
There is a good paper written on this: http://www.cs.utexas.edu/users/hyukcho/classificationAlgorithm.html
Geoffrey Chetwood
2008-09-15 18:38:01
A:
The best way to categorize content, be it text or multimedia, is to use a taxonomy. Most of the well-known CMSs have built-in support for taxonomies, and Drupal has some of the best taxonomy support among them.
Jahangir
2008-09-15 18:53:07
I don't think I'd call this the best way. I'd call it *a way*.
Gregg Lind
2008-10-20 19:24:28
+1
A:
- Read Data Mining: Practical Machine Learning Tools and Techniques by Ian H. Witten and Eibe Frank
- Use Weka or Orange
Roberto Russo
2008-12-31 18:17:23
A:
I would encourage you to look at the text classification modules bundled with the Natural Language Toolkit (NLTK). Even if you're not familiar with Python, I think you'll find the API rather intuitive. There are many good examples in the NLTK Book, and the people on the mailing list are quite helpful as well.
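To give a feel for the API, here is a minimal sketch using NLTK's `NaiveBayesClassifier`. The training sentences, the category labels and the `features` helper are all made up for illustration, and whitespace tokenization stands in for a real tokenizer; a real application would need far more labeled data.

```python
import nltk

def features(text):
    # Hypothetical helper: bag-of-words features, one boolean per token.
    # Splitting on whitespace is a simplification, not a real tokenizer.
    return {word: True for word in text.lower().split()}

# Made-up toy training data: (text, category) pairs.
train = [
    ("the team won the match in overtime", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the coach praised the players after the game", "sports"),
    ("the new phone has a faster processor", "tech"),
    ("the update fixes a bug in the software", "tech"),
    ("the laptop ships with more memory", "tech"),
]

# NLTK classifiers train on a list of (feature dict, label) pairs.
classifier = nltk.NaiveBayesClassifier.train(
    [(features(text), label) for text, label in train]
)

# Classify unseen text by converting it to the same kind of feature dict.
print(classifier.classify(features("a goal in the final match")))
```

The same feature dictionaries can be passed to NLTK's other classifiers, so it is easy to swap the algorithm once the feature extraction works.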
theycallmemorty
2009-07-01 12:42:19