I have a bunch of data harvested from a forum I own, and I would like to do some text mining or use a linguistic library to extract useful information from it.

Any text mining or data mining library, in any language, will do.

Thank you.

A: 

Mallet is a Java library designed for text mining. Once you have preprocessed the text data, a general data mining tool like Weka would also be sufficient for your task.
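As a rough idea of what "preprocessed" can mean here, below is a minimal sketch in plain Python (standard library only, not Mallet or Weka). It lowercases each post, splits it into word tokens, drops a few stopwords, and counts the remaining terms. The stopword list, the function names `preprocess` and `term_counts`, and the sample posts are all made up for illustration; a real pipeline would likely use a fuller stopword list and possibly stemming.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "and", "or", "is", "to", "of", "in", "it", "i"}


def preprocess(post):
    """Lowercase a post, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9']+", post.lower())
    return [t for t in tokens if t not in STOPWORDS]


def term_counts(posts):
    """Count how often each remaining token appears across all posts."""
    counts = Counter()
    for post in posts:
        counts.update(preprocess(post))
    return counts


# Hypothetical forum posts standing in for the harvested data.
posts = [
    "The new release is great, and the installer is fast.",
    "Installer crashed on the new release.",
]
counts = term_counts(posts)
print(counts.most_common(3))
```

Term counts like these can then be exported (for example as CSV) and fed into whichever general-purpose tool you pick.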

If you have access to SPSS or SAS, their products should be easier to use.

Yin Zhu
+1  A: 

You may like to have a look at the Python NLTK (Natural Language ToolKit): it's specifically designed for this kind of thing.

There is also a great book you can buy to get you started.

jkp
+1  A: 

I recommend that you have a look at R. It has a large number of text mining packages: see the Natural Language Processing task view. In particular, look at the tm package. Here are some relevant links:

Another useful package for this is Gary King's ReadMe package.

Shane