Hi,

I would like to build a language model for a text corpus. Are there good out-of-the-box toolkits that would make this task easier? The only toolkit I know of is the Statistical Language Modelling (SLM) Toolkit by CMU.

Regards,

+2  A: 

NLTK is very powerful, though I've never used it.

Ned Batchelder
@Ned +1. The Natural Language Toolkit (NLTK) is your best choice. Download it from nltk.org or buy the book from the O'Reilly site. It is close to a must-have, IMO.
jim mcnamara
I have used NLTK in the past, but I never knew it could be used to build language models.
Denzil
http://nltk.googlecode.com/svn/trunk/doc/api/nltk.model.ngram.NgramModel-class.html I finally got hold of the class, but there seems to be no documentation for it!
Denzil
I can safely say NLTK is not really that powerful after all. Reason: http://code.google.com/p/nltk/issues/detail?id=232 To be honest, it is absolutely disappointing to try to build what is a "basic" model in machine learning and find that it is not only NOT implemented in NLTK, but that very few toolkits for popular languages like Java/Python implement it either.
Denzil
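
For anyone landing on this thread, the kind of "basic" model being discussed can be sketched without any toolkit. Below is a minimal bigram language model with add-one (Laplace) smoothing, using only the Python standard library. It is an illustration under stated assumptions, not NLTK's API: the names `train_bigram_model` and `bigram_prob`, the `<s>`/`</s>` boundary markers, and the toy corpus are all made up for this example.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams over pre-tokenized sentences.

    Hypothetical helper for illustration; not part of any toolkit.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        # Pad each sentence with assumed start/end markers.
        padded = ["<s>"] + list(tokens) + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """Add-one smoothed estimate of P(word | prev)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


# Toy corpus, invented for this example.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
unigrams, bigrams = train_bigram_model(corpus)

p_seen = bigram_prob("the", "cat", unigrams, bigrams)
p_unseen = bigram_prob("cat", "dog", unigrams, bigrams)
print(p_seen, p_unseen)
```

Add-one smoothing is used here only because it is the simplest way to avoid zero probabilities for unseen bigrams; a real corpus would likely call for a better smoothing method and proper tokenization.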