Are there any good APIs and public datasets (dictionaries, phrases) for working with natural languages?

Specifically, do any good ones exist for working on translation between English and Korean?

+1  A: 

For English I use OpenNLP.

Unfortunately, I've never seen anything Korean-related, except the Google Language Detection and Translation APIs. They're quite easy to use.

muriloq
+3  A: 

WordNet is a classic data resource for English, with semantic relationships.

Nick Fortescue
A: 

MontyLingua might come in handy for an intermediate layer between English and Korean.

Paul Reiners
A: 

The Natural Language Toolkit (NLTK) is an excellent resource if you're considering Python as a language. It incorporates much of what you'd expect in a text processing/NLP environment, such as parsers, stemmers, and part-of-speech taggers. Its documentation is pretty good too.

As for datasets, NLTK comes with a variety of annotated corpora and textual data sets for experimenting with.
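To give a rough feel for it, here is a minimal sketch of tokenizing and stemming English text with NLTK. It assumes the third-party `nltk` package is installed; the helper name `stem_tokens` is just something made up for this illustration, and the two tools used (`wordpunct_tokenize` and `PorterStemmer`) were picked because they work without downloading any corpora. Note that this covers English only and does nothing for Korean.

```python
# Assumes NLTK is installed (pip install nltk).
# Neither tool below needs a corpus download.
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize


def stem_tokens(text):
    """Hypothetical helper: tokenize text and stem each alphabetic token."""
    stemmer = PorterStemmer()
    # wordpunct_tokenize splits on word and punctuation boundaries via a regex.
    tokens = wordpunct_tokenize(text)
    # Keep only alphabetic tokens so punctuation is dropped before stemming.
    return [stemmer.stem(tok) for tok in tokens if tok.isalpha()]


if __name__ == "__main__":
    print(stem_tokens("The parsers were running quickly."))
```

Part-of-speech tagging and the bundled corpora follow a similar pattern, but they need the relevant data packages fetched first with `nltk.download()`.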

Hope it helps, B.

bohana