I'm looking for seed data to load into my dictionary. I only need each word's orthographic representation (its spelling) and its definition.

Ideally this would be a single text file that I can parse and then load into my database. I'm using Rails, so if anyone knows of a gem or plugin that can do this, that would be nice.

+1  A: 

One such database I know of is WordNet, but it is not a single text file. You would have to parse out what you need and convert it into the format you want.
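As a rough sketch of what that parsing could look like, the Ruby below reads lines in the layout used by WordNet's plain-text data files (`data.noun`, `data.verb`, and so on): space-separated fields, a two-digit hexadecimal word count in the fourth field, then the words, and finally the gloss after a `|`. The method names `parse_wordnet_line` and `wordnet_entries` are made up for this example, and the sample line is shortened, so treat it as a starting point rather than a complete reader.

```ruby
# Parse one line of a WordNet data file into [word, definition] pairs.
# Returns [] for the licence header (its lines start with two spaces)
# and for anything without a gloss.
def parse_wordnet_line(line)
  return [] if line.start_with?("  ") || line.strip.empty?

  fields_part, gloss = line.chomp.split(" | ", 2)
  return [] if gloss.nil?

  fields = fields_part.split(" ")
  word_count = fields[3].hex # w_cnt is written in hexadecimal

  # Words start at index 4 and alternate with a lex_id field.
  words = Array.new(word_count) do |i|
    fields[4 + 2 * i]
      .sub(/\((a|p|ip)\)\z/, "") # drop adjective position markers
      .tr("_", " ")              # multi-word entries use underscores
  end

  words.map { |word| [word, gloss.strip] }
end

# Hypothetical helper: stream a whole data file without loading it at once.
def wordnet_entries(path)
  File.foreach(path).flat_map { |line| parse_wordnet_line(line) }
end

# Shortened sample line (pointer fields omitted, so p_cnt is 000).
sample = "00001740 03 n 02 entity 0 physical_thing 0 000 | something that exists"
parse_wordnet_line(sample).each do |word, definition|
  puts "#{word}: #{definition}"
end
```

The gloss usually holds the definition followed by quoted example sentences, so you may want to split it further before saving. Each pair could then be written to whatever model you use for dictionary entries, for instance from `db/seeds.rb`.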

There are others as well. GCIDE is an XML-based database that includes not only definitions from WordNet but also some from the 1913 edition of Webster's Revised Unabridged Dictionary (the latter is now in the public domain in the U.S.).

idealmachine
The WordNet db is written in Prolog, and I'm not sure how Prolog actually finds the words. It looks like the words and definitions are converted into numbers that Prolog then knows how to read, but I'm not sure.
Sam
+3  A: 

Here you go:

http://www.gutenberg.org/ebooks/673

It might have more information than you need, but you can parse out what you want. Project Gutenberg converts public-domain books (including those whose copyright has expired) to text form.
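A minimal Ruby sketch of that parsing step follows. It assumes a layout I have not verified against this particular file: each headword sits in capitals on a line of its own, and each definition paragraph starts with `Defn:` and may wrap over several lines. The `HEADWORD` pattern, the `parse_dictionary` method, and the file name `dictionary.txt` are all made up for illustration, so check a few pages of the actual text and adjust.

```ruby
# Assumed layout: a headword is a line made only of capitals, spaces,
# apostrophes, and hyphens. Adjust after inspecting the real file.
HEADWORD = /\A[A-Z][A-Z' -]*\z/

# Takes any enumerable of lines and returns { "word" => ["definition", ...] }.
def parse_dictionary(lines)
  entries = Hash.new { |hash, key| hash[key] = [] }
  headword = nil
  in_definition = false

  lines.each do |raw|
    line = raw.strip
    if line.empty?
      in_definition = false # a blank line ends the current paragraph
    elsif line.match?(HEADWORD)
      headword = line.downcase
      in_definition = false
    elsif headword && line.start_with?("Defn:")
      entries[headword] << line.sub(/\ADefn:\s*/, "")
      in_definition = true
    elsif headword && in_definition
      entries[headword].last << " " << line # wrapped continuation line
    end
  end

  entries
end

# For a large file, stream it line by line instead of reading it whole:
# entries = parse_dictionary(File.foreach("dictionary.txt"))
```

Everything that is neither a headword nor part of a `Defn:` paragraph (pronunciation, etymology, and so on) is ignored, which matches the "just the word and the definition" requirement.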

Jaime Bellmyer
I will check it out. It's a 50 MB file :)
Sam