I'd like to search for words in the OS X system dictionary (or dictionaries) using a simple glob or regex rather than an exact, known term. (Currently I'm using /usr/share/dict/words instead, but the OS X dictionary would be a lot nicer.)

The Dictionary Services interface is quite limited and doesn't allow this, but it seems like DCSGetTermRangeInString might be doing something similar under the hood. Does anyone know of a way to access such functionality?

Alternatively, is there a way to extract a word list from the dictionary? I could then grep that. Some dictionaries seem to include the source XML in the bundle, which should be easy enough to parse, but (not surprisingly, I guess) the big language dictionaries only have the data in some binary format. Any clues as to what that might be?
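
For reference, the kind of search I have in mind looks roughly like the sketch below, run against a plain word list. This is only an illustration: the function names (`load_words`, `search_glob`, `search_regex`) are made up for the example, and it assumes the list lives at /usr/share/dict/words with one word per line.

```python
import fnmatch
import re
from pathlib import Path

# Assumed location of the BSD word list; adjust if yours differs.
WORDS_PATH = Path("/usr/share/dict/words")


def load_words(path=WORDS_PATH):
    """Read a one-word-per-line file into a list, skipping blank lines."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return [line.strip() for line in fh if line.strip()]


def search_glob(words, pattern):
    """Return words matching a shell-style glob (case-insensitive)."""
    # fnmatch.translate turns the glob into an anchored regular expression.
    regex = re.compile(fnmatch.translate(pattern), re.IGNORECASE)
    return [w for w in words if regex.match(w)]


def search_regex(words, pattern):
    """Return words that match the regex in full (case-insensitive)."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [w for w in words if regex.fullmatch(w)]


if __name__ == "__main__" and WORDS_PATH.exists():
    words = load_words()
    print(search_glob(words, "w?lk*")[:10])
    print(search_regex(words, r".*talk.*")[:10])
```

The same two functions would work unchanged on any word list extracted from a dictionary, if such an extraction turns out to be possible.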

A: 

The dictionary Apple provides in OS X is licensed from one of the major publishers. Legally, they can't let you dump the whole word list.

NSResponder
Granted -- that was what I meant when I said it was not surprising the data were only in binary form -- but that doesn't mean they couldn't be searchable. And, not knowing the contractual details, it's not clear how much protection Apple are required to provide. Are the files merely obscure or actually DRM-ed?
walkytalky
Realistically, I guess I couldn't expect any other answer. Allowing a search with unlimited results would be tantamount to publishing the word list. Limiting the results would be problematic. So unless I'm willing to reverse engineer the thing (I'm not), it looks like the BSD word list (or maybe something like SOWPODS) will have to do.
walkytalky