Do you know of any tool or library that splits German compound words like "Hochhaus" into their component words ("Hoch", "Haus")?
It would be great if it's open source and/or cheap.
Such things require a dictionary, and building one is not something you can do after work; it takes a lot of effort. However, there are some publicly available resources: there is probably a German community around OpenOffice that maintains such a dictionary.
You can have a look at aspell, but it works the other way around: you can tell it that compound words are OK, and it will no longer flag them when it can find both parts in the dictionary.
If you can't use aspell itself, its dictionary files might still be of use to you.
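If you go the dictionary-file route, a naive splitter could look roughly like the sketch below. It is only an illustration under some assumptions: the `WORDS` set is a made-up toy dictionary (in practice you would load a real word list), `split_compound` is a hypothetical helper name, and the list of linking elements is far from complete. It tries the longest possible first part and recurses on the rest, so it will get ambiguous compounds wrong.

```python
from typing import List, Optional, Set

# Toy dictionary for illustration only (assumption); load a real
# German word list here, e.g. from a spell checker's dictionary files.
WORDS: Set[str] = {"hoch", "haus", "tür", "schlüssel", "arbeit", "zimmer"}

# A few common German linking elements (Fugenelemente); not exhaustive.
LINKERS = ("", "s", "es", "n", "en")


def split_compound(word: str, words: Set[str] = WORDS,
                   min_len: int = 3) -> Optional[List[str]]:
    """Return the dictionary words that make up `word`, or None."""
    lower = word.lower()
    if lower in words:
        return [lower.capitalize()]
    # Try the longest candidate for the first part first.
    for i in range(len(lower) - min_len, min_len - 1, -1):
        head, tail = lower[:i], lower[i:]
        for linker in LINKERS:
            if linker and not head.endswith(linker):
                continue
            # Strip the linking element (if any) before the lookup.
            stem = head[:len(head) - len(linker)] if linker else head
            if len(stem) >= min_len and stem in words:
                rest = split_compound(tail, words, min_len)
                if rest is not None:
                    return [stem.capitalize()] + rest
    return None


if __name__ == "__main__":
    print(split_compound("Hochhaus"))
    print(split_compound("Arbeitszimmer"))
```

With the toy dictionary above, "Hochhaus" comes back as "Hoch" and "Haus", and the linking "s" in "Arbeitszimmer" is dropped. Anything not covered by the dictionary yields None, which is exactly why the quality of the word list matters so much.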