I've tried PorterStemmer and Snowball, but neither works on all words; both miss some very common ones.
My test words are: "cats running ran cactus cactuses community communities", and both get fewer than half right.
Ideally the class/function would be in PHP, but I can port it if it's in another language.
See also:
Stemming algorith...
I'm preparing some table names for an ORM, and I want to turn plural table names into single entity names. My only problem is finding an algorithm that does it reliably. Here's what I'm doing right now:
If a word ends with -ies, I replace the ending with -y
If a word ends with -es, I remove this ending. This doesn't always work, however...
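To make the failure mode concrete, here is a minimal sketch of the two rules exactly as described above. The function name `naive_singularize` and the sample table names are made up for illustration; this is not a recommended implementation.

```python
def naive_singularize(word: str) -> str:
    """Apply the two suffix rules from the question, in order."""
    if word.endswith("ies"):
        # -ies becomes -y
        return word[:-3] + "y"
    if word.endswith("es"):
        # -es is dropped entirely
        return word[:-2]
    return word


if __name__ == "__main__":
    for plural in ["categories", "boxes", "tables", "movies"]:
        print(plural, "->", naive_singularize(plural))
```

The first two words come out as "category" and "box", but "tables" becomes "tabl" and "movies" becomes "movy", which shows why stripping -es unconditionally is not reliable.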
I intend to use the SQL version of WordNet, and I have a problem finding a way to lemmatize words in order to find them in the DB; I can't use the WordNet lemmatizer itself because it is applied to the textual version of WordNet.
I've read here that there is a good lemmatizer that returns real words - and that's exactly what I need. I...
The title says it all: Given some (English) word that we shall assume is a plural, is it possible to derive the singular form? I'd like to avoid lookup/dictionary tables if possible.
Some examples:
Examples -> Example: a simple 's' suffix
Glitches -> Glitch: 'es' suffix, as opposed to above
Countries -> Country: 'ies' suffix....
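A dictionary-free approach to the three suffix cases listed above could be sketched as an ordered list of regular-expression rules, where the first matching rule wins. The rule set and the name `singularize` are assumptions for illustration only; the list is far from complete.

```python
import re

# Ordered (pattern, replacement) pairs; the first pattern that matches wins.
# This rule list is a hypothetical starting point, not a complete grammar.
RULES = [
    (re.compile(r"ies$"), "y"),               # countries -> country
    (re.compile(r"(ch|sh|ss|x)es$"), r"\1"),  # glitches -> glitch
    (re.compile(r"([^s])s$"), r"\1"),         # examples -> example
]


def singularize(word: str) -> str:
    """Return a best-guess singular form, lowercased."""
    lower = word.lower()
    for pattern, replacement in RULES:
        if pattern.search(lower):
            return pattern.sub(replacement, lower)
    # No rule applies (e.g. "glass", "men"): return the word unchanged.
    return lower
```

Irregular plurals such as "men" or "mice" pass through unchanged, and words like "cactuses" or "movies" are still mishandled, so suffix rules alone cannot be fully reliable without at least a small exception table.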
I know DBSight allows synonyms and stop words for searching, but does it also take care of the inflectional forms of a verb? E.g. for 'swim' it should find swim, swims, swimming, swam, and swum.
Link on the DBSight wiki: http://wiki.dbsight.com/index.php?title=User%5Fdictionary
...
When do I use each?
Also, is NLTK lemmatization dependent on parts of speech?
Wouldn't it be more accurate if it were?
...
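As a toy illustration of why part of speech matters for lemmatization, the sketch below looks up a lemma by (word, part of speech). The table `LEMMAS` and the function `lemmatize` are hypothetical stand-ins, not NLTK's API; NLTK's own `WordNetLemmatizer.lemmatize` does accept a `pos` argument, which defaults to noun.

```python
# Hypothetical exception table keyed by (surface form, part of speech).
# "n" = noun, "v" = verb; the entries are illustrative only.
LEMMAS = {
    ("saw", "v"): "see",
    ("saw", "n"): "saw",
    ("meeting", "v"): "meet",
    ("meeting", "n"): "meeting",
}


def lemmatize(word: str, pos: str = "n") -> str:
    """Return the lemma for (word, pos), or the word itself if unknown."""
    return LEMMAS.get((word, pos), word)


if __name__ == "__main__":
    print(lemmatize("saw", "v"))  # verb reading
    print(lemmatize("saw"))       # default noun reading
```

The same surface form maps to different lemmas depending on the tag, which is why a lemmatizer that is given the part of speech can be more accurate than one that assumes a single default.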
I have tried using a stemmer, but the words it produces are just not up to the mark. It would be great if you could point me to any lemmatizer script for Ruby, a lemmatizer gem, or an SQL query that returns the lemma of a word from the WordNet database.
Cheers!
...
Hi all,
I'm wondering whether the major SQL engines out there (MS SQL, Oracle, MySQL) have the ability to understand that two words are related because they share the same root.
We know it's easy to match "networking" when searching for "network" because the latter is a substring of the former.
But do SQL engines have functions that can mat...