Hello,
I am just getting started with text mining. I have two database tables with thousands of rows:
a table for "skills" and a table for "skill categories".
- Every "skill" belongs to a skill category.
- A "skill" is, physically, a varchar(200) field in the database, containing free text that describes the skill.
Here are some skills extracted from the skills table:
"PHP (good level), Java (intermediaite), C++" "PHP5" "project management and quality management" "begining Javascript" "water engineering" "dfsdf zerze rzer" "cibling customers"
What I want to do is extract knowledge from those fields, that is, keep only the actual skill and ignore the rest of the text (level remarks, filler words, gibberish). For the example above, I want to get an array containing only:
"PHP" "Java" "C++" "PHP5" "project management" "quality management" "Javascript" "water engineering" "cibling customers"
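To make the goal more concrete, here is a minimal Python sketch of a naive rule-based baseline, not a real text-mining solution. It assumes that skills inside one field are separated by commas, semicolons or the word "and", and that level remarks are either in parentheses or are one of a small hand-made list of words. The names split_candidates, count_candidates and LEVEL_WORDS are hypothetical and only serve the illustration.

```python
import re
from collections import Counter

# Hypothetical, hand-made list of "level" words to drop (an assumption,
# not something derived from the real table).
LEVEL_WORDS = {"beginning", "begining", "good", "intermediate", "advanced", "basic"}


def split_candidates(field):
    """Split one free-text skill field into rough candidate skill strings."""
    # Drop parenthesised remarks such as "(good level)".
    text = re.sub(r"\([^)]*\)", " ", field)
    # Split on commas, semicolons and the standalone word "and".
    parts = re.split(r"[,;]|\band\b", text, flags=re.IGNORECASE)
    candidates = []
    for part in parts:
        # Remove level words and collapse whitespace.
        words = [w for w in part.split() if w.lower() not in LEVEL_WORDS]
        if words:
            candidates.append(" ".join(words))
    return candidates


def count_candidates(fields):
    """Count how often each lower-cased candidate occurs across all fields."""
    counts = Counter()
    for field in fields:
        for candidate in split_candidates(field):
            counts[candidate.lower()] += 1
    return counts
```

Splitting alone cannot tell "water engineering" from "dfsdf zerze rzer". One possible heuristic, again only an assumption, would be to keep candidates that occur at least a few times in count_candidates over the whole table, since gibberish is unlikely to repeat.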
How should I go about extracting the skills from this much data? Do you know of specific algorithms for this, for example k-means?
Thanks in advance.