How can I find words like and, or, to, a, no, with, for etc. in a sentence using VB.NET and remove them. Also where can I find all words list like above.
Thanks
How can I find words like and, or, to, a, no, with, for etc. in a sentence using VB.NET and remove them. Also where can I find all words list like above.
Thanks
The easiest way is:
myString.Replace("and", "")
You'd loop over your word list and have a statement like the above. Google for a list of common English words?
List of English 2 Letter Words
List of English 3 Letter Words
You can match the words and remove them using regular expressions.
You can indeed replace your list of words using the .Replace function (as colithium described) ...
myString.Replace("and", "")
Edit:
... but indeed, a nicer way is to use Regular Expressions (as edg suggested) to avoid replacing parts of words.
As your question suggests that you would like to clean-up a sentence to keep meaningfull words, you have to do more than just remove two- and three letter words.
What you need is a list of stop-words: http://en.wikipedia.org/wiki/Stop_word
A comma seperated list of stop-words for the English language can be found here: http://www.textfixer.com/resources/common-english-words.txt
Note that unless you use Regex word boundaries you risk falling afoul of the Scunthorpe (Sfannythorpe) problem.
string pattern = @"\band\b";
Regex re = new Regex(pattern);
string input = "a band loves and its fans";
string output = re.Replace(input, ""); // a band loves its fans
Notice the 'and' in 'band' is untouched.