Is there a freely available list of the most common English words to remove from text when creating a search index?

+2  A: 

Wikipedia gives the 100 most frequent lemmas: http://en.wikipedia.org/wiki/Most_common_words_in_English

That might be good for a start; the article provides some good references.
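
For what it's worth, here is a minimal sketch of how such a list is typically applied when building an index. The `STOP_WORDS` set is only a small hand-picked sample of very common English words, not the full Wikipedia list, and `tokenize` and `build_index` are hypothetical helper names chosen for illustration.

```python
import re
from collections import defaultdict

# Illustrative sample only: swap in whichever full stop-word list you choose.
STOP_WORDS = frozenset({
    "the", "be", "to", "of", "and", "a", "in", "that",
    "have", "i", "it", "for", "not", "on", "with",
})

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_index(docs):
    """Map each non-stop-word token to the set of document ids containing it.

    `docs` is assumed to be a dict of {doc_id: text}.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            if token not in STOP_WORDS:
                index[token].add(doc_id)
    return index

if __name__ == "__main__":
    docs = {1: "The cat sat on the mat", 2: "A dog and a cat"}
    for word, ids in sorted(build_index(docs).items()):
        print(word, sorted(ids))
```

Keeping the list in a `frozenset` makes each membership check constant time, so the filter adds little cost even with a list of several hundred words.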

Hans W
+2  A: 

Here are the ones (plus characters) used in the SQL Server 2005 noise word list; I assume the 2008 stopwords are similar.

And the MSDN on it here

Hope this helps

Jammin