This code gives me the frequency of each word in a text:
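import operator
import re
from collections import defaultdict

# Split the text into words and count how often each one occurs.
fullWords = re.findall(r'\w+', allText)
d = defaultdict(int)
for word in fullWords:
    d[word] += 1
# Sort the (word, count) pairs by count, most frequent first.
finalFreq = sorted(d.iteritems(), key=operator.itemgetter(1), reverse=True)
self.response.out.write(finalFreq)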
The result also includes uninformative words like "the", "an", and "a".
My question: is there a stop-word library available for Python that can remove all these common words? I want to run this on Google App Engine.
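To make the goal concrete, here is a minimal sketch of the filtering I have in mind. The `STOP_WORDS` set and the `word_frequencies` function are hypothetical names I made up for illustration; the three-word set is only a placeholder and does not come from any library. A real stop-word list would be much longer.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stop-word set (an assumption for this sketch).
STOP_WORDS = {"the", "an", "a"}

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count the words in text, skipping any word found in stop_words."""
    counts = defaultdict(int)
    for word in re.findall(r'\w+', text):
        # Compare in lowercase so "The" and "the" are both filtered out.
        if word.lower() not in stop_words:
            counts[word] += 1
    # Sort the (word, count) pairs by count, most frequent first.
    return sorted(counts.items(), key=lambda pair: pair[1], reverse=True)
```

This uses only the standard library, so it should not need anything beyond what the counting code above already relies on. What I am missing is a ready-made, reasonably complete word list to put in place of the placeholder set.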