I'm helpless on regular expressions so please help me on this problem.
Basically I am downloading web pages and rss feeds and want to strip everything except plain words. No periods, commas, if, ands, and buts. Literally I have a list of the most common words used in English and I also want to strip those too but I think I know how to do that and don't need a regular expression because it would be really way to long.
How do I strip everything from a chunk of text except words that are delimited by spaces? Everything else goes in the trash.
This works quite well thanks to Pavel .split(/[^[:alpha:]]/).uniq!