I'd like to be able to write something like the following. Can someone show me how to write a clean WordReader class in C#? (A word is [a-zA-Z]+.)

    public List<string> GetSpecialWords(string text)
    {
        string word;
        List<string> specialWords = new List<string>();
        using (WordReader wr = new WordReader(text))
        {
            while (true)
            {
                word = wr.Read();
                if (word == null) break;
                if (isSpecial(word)) specialWords.Add(word);
            }
        }
        return specialWords; 
    }

    private bool isSpecial(string word)
    {
        //some business logic here
        throw new System.NotImplementedException(); // placeholder so the method compiles
    }
A: 

I would read valid word characters until you hit a space or punctuation. You'll want to keep track of your index in the stream while skipping over punctuation and spaces, and also numbers in your case. This feels like homework, so I am going to leave the implementation up to you.

You should also consider hyphenated words: in your case, should they count as one word or two?
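For reference, the scanning approach described above could look roughly like the sketch below. It is only one possible shape, not a definitive implementation: the member names `_text`, `_position`, and `IsWordChar` are made up for illustration, and `IDisposable` is implemented only because the question wraps the reader in a `using` block.

```csharp
using System;

// Hypothetical WordReader: scans the text by hand, treating [a-zA-Z]+ as a word.
public sealed class WordReader : IDisposable
{
    private readonly string _text;
    private int _position; // index of the next unread character

    public WordReader(string text)
    {
        if (text == null) throw new ArgumentNullException("text");
        _text = text;
    }

    // Returns the next word, or null once the text is exhausted.
    public string Read()
    {
        // Skip spaces, punctuation, digits, and anything else that is not a letter.
        while (_position < _text.Length && !IsWordChar(_text[_position]))
            _position++;

        if (_position == _text.Length)
            return null;

        // Consume letters until the word ends.
        int start = _position;
        while (_position < _text.Length && IsWordChar(_text[_position]))
            _position++;

        return _text.Substring(start, _position - start);
    }

    private static bool IsWordChar(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Nothing to release for an in-memory string; present so that `using` compiles.
    public void Dispose() { }
}
```

Under the [a-zA-Z]+ definition, this sketch splits a hyphenated word such as "well-known" into two words, because the hyphen is treated like any other separator.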

phsr
+1  A: 

Regex.Match(text, "[a-zA-Z]+") should return you a word in the form of a Match object. You can use Regex.Matches to get all of the matched strings, or you can create a Regex instance and call regex.Match(text, startIndex), where startIndex is the end of the previous match (match.Index + match.Length), to get the next word.

MSDN: Regex object

http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.aspx
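As a rough sketch of the Regex route, the snippet below walks the matches one at a time with Match.NextMatch(), which has the same effect as restarting the search from the end of the previous match. The class name `WordExtractor` and the method `GetWords` are invented for this example and do not come from the question.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: collects every [a-zA-Z]+ run in the input.
public static class WordExtractor
{
    private static readonly Regex WordPattern = new Regex("[a-zA-Z]+");

    public static List<string> GetWords(string text)
    {
        List<string> words = new List<string>();

        // Match finds the first word; NextMatch continues after the previous one.
        for (Match m = WordPattern.Match(text); m.Success; m = m.NextMatch())
        {
            words.Add(m.Value);
        }

        return words;
    }
}
```

The same loop could sit behind a Read() method that keeps the current Match in a field, if the reader-style interface from the question is preferred.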

If you're not allowed to use Regex in your homework problem, well...

Toby