I'm looking for a RegEx that returns the first [n] words of a paragraph or, if the paragraph contains fewer than [n] words, the complete paragraph.
For example, assuming I need, at most, the first 7 words:
<p>one two <tag>three</tag> four five, six seven eight nine ten.</p><p>ignore</p>
I'd get:
one two <tag>three</tag> four five, six seven
And the same RegEx on a paragraph containing fewer than the requested number of words:
<p>one two <tag>three</tag> four five.</p><p>ignore</p>
Would simply return:
one two <tag>three</tag> four five.
My attempt at the problem resulted in the following RegEx:
^(?:\<p.*?\>)((?:\w+\b.*?){1,7}).*(?:\</p\>)
However, this captures only the first word, "one". I suspect the lazy .*? (after the \w+\b) is causing the problem.
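To make the failure easy to reproduce, here is a minimal C# sketch that runs the pattern above against the first sample. The class and method names are made up for illustration. The likely cause: .*? is lazy, so it first matches nothing; the second repetition then needs \w+ to match at the space after "one", which fails, and the engine settles for a single repetition because the trailing .* and </p> can still complete the match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyQuantifierDemo
{
    // The pattern from the question, unchanged.
    public const string Attempt = @"^(?:\<p.*?\>)((?:\w+\b.*?){1,7}).*(?:\</p\>)";

    // Returns capture group 1, or null when the pattern does not match.
    public static string FirstGroup(string input)
    {
        Match m = Regex.Match(input, Attempt);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        string sample =
            "<p>one two <tag>three</tag> four five, six seven eight nine ten.</p><p>ignore</p>";

        // The lazy .*? never has a reason to grow, so only one word is captured.
        Console.WriteLine(FirstGroup(sample)); // prints "one"
    }
}
```

Nothing forces the lazy quantifier to expand, because the rest of the pattern succeeds without it; the repetition count of {1,7} is satisfied by a single pass.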
Where am I going wrong? Can anyone present a RegEx that will work?
FYI, I'm using the .NET 3.5 RegEx engine (via C#).
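Since the target is .NET, one direction that might work is sketched below; it is not a definitive answer. It tries to match exactly [n] words, where the gap between two words may contain tags or non-word characters, and falls back to the whole paragraph if that fails. Assumptions: FirstWordsSketch and FirstWords are hypothetical names, maxWords is at least 1, the paragraph has a closing </p>, and anything in angle brackets is a tag rather than a word.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstWordsSketch
{
    // Hypothetical helper: returns the first maxWords words of the first <p>,
    // or the whole paragraph body when it holds fewer words. Null if no match.
    public static string FirstWords(string html, int maxWords)
    {
        // What may sit between words: any tag except </p>, or a single
        // character that is neither a word character nor '<'.
        const string gap = @"(?:<(?!/p>)[^>]*>|[^\w<])*";

        string pattern =
            @"^<p\b[^>]*>(" +
            @"(?:" + gap + @"\w+\b){" + maxWords + @"}" + // exactly maxWords whole words
            @"|.*?(?=</p>)" +                             // otherwise everything up to </p>
            @")";

        Match m = Regex.Match(html, pattern, RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(FirstWords(
            "<p>one two <tag>three</tag> four five, six seven eight nine ten.</p><p>ignore</p>", 7));
        // prints: one two <tag>three</tag> four five, six seven

        Console.WriteLine(FirstWords(
            "<p>one two <tag>three</tag> four five.</p><p>ignore</p>", 7));
        // prints: one two <tag>three</tag> four five.
    }
}
```

The \b after \w+ keeps the engine from splitting a word in two to reach the required count when the paragraph is short. One caveat: if the cut falls inside an element, the result can contain an opening tag without its closing tag.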
Many thanks