Consider this regex: <(.*)>

Applied against this string:

<2356> <my pal ned> <!@%@>

Obviously, it will match the entire string because of the greedy *. The best solution would be to use a non-greedy quantifier, like *?. However, many languages and editors don't support these.

For simple cases like the above, I've gotten around this limitation with a regex like this: <([^>]*)>
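To make the difference concrete, here is a minimal sketch in Python. Python's `re` module is only a stand-in here (it does support `*?`); the point is just to compare what the two patterns from the question capture on the sample string.

```python
import re

text = "<2356> <my pal ned> <!@%@>"

# Greedy: .* runs to the last '>' in the string, so there is one big match.
greedy = re.findall(r"<(.*)>", text)

# Negated character class: [^>]* cannot cross a '>', so each pair matches on its own.
negated = re.findall(r"<([^>]*)>", text)

print(greedy)   # ['2356> <my pal ned> <!@%@']
print(negated)  # ['2356', 'my pal ned', '!@%@']
```

The negated class works because the closing delimiter is a single character that can simply be excluded from the repeated part.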

But what could be done with a regex like this? start (.*) end

Applied against this string:

start 2356 end start my pal ned end start !@%@ end

Is there any recourse at all?

+5  A: 

If the end condition is a single character, you can use a negated character class instead:

<([^>]*)>

For more complex cases where the end condition is multiple characters, you could try a negative lookahead, but if lazy matching is not supported, chances are that lookaheads won't be either:

((?!end).)*
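As a rough illustration of the lookahead idea, here is a sketch in Python applied to the string from the question. The surrounding `start ` and ` end` literals and the non-capturing inner group are my additions, used so that the whole body lands in one capture group.

```python
import re

text = "start 2356 end start my pal ned end start !@%@ end"

# Each '.' is consumed only if "end" does not begin at that position,
# so the repetition can never run past the first "end".
lookahead = re.compile(r"start ((?:(?!end).)*) end")

lookahead_matches = lookahead.findall(text)
print(lookahead_matches)  # ['2356', 'my pal ned', '!@%@']
```

Note that "ned" is left alone, since the lookahead only rejects positions where the exact sequence "end" starts.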

Your last recourse is to construct something horrible like this:

(en[^d]|e[^n]|[^e])*
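Here is a small sketch in Python of how that alternation behaves on the string from the question. It uses only character classes and alternation, so it should port to engines without lazy quantifiers or lookaheads. The `start ` and ` end` wrapper and the variable names are assumptions for the demo.

```python
import re

text = "start 2356 end start my pal ned end start !@%@ end"

# Each alternative consumes text that does not spell "end" at that spot:
#   en[^d]  "en" followed by anything but "d"
#   e[^n]   "e" followed by anything but "n"
#   [^e]    any character other than "e"
body = r"(?:en[^d]|e[^n]|[^e])*"
alternation = re.compile(r"start (" + body + r") end")

alt_matches = alternation.findall(text)
print(alt_matches)  # ['2356', 'my pal ned', '!@%@']

# Caveat: e[^n] can swallow the "e" that begins a later "end",
# so the body on its own still accepts some strings containing "end".
accepts_eend = re.fullmatch(body, "eend") is not None
print(accepts_eend)  # True
```

So it handles the sample input, but it is not a strict "does not contain end" pattern: inputs such as "eend" or "enend" slip through, which is worth testing against your real data.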
Mark Byers
I hadn't thought of lookaheads. You're right about them also likely not being supported, but regex implementations are consistently surprising. That's one answer.
Ipsquiggle
Haha! That is horrible. But it's general, and sometimes you gotta do what you gotta do. Very clever.
Ipsquiggle
+2  A: 

I replace . with [^>] where > in this case is the next character in the RE.

Mark Ransom
Yes, that's already part of the question...
Ipsquiggle
And since there were no edits, it was part of the question from the beginning. I need to work on my reading comprehension.
Mark Ransom