I'm trying to match a string that looks something like this:

<$Fexample text in here>>

with this expression:

<\$F(.+?)>{2}

However, there are some cases where my backreferenced content includes a ">", thus something like this:

<$Fexample text in here <em>>>

only matches "example text in here <em" in the backreference. What do I need to do to return the correct backreference whether or not the content contains these HTML tags?

+3  A: 

Try

<\$F(.+?)>>(?!>)

The (?!>) ensures that only the last >> in a long run of consecutive > characters is matched as the closing delimiter, so any earlier > stays inside the capture group.
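
As a quick check of the lookahead version, here is a minimal Python sketch comparing it with the pattern from the question. The helper name `capture` and the two sample strings are only for illustration; the question does not say which regex engine is in use, so this assumes one that supports lazy quantifiers and negative lookahead, as Python's `re` module does.

```python
import re

# Pattern from the question and the lookahead variant from this answer.
ORIGINAL = r"<\$F(.+?)>{2}"
LOOKAHEAD = r"<\$F(.+?)>>(?!>)"


def capture(pattern, text):
    """Return the first capture group, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


plain = "<$Fexample text in here>>"
tagged = "<$Fexample text in here <em>>>"

# The original pattern stops at the first ">>", cutting off the tag's ">".
print(repr(capture(ORIGINAL, tagged)))

# The lookahead refuses a ">>" that is followed by another ">",
# so the lazy group keeps growing until the final pair.
print(repr(capture(LOOKAHEAD, plain)))
print(repr(capture(LOOKAHEAD, tagged)))
```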


Edit:

<\$F(.+?>*)>>

Also works.
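
A small sketch of the second pattern, again assuming Python's `re` module. The sample line with two occurrences is made up for illustration; it is meant to show that the pattern also behaves when the marker sits in the middle of a longer line.

```python
import re

# Variant that lets the capture group absorb trailing ">" characters.
TRAILING = re.compile(r"<\$F(.+?>*)>>")

# Hypothetical line containing one plain and one tagged occurrence.
line = "a <$Fone>> b <$Ftwo <b>>> c"

# ">*" is greedy, but it backtracks just far enough to leave ">>"
# for the closing delimiter, so each group keeps its own tag intact.
captures = TRAILING.findall(line)
print(captures)
```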

KennyTM
In case you are wondering, that's negative lookahead. See: [Lookaround](http://www.regular-expressions.info/lookaround.html)
NullUserException
+1: But I think <\$F(.+?)>>+ would be more performant as there's no backtracking.
CurtainDog
@Curtain: But then the extra `>` won't be in the capture group.
KennyTM
@KennyTM - Ah of course. My bad :(
CurtainDog
+4  A: 

You can add start and end anchors to the regex as:

^<\$F(.+?)>{2}$
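
A minimal sketch of how the anchored pattern behaves, assuming Python's `re` module; the embedded sample string is invented for illustration. Note that the anchors only help when the marker makes up the entire string.

```python
import re

# Anchored version: the closing ">>" must be the last two characters.
ANCHORED = re.compile(r"^<\$F(.+?)>{2}$")

whole = "<$Fexample text in here <em>>>"
embedded = "see <$Fexample text in here <em>>> for details"

whole_match = ANCHORED.search(whole)
embedded_match = ANCHORED.search(embedded)

# The "$" forces the lazy group to extend up to the final ">>".
print(whole_match.group(1))

# With text before or after the marker the anchors prevent any match.
print(embedded_match)
```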
codaddict
Yep, you were 35 seconds faster. +1.
Bart
I don't know if you can assume this from the question, else dropping the ? would suffice
CurtainDog
I forgot to mention the absence of anchors is intentional. The string may appear anywhere within the line.
Levi McCallum
A: 

Please note that to truly do what (I think) you want to do, you would have to parse well-formed (balanced) bracket expressions, which is not possible with a regular language.

In other words, <$Fexample <tag <tag <tag>>> example>> oh this should not happen> will return example <tag <tag <tag>>> example>> oh this should not happen as the capture group.
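
If nesting really has to be respected, one option is to leave regex behind and count brackets by hand. The following is only a sketch under stated assumptions: `extract_balanced` is a hypothetical helper, not something from this thread, and it assumes every inner `<` is closed by its own `>` and that the marker ends at the first `>>` found at nesting depth zero.

```python
def extract_balanced(text, opener="<$F"):
    """Return the content between opener and its closing '>>', or None.

    Assumption: inner '<' and '>' are balanced; the closing delimiter is
    the first '>>' seen while no inner bracket is open.
    """
    start = text.find(opener)
    if start == -1:
        return None
    begin = start + len(opener)
    depth = 0  # number of inner '<' currently open
    for i in range(begin, len(text)):
        ch = text[i]
        if ch == "<":
            depth += 1
        elif ch == ">":
            if depth > 0:
                depth -= 1  # closes an inner tag
            elif text[i + 1 : i + 2] == ">":
                return text[begin:i]  # '>>' at depth zero ends the marker
            # a lone '>' at depth zero is treated as ordinary text
    return None


nested = "<$Fexample <tag <tag <tag>>> example>> oh this should not happen>"
print(extract_balanced(nested))
print(extract_balanced("<$Fexample text in here <em>>>"))
```

Applied to the nested example above, the counter stops at the `>>` that follows the second "example" rather than running on into the trailing text.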

Michael