hi guys,

thanks for the help with my previous homework question, "Regex to match tags like <A>, <BB>, <CCC> but not <ABC>". now i have another homework question.

i need to match tags like <LOL>, <LOLOLOL> (3 uppercase letters, with the last two letters repeatable), but not <lol> (the letters need to be uppercase).

using the technique from the previous homework, i tried <[A-Z]([A-Z][A-Z])\1*>. this works, except there's an additional catch: the repeating part can be in mixed case!!!

so i also need to match <LOLolol>, <LOLOLOlol>, because it's 3 uppercase letters, with the last two letters repeatable in mixed case. i know you can make a pattern case-insensitive with /i, and that will let me match <LOLolol> with the regex i have, but it will also now match <lololol>, because the check for the first 3 letters is also case-insensitive.

so how do i do this? how can i check the first 3 letters case sensitively, and then the rest of the letters case-insensitively? is this possible with regex?

thanks!

+3  A: 

Yes! You can in fact do this in some flavors, using what is called an embedded modifier. This puts the modifier in the pattern itself, so you can select which parts of the pattern the modifier applies to.

The embedded modifier for case insensitivity is (?i), and its span form (?i:...) applies it to only the enclosed part of the pattern, so the pattern you want in this case is:

<[A-Z]([A-Z]{2})(?i:\1*)>
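As a quick check, here is a minimal sketch of that pattern in Python, assuming Python 3.6 or later (where the re module accepts the (?i:...) span syntax). The is_tag helper and the sample strings are only for illustration and are not part of the original question; other flavors may behave differently.

```python
import re

# The first three letters are matched case-sensitively; only the repeated
# backreference inside (?i:...) ignores case.
TAG = re.compile(r"<[A-Z]([A-Z]{2})(?i:\1*)>")

def is_tag(s):
    """Hypothetical helper: True if the whole string is a matching tag."""
    return TAG.fullmatch(s) is not None

samples = ["<LOL>", "<LOLOLOL>", "<LOLolol>", "<LOLOLOlol>", "<lol>", "<lololol>"]
for sample in samples:
    print(sample, is_tag(sample))
```

The first four samples should print True and the last two False, because the leading [A-Z]([A-Z]{2}) part sits outside the case-insensitive span.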

References

  • regular-expressions.info/Modifiers
    • Specifying Modes Inside The Regular Expression
      • Instead of /regex/i, you can also do /(?i)regex/
    • Turning Modes On and Off for Only Part of The Regular Expression
      • You can also do /first(?i)second(?-i)third/
    • Modifier Spans
      • You can also do /first(?i:second)third/
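To make the difference between the forms concrete, here is a small sketch in Python (again assuming Python 3.6+). It covers only the whole-pattern form and the modifier span; as far as I know, Python's re module does not accept the mid-pattern on/off toggles like (?i)...(?-i), so that form is left out. The variable names and test strings are made up for illustration.

```python
import re

# Whole-pattern modifier: (?i) at the very start makes everything case-insensitive.
whole = re.compile(r"(?i)first")

# Modifier span: only "second" ignores case; "first" and "third" stay case-sensitive.
span = re.compile(r"first(?i:second)third")

print(bool(whole.fullmatch("FIRST")))            # whole pattern ignores case
print(bool(span.fullmatch("firstSECONDthird")))  # only the middle part ignores case
print(bool(span.fullmatch("FIRSTsecondthird")))  # the outer parts are still case-sensitive
```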
polygenelubricants
See this in action on http://www.rubular.com/r/GmO5BYgxG8
polygenelubricants
A simpler example: http://www.rubular.com/r/3JOVzH387p
polygenelubricants