I was practicing regular expressions and tried to write a regex that matches "cay", "cabby", and "catty". I think this is correct:

ca(([bt])\1*)?y 

but when I try it in RegexBuddy, it only matches "cay". Can anyone find the problem?

thanks, Mishal

+3  A: 

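You must count parentheses correctly. Groups are numbered by their opening parenthesis, so ([bt]) is group 2, not group 1. \1 refers to the enclosing group, which is still being matched at that point: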

ca(([bt])\2)?y 

would match cay, cabby, and catty.
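
As a quick check of the group numbering, here is a minimal sketch using Python's re module (an assumption; the question used RegexBuddy, and other regex flavors may differ in details). The helper name `matches` and the extra test words "caby" and "cabty" are made up for illustration.

```python
import re

# Groups are numbered by the position of their opening parenthesis:
#   group 1 = (([bt])\2)   the whole optional part
#   group 2 = ([bt])       the single letter that must be doubled
pattern = re.compile(r"ca(([bt])\2)?y")

def matches(word):
    # fullmatch requires the entire word to fit the pattern
    return pattern.fullmatch(word) is not None

results = {w: matches(w) for w in ["cay", "cabby", "catty", "caby", "cabty"]}
print(results)
```

The three target words should match, while "caby" (single letter) and "cabty" (mixed letters) should not.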

The simpler:

ca(bb|tt)?y

would match the same three words by listing the doubled letters explicitly.


PS: I thought quantifying back-references (as in \2*) was not possible, but in fact it is. If you want to match any number of only "t" or only "b", the following works:

ca(([bt])\2*)?y 

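matches cay, caby, cabby, cabbbbbbbbbbbbbbbbbbbby, catttty, etc. If "cay" does not need to match, it can be shortened to:

ca([bt])\1*y 

but this is not equivalent: [bt] is no longer optional, so at least one "b" or "t" is required. An outer optional group is only redundant when its entire content is already optional, as in (x*)?.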
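
To compare the two patterns above side by side, here is a small sketch assuming Python's re module (other flavors may behave differently). The variable names `optional` and `required` and the word list are chosen only for illustration.

```python
import re

# Letter run wrapped in an optional group: "cay" is allowed
optional = re.compile(r"ca(([bt])\2*)?y")
# No optional group: at least one b or t must appear
required = re.compile(r"ca([bt])\1*y")

words = ["cay", "caby", "cabby", "catttty", "cabty"]
table = {
    w: (optional.fullmatch(w) is not None, required.fullmatch(w) is not None)
    for w in words
}

for word, (opt, req) in table.items():
    print(f"{word:8} optional={opt} required={req}")
```

The only word on which the two patterns disagree is "cay", which has no "b" or "t" at all.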

Tomalak
thanks, that works :)
mishal153
+1  A: 

This should do the trick without backreferences:

ca(?:bb|tt)?y
Robert Koritnik
+1  A: 

With a non-capturing group:

ca(?:bb|tt)?y

or, more simply, with a plain capturing group:

ca(bb|tt)?y
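
The two variants accept the same words and differ only in what they capture. A minimal sketch of that difference, assuming Python's re module (the variable names are illustrative):

```python
import re

capturing = re.compile(r"ca(bb|tt)?y")
non_capturing = re.compile(r"ca(?:bb|tt)?y")

words = ["cay", "cabby", "catty", "cabty"]

# Both patterns accept or reject exactly the same words
same_verdict = all(
    (capturing.fullmatch(w) is None) == (non_capturing.fullmatch(w) is None)
    for w in words
)

# Only the capturing version records which alternative was used
captured = capturing.fullmatch("cabby").groups()
not_captured = non_capturing.fullmatch("cabby").groups()
print(same_verdict, captured, not_captured)
```

Use the non-capturing form when the doubled letters are not needed afterwards; it also keeps the numbering of any other groups unchanged.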
jigfox
This would catch BB, BT, TB and TT as well. Not just BB and TT.
Robert Koritnik
Thanks for pointing it out. I've changed my regex.
jigfox