Is there a way to get a single regex to satisfy this condition?

I am looking for a "word" that has three letters from the set MBDPI, in any order, but MUST contain an I.

i.e.

re.match("[MBDPI]{3}", foo) and "I" in foo

This gives the correct result (in Python, using the re module), but can I get it from a single regex?

>>> for foo in ("MBI", "MIB", "BIM", "BMI", "IBM", "IMB", "MBD"):
...     print foo,
...     print re.match("[MBDPI]{3}", foo) and "I" in foo
MBI True
MIB True
BIM True
BMI True
IBM True
IMB True
MBD False

With regex, I know I can use | as a boolean OR operator, but is there a boolean AND equivalent?

Or maybe I need a lookahead or lookbehind?

+3  A: 

An "or" (alternation) is about the only thing you can do:

\b(I[MBDPI]{2}|[MBDPI]I[MBDPI]|[MBDPI]{2}I)\b

The \b escape matches a zero-width word boundary. This ensures you match a word that is exactly three characters long.
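
As a quick check of the alternation pattern above, here is a minimal Python sketch using the standard re module. The helper name has_i_alternation and the sample words are assumptions for illustration, not part of the original answer.

```python
import re

# Alternation: the I sits in the first, second, or third position.
ALTERNATION = re.compile(r"\b(I[MBDPI]{2}|[MBDPI]I[MBDPI]|[MBDPI]{2}I)\b")

def has_i_alternation(word):
    """Return True if word is exactly three letters from MBDPI and contains an I."""
    # fullmatch requires the whole string to match, not just a prefix.
    return ALTERNATION.fullmatch(word) is not None

for word in ("MBI", "MIB", "BIM", "BMI", "IBM", "IMB", "MBD"):
    print(word, has_i_alternation(word))
```

Using fullmatch here is a design choice for testing single words; to find such words inside longer text, search or findall with the same pattern would rely on the \b checks instead.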

Otherwise you're running into the limits of what plain regex syntax can express, since it has no direct AND operator.

An alternative is to match:

\b[MBDPI]{3}\b

capture the match and then check it for an I.
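
The two-step alternative might look something like the following in Python. This is only a sketch: the function name find_words_with_i and the sample text are hypothetical.

```python
import re

# Step 1: any three-letter word built from the allowed letters.
THREE_LETTERS = re.compile(r"\b[MBDPI]{3}\b")

def find_words_with_i(text):
    """Find three-letter MBDPI words in text, keeping only those with an I."""
    # Step 2: filter the matches with a plain substring test.
    return [word for word in THREE_LETTERS.findall(text) if "I" in word]

print(find_words_with_i("MBI MBD IBM MBIP"))  # ['MBI', 'IBM']
```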

Edit: for the sake of having a complete answer, I'll adapt Jens' answer, which uses the technique from "Testing The Same Part of a String for More Than One Requirement":

\b(?=[MBDPI]{3}\b)\w*I\w*

The word boundary checks ensure the match is only three characters long.
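
To see how the lookahead version behaves on running text, here is a small Python sketch. The function name find_with_lookahead and the sample string are assumptions made for the example.

```python
import re

# The lookahead checks "three allowed letters, then a word boundary"
# without consuming anything; \w*I\w* then requires an I in that word.
LOOKAHEAD = re.compile(r"\b(?=[MBDPI]{3}\b)\w*I\w*")

def find_with_lookahead(text):
    """Return every three-letter MBDPI word in text that contains an I."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return LOOKAHEAD.findall(text)

print(find_with_lookahead("MBI MBD IBM MBIP"))  # ['MBI', 'IBM']
```

Both conditions are tested from the same starting position, which is what makes the lookahead act like a boolean AND.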

This is a somewhat more advanced solution that applies in more situations, but I'd generally favour what's easier to read (the "or" version, imho).

cletus
+3  A: 

You can fake boolean AND by using lookaheads. According to http://www.regular-expressions.info/lookaround2.html, this will work for your case:

"\b(?=[MBDPI]{3}\b)\w*I\w*"
Jens
It probably needs word boundary checks on it but otherwise +1, clever solution.
cletus
I'll edit that in...
Jens
great link, thanks.
+3  A: 

You could use a lookahead to see if an I is present:

(?=[MBDPI]{0,2}I)[MBDPI]{3}
Bart Kiers
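
For comparison with the loop in the question, the lookahead pattern from the last answer can be tried the same way. This is a sketch only, and the name PATTERN is an assumption. Note that the pattern has no end anchor, so re.match also accepts longer input that merely starts with a valid word.

```python
import re

# Lookahead: an I must appear within the first three letters.
PATTERN = re.compile(r"(?=[MBDPI]{0,2}I)[MBDPI]{3}")

for foo in ("MBI", "MIB", "BIM", "BMI", "IBM", "IMB", "MBD"):
    print(foo, PATTERN.match(foo) is not None)

# match only anchors at the start, so "MBIX" is accepted;
# fullmatch requires the whole string to match and rejects it.
print(PATTERN.match("MBIX") is not None)      # True
print(PATTERN.fullmatch("MBIX") is not None)  # False
```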