How do I write a regex to match any string that doesn't meet a particular pattern? I'm faced with a situation where I have to match an (A and ~B) pattern.
You could use a look-ahead assertion:
(?!999)\d{3}
This example matches three digits other than 999.
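As a minimal sketch of how this look-ahead behaves, here it is in Python's re module, which supports look-ahead. The names NOT_999 and is_three_digits_not_999 are made up for illustration, and using fullmatch assumes you want to test the whole string rather than search inside it.

```python
import re

# Negative look-ahead: fail at this position if "999" follows,
# otherwise consume exactly three digits.
NOT_999 = re.compile(r"(?!999)\d{3}")


def is_three_digits_not_999(s: str) -> bool:
    # fullmatch requires the pattern to cover the entire string.
    return NOT_999.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ["123", "998", "999", "99", "1234"]:
        print(candidate, is_three_digits_not_999(candidate))
```

The same shape, (?!B)A, is one way to express "A and not B" in flavors that have look-ahead: the assertion rejects B first, then A does the actual matching.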
But if you happen not to have a regular expression implementation with this feature (see Comparison of Regular Expression Flavors), you probably have to build a regular expression with the basic features on your own.
A compatible regular expression with basic syntax only would be:
[0-8]\d\d|\d[0-8]\d|\d\d[0-8]
This also matches any three-digit sequence that is not 999.
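If you want to convince yourself that the basic-syntax version agrees with the look-ahead version, a brute-force check over all 1000 three-digit strings is cheap. This is only a sketch in Python; patterns_agree is a hypothetical helper name, and both patterns are tested against the whole string.

```python
import re

LOOKAHEAD = re.compile(r"(?!999)\d{3}")
BASIC = re.compile(r"[0-8]\d\d|\d[0-8]\d|\d\d[0-8]")


def patterns_agree() -> bool:
    # Compare both patterns on every string from "000" to "999".
    for n in range(1000):
        s = f"{n:03d}"
        lookahead_hit = LOOKAHEAD.fullmatch(s) is not None
        basic_hit = BASIC.fullmatch(s) is not None
        if lookahead_hit != basic_hit:
            return False
    return True


if __name__ == "__main__":
    print(patterns_agree())
```

The basic version works because a three-digit string differs from 999 exactly when at least one of its digits is in 0-8, and each alternative covers one of those positions.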
A[^B]
matches a literal A followed by any single character that is not B,
so the following match:
AC
AD
AF
and these don't:
AB
Q
It works because [^B] is a negated character class: it matches any single character that is not B. That covers [AC-Za-z0-9] and, I believe, every other ASCII character as well, including punctuation and whitespace.
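A quick way to check the cases listed above, sketched in Python (matches_a_not_b is a name invented for this example, and the pattern is tested against the whole string):

```python
import re

A_NOT_B = re.compile(r"A[^B]")


def matches_a_not_b(s: str) -> bool:
    # True only when the whole string is "A" plus one non-B character.
    return A_NOT_B.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ["AC", "AD", "AF", "A!", "AB", "Q", "A"]:
        print(repr(candidate), matches_a_not_b(candidate))
```

Note that a lone "A" does not match: [^B] still has to consume one character, so this pattern means "A followed by something other than B", not "A not followed by B".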
Match against the pattern and use the host language to invert the boolean result of the match. This will be much more legible and maintainable.
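For instance, with Python as the host language, an (A and ~B) test could look roughly like this. The function name and the two example patterns are assumptions made for illustration, and the sketch assumes both patterns are meant to match the whole string.

```python
import re


def matches_a_and_not_b(s: str, a_pattern: str, b_pattern: str) -> bool:
    # Keep each regex simple and do the "and not" in ordinary code.
    is_a = re.fullmatch(a_pattern, s) is not None
    is_b = re.fullmatch(b_pattern, s) is not None
    return is_a and not is_b


if __name__ == "__main__":
    # A = "three digits", B = "999"
    for candidate in ["123", "999", "12"]:
        print(candidate, matches_a_and_not_b(candidate, r"\d{3}", r"999"))
```

Each pattern stays readable on its own, and the negation is an ordinary boolean that the next maintainer cannot misread.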
This seems like a fairly basic question from a Formal Languages or Theoretical Computer Science class. I'm assuming it is not homework, based on your reputation and previous answers, so I'll answer it.
The complement of a regular language is also a regular language, but to construct it you have to build the DFA for the regular language and invert its outcome: every input the original accepts must end in an error, and every input it rejects must be accepted. See this for an example. What the page doesn't say is that it converted /(ac|bd)/ into /(a[^c]?|b[^d]?|[^ab])/. The conversion from a DFA back to a regular expression is not trivial. It is easier if you can use the regular expression unchanged and change the semantics in code, as suggested earlier.
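To make the DFA idea concrete, here is a small hand-built sketch in Python. It is not taken from the linked page. It assumes the alphabet is just a, b, c, d, and the state names are invented. The usual textbook construction is used: complete the DFA with a dead state, then swap accepting and non-accepting states.

```python
# Complete DFA for the language {"ac", "bd"} over the alphabet {a, b, c, d}.
ALPHABET = "abcd"
DEAD = "dead"  # sink state for every transition not listed below
TRANSITIONS = {
    ("start", "a"): "seen_a",
    ("start", "b"): "seen_b",
    ("seen_a", "c"): "done",
    ("seen_b", "d"): "done",
}
STATES = {"start", "seen_a", "seen_b", "done", DEAD}
ACCEPTING = {"done"}


def run(s: str, accepting: set) -> bool:
    state = "start"
    for ch in s:
        if ch not in ALPHABET:
            raise ValueError(f"symbol {ch!r} is outside the assumed alphabet")
        # Missing transitions (including any from "done" or DEAD) go to DEAD.
        state = TRANSITIONS.get((state, ch), DEAD)
    return state in accepting


def accepts(s: str) -> bool:
    return run(s, ACCEPTING)


def accepts_complement(s: str) -> bool:
    # Same machine, with accepting and non-accepting states swapped.
    return run(s, STATES - ACCEPTING)


if __name__ == "__main__":
    for candidate in ["ac", "bd", "", "a", "ad", "acd"]:
        print(repr(candidate), accepts(candidate), accepts_complement(candidate))
```

Complementing the machine is the easy half. Turning the complemented DFA back into a regular expression is the part that gets messy, which is why inverting the match result in code is usually the better trade.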