answers:

4

I need a regex, in PHP (PCRE) syntax, that matches every URL except those from example.com and example2.com.

Here is what I have so far, but it doesn't work:

$patternToMatch = "@https?://[^(example.com|example2.com)]\"*@i";
+2  A: 

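Don't use regular expressions where you don't need them.

$parts = parse_url($url);
// parse_url() returns false for badly malformed URLs and may omit the 'host' key
$host = ($parts && isset($parts['host'])) ? strtolower($parts['host']) : null;
if ($host !== null && $host != 'example.com' && $host != 'example2.com') {
    // the URL seems OK
}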
Lukáš Lalinský
Lukas, I am trying to extract URLs from a text document. I do not have the URLs on hand, so I do need a regex.
darkAsPitch
+1  A: 

The problem here is that within a character class ([...]), special characters such as ( and | lose their special meaning.

A better solution is to test each URL against example.com or example2.com and proceed only when that test fails, i.e. when there is no match.
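
A rough sketch of that two-step idea in PHP: pull out every URL with a permissive pattern first, then keep only the ones whose host is not excluded. The helper name `extractAllowedUrls`, the simple URL pattern, and treating `www.example.com` as the same host as `example.com` are all assumptions made for illustration, not something from the question.

```php
<?php
// Hypothetical helper: extract URLs from text, skipping excluded hosts.
function extractAllowedUrls(string $text, array $excludedHosts): array
{
    // Step 1: a deliberately simple URL pattern (assumption): the scheme,
    // then everything up to whitespace, a quote, or an angle bracket.
    preg_match_all('@https?://[^\s"\'<>]+@i', $text, $matches);

    $allowed = [];
    foreach ($matches[0] as $url) {
        $host = parse_url($url, PHP_URL_HOST);
        if (!is_string($host)) {
            continue; // malformed URL or no host component, skip it
        }
        // Normalise: lower-case and strip a leading "www." (assumption).
        $host = preg_replace('/^www\./', '', strtolower($host));
        // Step 2: the negative test, keep only non-excluded hosts.
        if (!in_array($host, $excludedHosts, true)) {
            $allowed[] = $url;
        }
    }
    return $allowed;
}

$text = 'See http://example.com/a, https://www.example2.com/b and http://php.net/docs "now".';
print_r(extractAllowedUrls($text, ['example.com', 'example2.com']));
```

Doing the exclusion with `parse_url()` and `in_array()` keeps the regex trivial, and the list of excluded hosts can grow without touching the pattern.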

Segfault
Thanks Segfault, looks like I have to remove all example.com urls, then search for any remaining urls, right? Thanks again!
darkAsPitch
+1  A: 

No, everything between square brackets will match just one character. For example the regex:

[^example]

will match any single character other than e, x, a, m, p and l (the repeated e adds nothing).
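
A quick way to see this for yourself (the test strings are just made-up examples):

```php
<?php
// [^example] is a negated character class: it matches ONE character
// that is not e, x, a, m, p or l. It never looks at whole words.
$pattern = '/[^example]/';

var_dump(preg_match($pattern, 'example'));  // int(0): every character is in the set
var_dump(preg_match($pattern, 'examples')); // int(1): "s" is outside the set
var_dump(preg_match($pattern, 'apple'));    // int(0): unrelated word, still no match
```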

Try negative lookahead:

@https?://(?!(?:www\.)?example2?\.com)@i
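
A fuller sketch of the lookahead idea, usable with preg_match_all() to extract URLs from text. It places the lookahead directly after `://` so the optional `www.` cannot be skipped by backtracking, and it escapes the dot before `com`. The trailing `(?![\w.-])` guard and the `[^\s"'<>]+` URL body are my own assumptions about what counts as a host boundary and a URL, so adjust them to your data.

```php
<?php
// Assumed pattern: the negative lookahead rejects example.com and
// example2.com (with or without "www."); the inner (?![\w.-]) makes sure
// the host really ends after ".com", so example.com.evil.org still matches.
$pattern = '@https?://(?!(?:www\.)?example2?\.com(?![\w.-]))[^\s"\'<>]+@i';

$text = 'http://example.com/x http://www.example2.com http://example.org/y https://php.net';
preg_match_all($pattern, $text, $matches);
print_r($matches[0]); // the two non-excluded URLs
```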
Bart Kiers
A: 

Hi, you almost had the answer. This will do the matching that you want.

$patternToMatch = "@https?://(example\.com|example2\.com)@i";
aberpaul
Thanks aberpaul, but that looks like it would ONLY find example.com and example2.com - I want everything BUT example.com and example2.com
darkAsPitch
I am assuming (maybe wrongly) that you would be able to use PHP to check whether the match returned true or false. I can see Segfault thought along these lines.
aberpaul