ansaurus

Question

Help with negative lookbehind in regular expressions

Answer 1

+1 A:

A couple of thoughts:

Do you need to escape the . in the regex? I don't know the <! syntax and don't have my books to hand so this may be a moot point.
I don't see how it would match http://www.foo.com/something as there is no / after the www.foo.com in your example.
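On the first point, a quick check suggests the unescaped . does matter, at least in principle: in a .NET pattern it matches any character, so the pattern accepts more than the literal domain. A minimal sketch, assuming the domain is held in a plain string (the variable names and the lookalike value are illustrative, not taken from the question); Regex.Escape makes the string safe to use as a literal:

```csharp
using System;
using System.Text.RegularExpressions;

class EscapeDemo
{
    static void Main()
    {
        string domain = "www.foo.com";     // illustrative value
        string lookalike = "wwwXfooXcom";  // clearly not the real domain

        // Unescaped: each '.' matches any single character.
        Console.WriteLine(Regex.IsMatch(lookalike, domain));               // True

        // Escaped: '.' is treated as a literal dot.
        Console.WriteLine(Regex.IsMatch(lookalike, Regex.Escape(domain))); // False
        Console.WriteLine(Regex.IsMatch(domain, Regex.Escape(domain)));    // True
    }
}
```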

Hope some of that is of help.

DeletedAccount 2009-07-15 21:49:12

Answer 2

A:

I would try this

content = Regex.Replace(content,"(?<!" + newDomain + ")^[^/]+/(?=" + match + ")", newDomain + match);

This will match (and thus replace the domain part of the expression) only if the domain is not newDomain and the path is match.

Simeon Pilgrim 2009-07-15 22:38:54

Why the down vote? If it does not solve the problem, please explain why. We are not psychic debuggers.

Simeon Pilgrim 2009-07-16 21:58:46

Answer 3

A:

Maybe I'm missing something, but should you be using negative lookbehinds at all? A lookbehind, by nature, does not consume anything: it is a zero-width assertion, so the text it checks never becomes part of the match. You, on the other hand, want to match the domain and the path, and then replace the domain. Right?

So it should be something more like this:

Regex.Replace("http://www.foo.com/something", @"(http://www\.foo\.com/)(something)", "http://www.abc.com/$2")

The idea is to use grouping to your advantage. The $2 in the replacement string refers to the second group of the match (the path) and appends it to the new domain. I tested this in Regex Hero (a .NET regex tester) and it works. By the way, The Regex Coach is Perl-based, so you may run into some differences when comparing it to the .NET regex engine.
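The same grouping idea can be sketched with variables instead of hard-coded strings. This is only an illustration under assumptions: content, oldDomain, newDomain and path are placeholder names and values, and the named group path is a stylistic choice in place of the numbered $2. Regex.Escape keeps characters such as . from being read as regex syntax.

```csharp
using System;
using System.Text.RegularExpressions;

class GroupReplaceDemo
{
    static void Main()
    {
        // All values are illustrative; the question's real inputs are not shown here.
        string content   = "http://www.foo.com/something";
        string oldDomain = "http://www.foo.com/";
        string newDomain = "http://www.abc.com/";
        string path      = "something";

        // Escape both parts so '.' is matched literally,
        // and capture the path in a named group.
        string pattern = Regex.Escape(oldDomain) + "(?<path>" + Regex.Escape(path) + ")";

        // ${path} in the replacement re-inserts the captured path.
        string result = Regex.Replace(content, pattern, newDomain + "${path}");

        Console.WriteLine(result); // http://www.abc.com/something
    }
}
```

Because the domain is part of the match here, it is actually replaced, which is the difference from the lookbehind approach.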

Steve Wortham 2009-07-15 22:52:59

Answer 4

+1 A:

I will try a third angle.

I think you are confusing the fact that your regex "matches" something in Regex Coach with it matching the part you want. That is why you are surprised by the replace results.

The replace swaps all of the matched input for the new token.

The negative lookbehind makes sure its pattern is not present, but that pattern is not part of the matched input.

So only the path (your match string) of your URL is the matched input, and you are replacing it with the newDomain variable. That is why you are getting the results you are getting.
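A small sketch of this behaviour, assuming a pattern of roughly the shape described (the question's actual pattern is not reproduced here, so the strings below are illustrative): the lookbehind only checks what precedes the path, and Match.Value shows that the path alone is what gets matched and replaced.

```csharp
using System;
using System.Text.RegularExpressions;

class LookbehindDemo
{
    static void Main()
    {
        // Illustrative values, not the original poster's code.
        string input = "http://www.foo.com/something";
        string pattern = @"(?<!http://www\.abc\.com/)something";

        // The lookbehind is zero-width, so the match is just the path.
        Match m = Regex.Match(input, pattern);
        Console.WriteLine(m.Value); // something

        // Replace swaps only the matched text; the old domain stays put.
        Console.WriteLine(Regex.Replace(input, pattern, "http://www.abc.com/"));
        // http://www.foo.com/http://www.abc.com/
    }
}
```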

Simeon Pilgrim 2009-07-16 22:07:00
