ansaurus

Question

RegExp: want to find all links that do not end in ".html"

Answer 1

+1 A:

Edit: Notepad++ uses the SciTE regular expression engine, which does not support lookaround expressions.

For more info take a look here http://www.scintilla.org/SciTERegEx.html

Original Answer

~~^.*(?<!\.html)$~~
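For reference, the struck-out pattern does behave as intended in an engine that supports lookbehind. A minimal sketch using Python's `re` module (not Notepad++), assuming one link per line; the sample input is made up for illustration:

```python
import re

# The struck-out pattern, unchanged. re.MULTILINE makes ^ and $ match per line.
NOT_HTML_LINE = re.compile(r'^.*(?<!\.html)$', re.MULTILINE)

# Hypothetical sample input: one link per line.
sample = "index.html\nabout.php\nimages/logo.png\ndocs/guide.html"

# With no capturing groups, findall returns the full text of each matching line.
non_html_lines = NOT_HTML_LINE.findall(sample)
print(non_html_lines)  # ['about.php', 'images/logo.png']
```

The negative lookbehind is checked at the end of each line, so any line whose last five characters are ".html" is rejected.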

S.Mark 2010-03-25 11:12:06

Answer 2

+1 A:

That regular expression would work fine if you were using Perl or PCRE (e.g. preg_match in PHP). However, lookahead and lookbehind assertions are not supported by most regular expression engines, especially the simpler ones such as the one used by Notepad++. Only the most basic syntax, such as quantifiers, subpatterns and character classes, is supported by almost all regular expression engines.

You can find the documentation for the Notepad++ regular expression engine at: http://notepad-plus.sourceforge.net/uk/regExpList.php
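To illustrate what an engine with lookbehind support makes possible, here is a small sketch in Python, whose `re` module supports fixed-width lookbehind. The pattern and the sample markup are assumptions for illustration, not something taken from the question, and it only handles double-quoted href values:

```python
import re

# Capture the href value, then assert that the five characters just before
# the closing quote are not ".html".
LINK_NOT_HTML = re.compile(r'href="([^"]*)(?<!\.html)"')

# Hypothetical sample markup.
page = '<a href="a.html">A</a> <a href="b.php">B</a> <a href="c/">C</a>'

links = LINK_NOT_HTML.findall(page)
print(links)  # ['b.php', 'c/']
```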

Rithiur 2010-03-25 11:19:44

Answer 3

+1 A:

You can make a regexp that does it, but it would probably be too complex:

href=\"((([^"]*)([^h"][^"][^"][^"]|[^t"][^"][^"]|[^m"][^"]|[^l"]))|([^"]|)([^"]|)([^"]|))\"
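One way to see how a lookaround-free pattern of this kind is put together is to generate it. The sketch below is an illustration rather than the pattern above: `not_ending_with` is a hypothetical helper, it checks the full ".html" suffix including the dot, and it is run with Python's `re` rather than Notepad++, so whether a given editor accepts the alternation it relies on is not verified here.

```python
import re

def not_ending_with(suffix: str) -> str:
    """Build a pattern (no lookaround) for a run of non-quote characters
    that does not end with `suffix`."""
    n = len(suffix)
    any_char = '[^"]'
    alternatives = []
    for i, ch in enumerate(suffix):
        # The character at position i of the final n characters differs
        # from the suffix; the remaining n-1-i characters may be anything.
        alternatives.append(f'[^{re.escape(ch)}"]' + any_char * (n - 1 - i))
    long_form = any_char + '*(?:' + '|'.join(alternatives) + ')'
    # Values shorter than the suffix can never end with it.
    short_form = (any_char + '?') * (n - 1)
    return f'(?:{long_form}|{short_form})'

LINK = re.compile('href="(' + not_ending_with('.html') + ')"')

# Hypothetical sample markup.
page = ('<a href="a.html">A</a> <a href="b.php">B</a> '
        '<a href="c">C</a> <a href="">E</a> <a href="d.xhtml">D</a>')

found = LINK.findall(page)
print(found)  # ['b.php', 'c', '', 'd.xhtml']
```

The pattern grows with the length of the suffix, which is why this approach gets unwieldy quickly.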

jpalecek 2010-03-25 12:08:25

Answer 4

A:

Thank you all very much.

In the end, the regular expression indeed did not work.

I simply used a workaround: I replaced all links with themselves+".html", then replaced all occurrences of ".html.html" with ".html".

So I replaced href=\"([^"]*)\" with href="\1.html" and then .html.html with .html
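The same two-step workaround can be sketched outside the editor, for example in Python. The function name `force_html_suffix` and the sample markup are made up for illustration, and the sketch assumes double-quoted href values as in the replacement above:

```python
import re

def force_html_suffix(markup: str) -> str:
    # Step 1: append ".html" to every double-quoted href value.
    step1 = re.sub(r'href="([^"]*)"', r'href="\1.html"', markup)
    # Step 2: collapse the doubled suffix. A literal replace is used so
    # the dots are not treated as regex wildcards.
    return step1.replace('.html.html', '.html')

# Hypothetical sample markup.
page = '<a href="a.html">A</a> <a href="b">B</a>'

fixed = force_html_suffix(page)
print(fixed)  # <a href="a.html">A</a> <a href="b.html">B</a>
```

Note that step 2 also rewrites any ".html.html" that occurs outside an href, so this is only safe when the text contains no such string elsewhere.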

Thanks anyway, grovel

grovel 2010-03-25 13:12:40
