This is about the code editor Notepad++.

I'm looking for a regular expression that will solve the following problem:

I have a set of HTML files. I need to find all links in them that do not point to JavaScript functions. If I search for the string 'href="' I get 342 results, and if I search for 'href="javascript' I get 301 results. I'd like to get at the 41 links that are only in the first set, that is, all links that are not JavaScript function calls.

I'd be grateful if anyone more familiar with regular expressions than I currently am could help me out on this one.

A: 

I don't know exactly which regular expression engine Notepad++ uses, but in an engine that supports negative lookahead (Perl-style) the expression would look like:

href="(?:(?!javascript).)
KARASZI István
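
For anyone who can run the search outside the editor, here is a minimal sketch of the same negative-lookahead idea in Python's `re` module. The sample markup and the variable names are made up for illustration, and the pattern is a slightly simplified variant that also captures the URL up to the closing quote.

```python
import re

# Hypothetical sample markup, invented for this illustration only.
html = '''
<a href="javascript:openWindow()">popup</a>
<a href="index.html">home</a>
<a href="http://example.com/">external</a>
<a href="javascript:void(0)">noop</a>
'''

# Match href=" only when the text right after the quote does not
# begin with "javascript"; the group captures the URL itself.
pattern = re.compile(r'href="(?!javascript)([^"]*)"')

links = pattern.findall(html)
print(links)
```

On this sample, only the two non-JavaScript links are returned.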
There does not seem to be a '?' in Notepad++.
dude
I looked here: http://sourceforge.net/apps/mediawiki/notepad-plus/index.php?title=Regular_Expressions
dude
The regexp href="(javascript) finds all the JavaScript function calls, but I have not managed to negate the (javascript) so far.
dude
That's sad. Could you use a different/better editor?
KARASZI István
Sure could. Any recommendations?
dude
gvim, for example. It has a Windows compatibility mode that helps beginners.
KARASZI István
+1  A: 

This will match URLs that don't start with "j", which will probably work for you.

href="[^j]
LatinSuD
Thank you, this worked. It would exclude any relative links starting with a 'j', but since I knew I was looking for 41 (from the searches described above), I know I got them all. So thanks.
dude
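
The caveat in the comment above, and the count check used to confirm the result, can be sketched in Python. The three sample links are invented for illustration; `jobs.html` stands in for a relative link that happens to start with 'j'.

```python
import re

# Hypothetical sample: one JavaScript link, one relative link starting
# with 'j', and one ordinary link.
text = ' '.join([
    '<a href="javascript:go()">a</a>',
    '<a href="jobs.html">b</a>',
    '<a href="index.html">c</a>',
])

total = len(re.findall(r'href="', text))                  # every link
js_only = len(re.findall(r'href="javascript', text))      # JavaScript links
not_j = len(re.findall(r'href="[^j]', text))              # character-class approach
not_js = len(re.findall(r'href="(?!javascript)', text))   # lookahead approach

# The character class also skips jobs.html; the lookahead does not.
print(total, js_only, not_j, not_js)
```

If the character-class count equals the total minus the JavaScript count, as it did with 342 - 301 = 41 in the question, no links starting with 'j' were missed. In this made-up sample the counts differ, which flags the skipped `jobs.html`.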
A: 

PowerGREP with RegexBuddy. I use Notepad++ and PowerGREP.

Wes