Hi,

I have a C# regex that gives me all the URIs in a document. It's this:

<a[^>]*\shref=[\""\'][^>]*"

This one works, but I want to exclude all URIs (matches) that contain 'doubleclick.net', because I want to leave those URIs untouched and add some code to the others.

I've tried putting ((?!doubleclick.net).) somewhere in between, as found here: http://bloggingabout.net/blogs/arjen/archive/2008/12/03/regex-exclude-lines-containing-a-specific-word.aspx, but it doesn't work for me.

Michel

+3  A: 

Please don't use regexes to parse HTML!

Grab a copy of the HTML Agility Pack and your life will be much simpler, and your application much less brittle.

jvenema
Hmm, in my case, where the HTML is fairly fixed and known, the regex works. So I appreciate your concern, but it doesn't solve my problem.
Michel
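
For the regex route asked about in the question, one possible approach is to put a negative lookahead directly after the opening quote of the href value, so that the whole tag is rejected when 'doubleclick.net' appears before the closing quote. A likely reason ((?!doubleclick.net).) alone does not help is that, without being anchored like this, the match can simply end just before the unwanted word instead of failing. The following is only a sketch under the question's assumption of fixed, known HTML; the class name LinkFilter, the method FindLinks, and the sample URLs are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LinkFilter
{
    // Same shape as the pattern in the question, plus a negative lookahead
    // placed right after the opening quote of the href value. The lookahead
    // scans up to the next quote (or '>') and fails the match if
    // "doubleclick.net" occurs in that stretch.
    public static readonly Regex NonDoubleClickLink = new Regex(
        @"<a\b[^>]*\shref=[""'](?![^""'>]*doubleclick\.net)[^>]*",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the matched tag fragments as strings.
    public static string[] FindLinks(string html)
    {
        return NonDoubleClickLink.Matches(html)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        // Sample input invented for this sketch.
        string html =
            "<a href=\"http://example.com/page\">one</a> " +
            "<a class='ad' href='http://ad.doubleclick.net/click'>two</a>";

        foreach (string link in FindLinks(html))
        {
            Console.WriteLine(link);
        }
    }
}
```

With the sample input above, only the example.com link is matched and the doubleclick.net link is skipped. Note that the lookahead stops at the first quote of either kind, so a value that contains a quote character before 'doubleclick.net' would slip through; for HTML that is less predictable, a parser such as the HTML Agility Pack remains the safer choice.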