ansaurus

Question

Matching a regex in html, ignoring spaces, and quotation marks

Answer 1

+2 A:

How much of that text is needed to uniquely identify the target? I would try this first:

@"(?is)<center>\s*This\s+page\s+has\s+been\s+visited.*?</center>"

Alan Moore 2009-03-04 01:32:41

you read my mind :) Thank you.

Alex Baranosky 2009-03-04 01:36:16

Would you mind explaining (?is:)?

Alex Baranosky 2009-03-04 01:37:04

Ignore case (i) and single-line mode (s), i.e. don't worry about capitalization, and let the dot match line breaks too.

MarkusQ 2009-03-04 01:38:24

I just realized the colon isn't needed when you use it the way I did, so I removed that. Here's a complete explanation: http://www.regular-expressions.info/modifiers.html

Alan Moore 2009-03-04 01:51:09
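
For anyone else wondering about the modifiers: here is a minimal sketch showing that the inline (?is) prefix and the RegexOptions flags give the same result. The sample HTML and the class name are assumptions made up for illustration; the question's actual page may look different.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VisitCounterMatch
{
    // Pattern from the answer: (?is) turns on IgnoreCase and Singleline inline.
    public const string InlinePattern =
        @"(?is)<center>\s*This\s+page\s+has\s+been\s+visited.*?</center>";

    // Same pattern with the modifiers passed as RegexOptions instead.
    public static readonly Regex WithOptions = new Regex(
        @"<center>\s*This\s+page\s+has\s+been\s+visited.*?</center>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical sample input: mixed case, extra spaces, and a line break.
    public const string SampleHtml =
        "<p>Intro</p>\n<CENTER>  This page   has been\nvisited <img src=counter.gif> times</CENTER>\n<p>Outro</p>";

    public static void Main()
    {
        Match inline = Regex.Match(SampleHtml, InlinePattern);
        Match options = WithOptions.Match(SampleHtml);
        Console.WriteLine(inline.Success);
        Console.WriteLine(inline.Value == options.Value);
    }
}
```

Because \s also matches line breaks, the words can be split across lines in the source; Singleline only matters for the .*? part.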

Are you 100% sure this regex will work? It isn't finding any matches, or I am messing something up :)

Alex Baranosky 2009-03-04 01:57:09

Oh, I hadn't seen the above comment :)

Alex Baranosky 2009-03-04 01:57:53

It works like a charm... so far :) Thanks a lot!

Alex Baranosky 2009-03-04 01:58:58

Answer 2

+1 A:

It really depends on how simple you can make the regex and still match the desired elements.

<center>[^<]+<img[^>]+>[^>]+</center>

Use the case-insensitive flag too (I don't know what C# uses). If you need something more specific because an img tag can sit inside center tags that should not match, then you can start hardcoding phrases as the other answer does.

qpingu 2009-03-04 02:04:44
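
In C#, the case-insensitive flag is RegexOptions.IgnoreCase. A minimal sketch of the pattern above with that flag follows; the class name, the helper method, and the sample inputs are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CenterImageMatch
{
    // Pattern from the answer above, unchanged; IgnoreCase is the C# case-insensitive flag.
    public static readonly Regex CenterWithImage = new Regex(
        @"<center>[^<]+<img[^>]+>[^>]+</center>",
        RegexOptions.IgnoreCase);

    public static bool Matches(string html) => CenterWithImage.IsMatch(html);

    public static void Main()
    {
        // Hypothetical inputs: text before and after the image, then no text before it.
        Console.WriteLine(Matches("<CENTER>Visited <IMG SRC=\"counter.gif\"> times</CENTER>"));
        Console.WriteLine(Matches("<center><img src=counter.gif> times</center>"));
    }
}
```

Note that [^<]+ and [^>]+ each need at least one character, so this pattern only matches when there is some text both before and after the image.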

Answer 3

A:

In C# you could simply use this, assuming that originalHtml contains your whole HTML file.

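using System.Text.RegularExpressions;

// Delete every <center>...<img ...>...</center> block, plus the whitespace around it.
// Note: [^""] (i.e. [^"]) requires the character right after src= not to be a double quote.
string result = Regex.Replace(originalHtml,
                       @"(\s*<center>[^<]*<img src=[^""].*?>.*?</center>\s*)",
                       "",
                       RegexOptions.Singleline | RegexOptions.IgnoreCase);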

Regex.Replace removes every occurrence of the pattern from the original HTML and returns the modified string.

Renaud Bompuis 2009-03-04 02:43:45
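
Since the question asks about ignoring spaces and quotation marks, here is a rough sketch of one possible variation. It is not the pattern from the answer above: the optional quote, the extra \s* parts, the class and method names, and the sample inputs are all assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CenterBlockRemover
{
    // Variation (an assumption, not the answer's pattern): \s* tolerates spaces
    // around '=', and [""']? accepts an optional opening quote before the src value.
    private static readonly Regex CenterImageBlock = new Regex(
        @"\s*<center>[^<]*<img\s+src\s*=\s*[""']?[^>]*>.*?</center>\s*",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Returns the HTML with every matching <center> block removed.
    public static string Strip(string html) => CenterImageBlock.Replace(html, "");

    public static void Main()
    {
        // Hypothetical inputs: quoted src, then unquoted src with spaces around '='.
        Console.WriteLine(Strip("<p>a</p>\n<center>Hits: <img src=\"c.gif\"> so far</center>\n<p>b</p>"));
        Console.WriteLine(Strip("<p>a</p><CENTER>Hits: <IMG SRC = c.gif></CENTER><p>b</p>"));
    }
}
```

A center block without an img tag is left alone, because the pattern requires the image before the closing tag.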

Answer 4

A:

You ought to try RegexBuddy (not free, but inexpensive); this tool has saved me a lot of time.

Hope this helps.

labilbe 2009-03-04 04:26:37
