ansaurus

Question

What regex would match a nested table with identifiable text in the table cell?

Answer 1

+1 A:

Don't use a regex. Use an HTML parser!

That said, if you must use one and the tables are not nested, this works in Perl:

$xml =~ /<table>.*<td>Code2<\/td>.*<\/table>/s;

tster 2009-10-01 17:22:32

Don't use an XML parser, use an **HTML** parser!

Peter Boughton 2009-10-01 17:25:52

(unless of course you can be certain the content is valid XHTML)

Peter Boughton 2009-10-01 17:27:31

Thanks, edited the answer.

tster 2009-10-01 17:28:15

Answer 2

+5 A:

I wouldn't use a regexp for this: HTML isn't a regular language, and there is no end of edge cases to trip you up. You're better off using an HTML parser. Whichever language or platform you're using, there'll be one available.
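As a rough illustration of the parser approach, here is a minimal sketch in Python using the standard library's `html.parser`. The class name `CellTableFinder`, the sample markup, and the cell text `Code2` are assumptions made for the example, and it assumes every `<td>` and `<table>` is explicitly closed; real-world HTML may call for a more forgiving parser.

```python
from html.parser import HTMLParser


class CellTableFinder(HTMLParser):
    """Finds the innermost <table> that directly contains a cell with the wanted text.

    Hypothetical helper for illustration; assumes explicitly closed tags.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.tables = []    # stack of open tables, innermost last
        self.cells = []     # stack of text buffers for open <td> elements
        self.result = None  # set when the first matching table closes

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append(
                {"depth": len(self.tables) + 1, "cells": [], "hit": False}
            )
        elif tag == "td":
            self.cells.append([])

    def handle_data(self, data):
        # Text belongs to the innermost open cell, if there is one.
        if self.cells:
            self.cells[-1].append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self.cells:
            text = "".join(self.cells.pop()).strip()
            if self.tables:
                table = self.tables[-1]
                table["cells"].append(text)
                if text == self.wanted:
                    table["hit"] = True
        elif tag == "table" and self.tables:
            table = self.tables.pop()
            if table["hit"] and self.result is None:
                self.result = table


# Hypothetical sample: an outer table whose second cell holds a nested table.
SAMPLE = """<table>
<tr><td>Code1</td></tr>
<tr><td>
<table>
<tr><td>Code2</td></tr>
</table>
</td></tr>
</table>"""

finder = CellTableFinder("Code2")
finder.feed(SAMPLE)
finder.close()
found = finder.result
print(found["depth"], found["cells"])
```

Because the parser tracks nesting with a stack, the hit is recorded against the inner table rather than the outer one, which is exactly the distinction a plain regex struggles to make.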

Brian Agnew 2009-10-01 17:22:57

Answer 3

+2 A:

The following regex will find your table:

(?ms)<table>((?!<table>).)*<td>Code2</td>.*?</table>

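With (?ms) you turn on multiline mode (m), in which ^ and $ match at line boundaries, and dot-all mode (s), in which the dot also matches newlines. Only (s) matters here, since the pattern contains no anchors. The negative lookahead (?!<table>) is checked before each character consumed, so no second table start tag can appear between the opening <table> and the Code2 cell. The match therefore begins at the innermost table that contains the cell.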
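To see the lookahead at work, here is a small sketch using Python's `re` module, which accepts the same pattern unchanged. The sample markup is an assumption made for the example. Note that the pattern only recognises a bare `<table>` tag without attributes, and the lazy `.*?</table>` stops at the first closing tag it meets.

```python
import re

# The pattern from the answer, unchanged; the leading (?ms) sets the flags inline.
PATTERN = re.compile(r"(?ms)<table>((?!<table>).)*<td>Code2</td>.*?</table>")

# Hypothetical sample: an outer table whose second cell holds a nested table.
SAMPLE = """<table>
<tr><td>Code1</td></tr>
<tr><td>
<table>
<tr><td>Code2</td></tr>
</table>
</td></tr>
</table>"""

match = PATTERN.search(SAMPLE)
# The attempt starting at the outer <table> fails, because the lookahead
# refuses to step over the inner <table> on the way to the Code2 cell.
inner = match.group(0) if match else None
print(inner)
```

On this sample the match covers only the inner table, from its `<table>` tag to its own `</table>`.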

tangens 2009-10-01 19:53:35
