ansaurus

Question

Regular expression to match closing HTML tags

Answer 1

+1 A:

Don't use regex to parse HTML. It will only give you headaches.

Use an XML parser instead. Try BeautifulSoup or lxml.

NullUserException 2010-08-19 17:17:36

I've seen BeautifulSoup but I'm also a minimalist, so I've preferred using only what ships with Python. I think my issue here is enough to make me reconsider it. Thanks!

kevin628 2010-08-19 17:26:51

Answer 2

+6 A:

Read:
- RegEx match open tags except XHTML self-contained tags
- Can you provide some examples of why it is hard to parse XML and HTML with a regex?
Repent.
Use a real HTML parser, like BeautifulSoup.

delnan 2010-08-19 17:19:10

Answer 3

A:

<TAG\b[^>]*>(.*?)</TAG>

Matches the opening and closing pair of a specific HTML tag.

<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)</\1>

Will match the opening and closing pair of any HTML tag.

See here.

pavanlimo 2010-08-19 17:21:18

ansaurus

tags:

views:

answers:

Regular expression to match closing HTML tags

related questions