Possible Duplicate:
RegEx match open tags except XHTML self-contained tags

How can I find a tag with a regular expression?

I have HTML like this:

<div class='head'> Article TExt <div> <tag> Some ather text </tag>

What should the regular expression look like to find the text inside DIV tags with class 'head'?

+1  A: 

If you want to search in HTML, wouldn't it be better to use something easier than a regexp? For example, PyQuery (http://pypi.python.org/pypi/pyquery).

gruszczy
PyQuery is just a wrapper for `lxml`.
katrielalex
+6  A: 

Oh my lord, another one.

Don't parse HTML with regex!

There are many libraries specifically designed to parse HTML, e.g. BeautifulSoup or lxml. Regular expressions cannot keep track of arbitrarily nested tags, so they break on real-world HTML.

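import BeautifulSoup
# `document` is the HTML source as a string
soup = BeautifulSoup.BeautifulSoup(document)
# `class` is a reserved word in Python, so pass the attribute in a dict
soup.findAll("div", {"class": "head"})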
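
If installing a third-party library is not an option, the same idea can be roughly sketched with the standard library's `html.parser` module. This is only an illustration under some assumptions: it targets Python 3, the names `HeadDivText` and `head_div_text` are made up for this example, and the sample string closes the div with `</div>`, which the HTML in the question does not.

```python
from html.parser import HTMLParser


class HeadDivText(HTMLParser):
    """Collect text found inside <div class="head"> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many <div> levels deep we are inside a matching div
        self.chunks = []  # text fragments collected so far

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        # Start counting at a div.head; keep counting for divs nested inside it.
        if self.depth or "head" in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that appears inside a matching div.
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def head_div_text(html):
    """Return the text inside every div with class 'head', joined by spaces."""
    parser = HeadDivText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Assumption: the div is properly closed here, unlike in the question.
sample = "<div class='head'> Article TExt </div> <tag> Some ather text </tag>"
print(head_div_text(sample))  # prints: Article TExt
```

If the div is never closed, everything after it counts as being inside it, so the text of the following tags is collected as well.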
katrielalex
+1  A: 

This would be trivial with PyQuery:

>>> from pyquery import PyQuery as pq
>>> html = pq("""<div class='head'> Article TExt <div> <tag> Some ather text </tag>""")
>>> print html('div.head').text()
Article TExt Some ather text
jathanism
And how do I get the text only from the div tags? Is there a standard function for that?
Pol
Oh, sorry, the closing div in my example is missing its '/'. Everything is OK. Thanks! :)
Pol