I'd like to match the contents of each paragraph in an HTML document using a Python regular expression. Most of these paragraphs have BR tags inside them, like so:
<p class="thisClass">this is nice <br /><br /> isn't it?</p>
I'm currently using this pattern:
pattern = re.compile(r'<p class="thisClass">(.*?)</p>')
Then I'm using:
pattern.findall(html)
to find all the matches. However, it matches only two of the 28 paragraphs I have. It looks like that's because those two don't have BR tags inside them and the rest do. What am I doing wrong, and how can I fix it? Thanks!
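A minimal sketch for narrowing this down, under an assumption the post does not confirm: that in the real HTML a line break follows the BR tags, which the one-line sample above would hide. By default `.` does not match a newline, so a paragraph that spans several lines would be skipped, while `re.DOTALL` lets `.` match newlines too. The `html` string below is a made-up sample, not the actual page.

```python
import re

# Hypothetical sample: the second paragraph has a newline after its BR tags.
html = (
    '<p class="thisClass">no breaks here</p>\n'
    '<p class="thisClass">this is nice <br /><br />\n isn\'t it?</p>\n'
)

# Same pattern as in the question: "." stops at newlines.
default_pattern = re.compile(r'<p class="thisClass">(.*?)</p>')

# Same pattern with re.DOTALL: "." also matches newlines.
dotall_pattern = re.compile(r'<p class="thisClass">(.*?)</p>', re.DOTALL)

default_matches = default_pattern.findall(html)
dotall_matches = dotall_pattern.findall(html)

print(len(default_matches), default_matches)
print(len(dotall_matches), dotall_matches)
```

If both patterns return the same count on the real page, newlines are probably not the cause, and the difference may lie elsewhere, for example in the attributes of the `<p>` tags.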