ansaurus

Question

Answer 1

+1 A:

When extracting information from HTML, it isn't recommended to just hack some regexes together. The right way to do it is to use a proper HTML parsing module. Python has several good modules for this purpose - in particular I recommend BeautifulSoup.

Don't be put off by the name - it's a serious module used by a lot of people with great success. The documentation page has a lot of examples that should help you get started with your particular needs.

Eli Bendersky 2010-09-25 05:43:24

Answer 2

+2 A:

Why don't you try using BeautifulSoup?

http://www.crummy.com/software/BeautifulSoup/

Example code:

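from BeautifulSoup import BeautifulSoup

# htmldoc is assumed to hold the HTML source as a string
soup = BeautifulSoup(htmldoc)
# 'class' is a reserved word in Python, so it cannot be used as a
# keyword argument; pass it in an attribute dictionary instead
allSpans = soup.findAll('span', {'class': 'type'})
for element in allSpans:
    ....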
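
For readers who want something runnable without installing a third-party package, here is a minimal sketch of the same idea (let a parser find the tags instead of a regex) using the standard library's html.parser module rather than BeautifulSoup. The class name SpanTypeCollector, the helper extract_type_spans, and the sample markup are all made up for illustration; only the "span" tag and the "type" class come from the example above.

```python
from html.parser import HTMLParser


class SpanTypeCollector(HTMLParser):
    """Collect the text inside every <span> whose class list contains "type"."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # > 0 while we are inside a matching span
        self.results = []   # one text entry per matching span

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        # A class attribute may hold several space-separated names.
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or "type" in classes:
            if self.depth == 0:
                self.results.append("")  # start a new entry
            self.depth += 1              # also track nested spans

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data


def extract_type_spans(html):
    """Return the stripped text of each <span class="type"> in html."""
    parser = SpanTypeCollector()
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.results]


# Hypothetical sample markup, not taken from the original question.
sample = (
    '<p><span class="type">int</span> x; '
    '<span class="name">y</span> '
    '<span class="type big">float</span></p>'
)
print(extract_type_spans(sample))  # ['int', 'float']
```

This is far more verbose than the BeautifulSoup version, which is a fair argument for using BeautifulSoup when installing it is an option.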

pyfunc 2010-09-25 05:47:15

Python regular expression slicing
