I'm working on a project that involves converting a large amount of HTML content to plain text. I have a custom-written module that does the job OK, but I'm wondering if there are any standard tools to help get the job done.
A:
Two Python libraries that do HTML parsing:
There are lots of opinions (search Google) about which of these is better for which job.
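For a sense of what the conversion involves, here is a minimal sketch that uses only `html.parser` from the Python standard library. This is not necessarily either of the libraries above; the names `TextExtractor` and `html_to_text` are made up for illustration. It keeps text nodes, drops the contents of `<script>` and `<style>`, and collapses whitespace.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes, skipping script/style bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace (including newlines) into single spaces.
        return " ".join("".join(self._chunks).split())


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


if __name__ == "__main__":
    print(html_to_text("<p>Hello <b>world</b></p><script>var x = 1;</script>"))
```

Note that this sketch does not turn block-level tags like `<p>` or `<br>` into line breaks, so adjacent paragraphs run together unless the markup has whitespace between them. A dedicated parsing library will probably cope better with messy real-world HTML.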
tcarobruce
2009-11-03 15:39:30