I'm working on a project that involves converting a large amount of HTML content to plain text. I have a custom-written module that does the job OK, but I'm wondering if there are any standard tools that would help get the job done.

+1  A: 

Html2Text seems to be a good option.

Chris Ballance
+3  A: 

Two Python libraries which do HTML parsing:

There are lots of opinions (search Google) about which of these is better for which job.
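
For a rough idea of what parser-based conversion looks like, here is a minimal sketch that uses only the standard library's `html.parser` rather than either third-party library. The names `TextExtractor` and `html_to_text` and the set of tags treated as line breaks are my own assumptions for illustration; a real converter would probably need to handle links, tables, and list markers as well.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    # Hypothetical choices: which tags to drop and which to treat as line breaks.
    SKIPPED = {"script", "style"}
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in self.BLOCK:
            self._chunks.append("\n")

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        raw = "".join(self._chunks)
        # Collapse runs of whitespace within each line and drop empty lines.
        lines = (" ".join(line.split()) for line in raw.splitlines())
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()
```

Calling `html_to_text(page_source)` on a string of markup returns the visible text with one line per block element. For messy real-world HTML, a dedicated library is likely to be more forgiving than this sketch.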

tcarobruce