Possible Duplicate:
Which CPAN module would you recommend for turning HTML into plain text?
Question:
- Is there a module that renders HTML to text, honoring font-style tags such as <tt>, <b>, <i>, etc. and line breaks (<br>), similar to what Lynx does?
For example:
# cat test.html
<body>
<div id="foo" class="blah">
<tt>test<br>
<b>test</b><br>
whatever<br>
test</tt>
</div>
</body>
# lynx.exe --dump test.html
test
test
whatever
test
Note: the second line of the output should be rendered in bold.
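To make the desired behavior concrete, here is a minimal sketch of the kind of rendering I mean. It assumes the CPAN module HTML::Parser is installed and that the terminal understands ANSI escape sequences. The helper name render_html and the tag-to-escape table are hypothetical, made up for illustration. It is not a full renderer: it only handles <b>, <i> and <br>, and treats <tt> as a no-op because terminal output is already monospaced.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical mapping from font-style tags to ANSI on/off sequences.
my %ansi = (
    b => [ "\e[1m", "\e[22m" ],   # bold on / normal intensity
    i => [ "\e[3m", "\e[23m" ],   # italic on / italic off
);

# render_html is an illustrative helper, not part of any module.
sub render_html {
    my ($html) = @_;
    my $out = '';

    my $parser = HTML::Parser->new(
        api_version   => 3,
        unbroken_text => 1,    # deliver each text run as a single chunk
        start_h => [ sub {
            my ($tag) = @_;
            if    ( $tag eq 'br' ) { $out .= "\n" }
            elsif ( $ansi{$tag} )  { $out .= $ansi{$tag}[0] }
        }, 'tagname' ],
        end_h => [ sub {
            my ($tag) = @_;
            $out .= $ansi{$tag}[1] if $ansi{$tag};
        }, 'tagname' ],
        text_h => [ sub {
            my ($text) = @_;
            $text =~ s/\s+/ /g;    # collapse source whitespace, as browsers do
            $out .= $text;
        }, 'dtext' ],
    );

    $parser->parse($html);
    $parser->eof;

    $out =~ s/[ \t]*\n[ \t]*/\n/g;    # strip stray spaces around line breaks
    $out =~ s/^\s+|\s+$//g;           # trim leading and trailing whitespace
    return $out;
}

# Same markup as test.html above.
my $html = <<'HTML';
<body>
<div id="foo" class="blah">
<tt>test<br>
<b>test</b><br>
whatever<br>
test</tt>
</div>
</body>
HTML

print render_html($html), "\n";
```

On an ANSI-capable terminal this should print the same four lines as the Lynx dump, with the second one in bold. What I am looking for is a module that already does this properly, rather than a hand-rolled parser like the above.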