When I parse HTML, I want to obtain only the innermost tags in the entire document, that is, the elements that contain no other elements. My goal is to extract data from the HTML document semantically.
So if I have some HTML like this:
<html>
<table>
<tr><td>X</td></tr>
<tr><td>Y</td></tr>
</table>
</html>
I want only <td>X</td>
and <td>Y</td>
as the result. Is this possible using Beautiful Soup or lxml?
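
To make the goal concrete, here is a rough sketch of the behavior I am after. It uses the standard library's xml.etree.ElementTree, which only works here because the sample happens to be well-formed; real-world HTML would presumably need a more tolerant parser such as Beautiful Soup or lxml. The names SAMPLE and leaf_elements are my own and purely illustrative.

```python
import xml.etree.ElementTree as ET

# The sample document from above (well-formed, so an XML parser accepts it).
SAMPLE = """<html>
<table>
<tr><td>X</td></tr>
<tr><td>Y</td></tr>
</table>
</html>"""


def leaf_elements(markup):
    """Return every element that has no child elements (hypothetical helper)."""
    root = ET.fromstring(markup)
    # len(element) counts direct child elements; 0 means the element is innermost.
    return [el for el in root.iter() if len(el) == 0]


leaves = leaf_elements(SAMPLE)
for el in leaves:
    # Serialize each leaf back to markup.
    print(ET.tostring(el, encoding="unicode").strip())
# prints:
# <td>X</td>
# <td>Y</td>
```

In other words, "innermost" here means an element with zero child elements, regardless of how deep it sits in the tree.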