I have a complex HTML DOM tree of the following form:

<table>
    ...
    <tr>
        <td>
            ...
        </td>
        <td>
            <table>
                <tr>
                    <td>
                        <!-- innermost table -->
                        <table>
                            ...
                        </table>

                        <h2>This is hell!</h2>
                    </td>
                </tr>
            </table>
        </td>
    </tr>
</table>

I have some logic to find the innermost table. Once I have found it, I need to get its next sibling element (the h2). Is there any way to do this?

+2  A: 

If tag is the innermost table, then

tag.findNextSibling('h2')

will be

<h2>This is hell!</h2>

To get the literal next sibling, you could use tag.nextSibling, which in this case is the whitespace string u'\n'.

If you want the next sibling that is not a NavigableString (such as u'\n'), then you could use

tag.findNextSibling(text=None)

If you want the second sibling (no matter what it is), you could use

tag.nextSibling.nextSibling

(but note that if tag does not have a next sibling, then tag.nextSibling will be None, and tag.nextSibling.nextSibling will raise an AttributeError.)
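
As a rough illustration of the calls above, here is a minimal sketch. It assumes Beautiful Soup 4, where the methods are spelled next_sibling and find_next_sibling (the answer uses the older BS3 names nextSibling and findNextSibling), and the standard-library "html.parser" backend. The markup is a trimmed stand-in for the question's tree, soup.find_all("table")[-1] stands in for the asker's own lookup logic, and next_tag_sibling is a hypothetical helper, not part of the library.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Trimmed stand-in for the markup in the question.
html = """
<table><tr><td>
  <table><tr><td>inner</td></tr></table>

  <h2>This is hell!</h2>
</td></tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# Placeholder for the asker's own "find the innermost table" logic:
# here the innermost table is simply the last one in document order.
inner = soup.find_all("table")[-1]

# The literal next sibling is the whitespace between </table> and <h2>.
print(repr(inner.next_sibling))

# With no arguments, find_next_sibling() skips strings and returns the next tag.
print(inner.find_next_sibling())


def next_tag_sibling(node):
    """Return the next sibling that is a Tag, or None if there is none.

    Hypothetical helper: walks next_sibling one step at a time, so it never
    dereferences None the way node.next_sibling.next_sibling can.
    """
    sibling = node.next_sibling
    while sibling is not None and not isinstance(sibling, Tag):
        sibling = sibling.next_sibling
    return sibling


h2 = next_tag_sibling(inner)
print(h2.name)               # h2
print(next_tag_sibling(h2))  # None: only whitespace follows the <h2>
```

The explicit loop is mainly useful when you want to be sure a missing sibling gives None instead of an AttributeError; for the common case, find_next_sibling() alone should be enough.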

unutbu
I didn't mean to find 'h2' specifically; it could be anything. How do I get whatever comes next?
deostroll
`tag.findNextSibling(text!=u'\n')` is not valid Python. You might have meant `tag.findNextSibling(text=lambda x: not x.isspace())`.
Max Shawabkeh
@Max: Thanks for pointing out my error. `not x.isspace()` unfortunately doesn't work because the `text` keyword argument only applies to NavigableStrings, which the `<h2>...</h2>` tag is not. So, I edited my answer to suggest `text=None` which skips all NavigableStrings.
unutbu
A: 

Every tag object has a [nextSibling][1] attribute that's exactly what you're looking for -- the next sibling (or None for a tag that's the last child of its parent tag, of course).

[1]: http://www.crummy.com/software/BeautifulSoup/documentation.html#nextSibling and previousSibling

Alex Martelli