This is the script I have:

import BeautifulSoup

if __name__ == "__main__":
    data = """
    <root>
        <obj id="3"/>
        <obj id="5"/>
        <obj id="3"/>
    </root>
    """
    soup = BeautifulSoup.BeautifulStoneSoup(data)
    print soup

When run, this prints:

<root>
  <obj id="3"></obj>
  <obj id="5"></obj>
  <obj id="3"></obj>
</root>

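I'd like the output to keep the same structure as the input, with self-closing tags such as <obj id="3"/> instead of expanded <obj id="3"></obj> pairs. How can I do that?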

+7  A: 

From the Beautiful Soup documentation:

The most common shortcoming of BeautifulStoneSoup is that it doesn't know about self-closing tags. HTML has a fixed set of self-closing tags, but with XML it depends on what the DTD says. You can tell BeautifulStoneSoup that certain tags are self-closing by passing in their names as the selfClosingTags argument to the constructor.
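Applied to the script in the question, that would presumably mean constructing the soup as BeautifulSoup.BeautifulStoneSoup(data, selfClosingTags=['obj']). Beautiful Soup 3 only runs on Python 2, so for comparison here is a minimal sketch that reaches the same goal with the standard library's xml.etree.ElementTree on Python 3. This is a swapped-in technique, not what the documentation quote describes, and it assumes the input is well-formed XML:

```python
import xml.etree.ElementTree as ET

# Sample document from the question.
data = """
<root>
    <obj id="3"/>
    <obj id="5"/>
    <obj id="3"/>
</root>
"""

# Parse the XML (strip the surrounding blank lines first).
root = ET.fromstring(data.strip())

# ElementTree writes elements with no children and no text in short
# (self-closing) form by default (short_empty_elements=True).
# encoding="unicode" makes tostring() return a str instead of bytes.
output = ET.tostring(root, encoding="unicode")
print(output)
```

Note that ElementTree puts a space before the slash (<obj id="3" />), which is equivalent XML but not byte-for-byte identical to the input.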

Ben James
Thanks! Just found it too :)
Geo