I am writing an HTML document with BeautifulSoup, and I would like it not to split inline text (such as the text inside a <p> tag) across multiple lines. The problem is that parsing <p>a<span>b</span>c</p> and then calling prettify gives me the output

<p>
 a
 <span>
  b
 </span>
 c
</p>

and now the rendered HTML shows spaces between a, b, and c, which I do not want. How do I avoid this?

+1  A: 

How about not using prettify at all?

import BeautifulSoup
BeautifulSoup.BeautifulSoup('<p>a<span>b</span>c</p>').renderContents()

outputs the original HTML with no extra spaces. You can use a tool such as Firebug to inspect the document's structure later, with no need to 'prettify' it at construction time.

Michał Marczyk
A: 

I'd just do:

from BeautifulSoup import BeautifulSoup

ht = '<p>a<span>b</span>c</p>'
soup = BeautifulSoup(ht)
print soup

and avoid getting any extra whitespace. prettify's job is exactly to adjust whitespace to clearly show the HTML parse tree's structure, after all...!

Alex Martelli
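
To see why the prettified markup renders differently, here is a minimal sketch that uses only the Python 3 standard library, not BeautifulSoup. It gathers the text nodes of a fragment and collapses each run of whitespace into one space, which is only a rough approximation of how a browser treats inline text. The names TextCollector and visible_text are hypothetical helpers made up for this illustration.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the text nodes of a fragment and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(html):
    """Join the text nodes and collapse whitespace runs into single spaces.

    Assumption: this only approximates browser rendering of inline content.
    """
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return re.sub(r"\s+", " ", "".join(collector.chunks)).strip()


compact = "<p>a<span>b</span>c</p>"
prettified = "<p>\n a\n <span>\n  b\n </span>\n c\n</p>"

print(visible_text(compact))     # abc
print(visible_text(prettified))  # a b c
```

The newlines and indentation that prettify adds become whitespace inside the text nodes, so the inline content is no longer contiguous. Serializing the tree without prettify, as both answers suggest, leaves the text nodes untouched.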