I am parsing a WordPress post detail page with BeautifulSoup:
content = soup.body.find('div', id=re.compile('post'))
title = content.h2.extract()
item['title'] = unicode(title.string)
item['content'] = u''.join(map(unicode, content.contents))
I want to omit the enclosing div tag when assigning item['content']. Is there any way to render all the child tags of a tag in unicode? Something like:
item['content'] = content.contents.__unicode__()
that will give me a single unicode string instead of a list.
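
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the setup. It assumes the newer bs4 package on Python 3 (where `unicode` is simply `str`), not the version I am running; the HTML snippet and the `item` dict are made up for illustration. It shows the join workaround next to `decode_contents()`, which in bs4 seems to render a tag's children as one string without the enclosing tag.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical stand-in for a WordPress post detail page.
html = (
    '<html><body><div id="post-1">'
    '<h2>Hello</h2><p>First <b>bold</b> paragraph.</p><p>Second.</p>'
    '</div></body></html>'
)

soup = BeautifulSoup(html, 'html.parser')
content = soup.body.find('div', id=re.compile('post'))

# Pull the heading out of the div so it is not repeated in the body.
title = content.h2.extract()

item = {}  # plain dict standing in for the real item object
item['title'] = str(title.string)

# Workaround from the question: stringify each child and join.
joined = ''.join(map(str, content.contents))

# bs4 alternative: render only the children, without the enclosing div.
item['content'] = content.decode_contents()
```

In this small case both approaches produce the same string, so `decode_contents()` looks like the single-call form I was after, at least under bs4.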