I am using BeautifulSoup in Python and am having trouble replacing some tags. I am finding <div> tags and checking their children. If those children have no children of their own (they contain only a text node, NODE_TYPE = 3), I am replacing them with a <p> tag.

    from BeautifulSoup import Tag, BeautifulSoup

    class bar:

        self.soup = BeautifulSoup(self.input)
        foo()

        def foo(self):
            elements = soup.findAll(True)

            for node in elements:
                # ....other stuff here if not <div> tags.
                if node.name.lower() == "div":
                    if not node.find('a'):
                        newTag = Tag(self.soup, "p")
                        newTag.setString(node.text)
                        node.replaceWith(newTag)
                        nodesToScore.append(newTag)
                    else:
                        for n in node.findAll(True):
                            if n.getString():  # False if has children
                                newTag = Tag(self.soup, "p")
                                newTag.setString(n.text)
                                n.replaceWith(newTag)

I'm getting an AttributeError:

      File "file.py", line 125, in function
        node.replaceWith(newTag)
      File "BeautifulSoup.py", line 131, in replaceWith
        myIndex = self.parent.index(self)
    AttributeError: 'NoneType' object has no attribute 'index'

I do the same replacement on node higher up in the for loop and it works correctly. I assume the problem comes from the additional iteration through node as n.
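
To make that guess concrete, here is a minimal sketch of what I suspect is happening. It uses a hypothetical `Node` class as a toy stand-in for a tag, not BeautifulSoup itself, and the `find_all` and `replace_with` names are my own. The assumption is that the list of elements is a snapshot taken before any replacement, so a nested <div> that the inner loop has already replaced is still in that list, but no longer has a parent.

```python
class Node(object):
    """Hypothetical stand-in for a tag; not BeautifulSoup's own class."""

    def __init__(self, name, children=None):
        self.name = name
        self.parent = None
        self.children = []
        for child in children or []:
            child.parent = self
            self.children.append(child)

    def find_all(self):
        """Return a snapshot list of all descendants in document order."""
        found = []
        for child in self.children:
            found.append(child)
            found.extend(child.find_all())
        return found

    def replace_with(self, new_node):
        """Swap this node for new_node in its parent and detach this node."""
        index = self.parent.children.index(self)  # fails when parent is None
        self.parent.children[index] = new_node
        new_node.parent = self.parent
        self.parent = None


# Toy tree: <root><div><a/><div/></div></root>
inner = Node("div")
outer = Node("div", [Node("a"), inner])
root = Node("root", [outer])

# Snapshot taken before any replacement: [outer, a, inner]
elements = root.find_all()

# Inner loop (the "else" branch) replaces a descendant of the outer div.
inner.replace_with(Node("p"))

# The outer loop later reaches the same node, which is now detached.
stale = elements[2]
failed = False
try:
    stale.replace_with(Node("p"))
except AttributeError:
    failed = True  # parent is None, so the lookup on it raises
```

If this model matches the real behaviour, skipping any node whose parent is None before replacing it might avoid the error, though I have not confirmed that against the library.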
What am I doing wrong, or what would be a better way to do this? Thanks!

PS. I'm using Python 2.5 on Google App Engine and BeautifulSoup 3.0.8.1.