Hello, I'm using BeautifulSoup, the Python module. I need to find every div whose id has the form 'post-#'. For example:

<div id="post-45">...</div>
<div id="post-334">...</div>

How can I filter for these? My attempt with a wildcard returns an empty list:

html = '<div id="post-45">...</div> <div id="post-334">...</div>'
soupHandler = BeautifulSoup(html)
print soupHandler.findAll('div', id='post-*')
> []
+3  A: 

You can pass a function to findAll:

>>> print soupHandler.findAll('div', id=lambda x: x and x.startswith('post-'))
[<div id="post-45">...</div>, <div id="post-334">...</div>]

Or a regular expression:

>>> import re
>>> print soupHandler.findAll('div', id=re.compile('^post-'))
[<div id="post-45">...</div>, <div id="post-334">...</div>]
Mark Byers
AttributeError: 'NoneType' object has no attribute 'startswith'
Ockonal
I've fixed the `AttributeError`.
J.F. Sebastian
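To illustrate why the `x and ...` guard matters: the traceback above suggests the filter function is also called with `None` for tags that have no `id` attribute. The sketch below mimics that with plain Python and no BeautifulSoup at all; the names `has_post_id` and `sample_ids` are made up for this example, and the list of values only stands in for what the library might pass in.

```python
def has_post_id(value):
    """Return True when an id attribute value starts with 'post-'.

    The explicit None check plays the same role as `x and ...` in the
    lambda: a missing id is rejected before .startswith() is called.
    """
    return value is not None and value.startswith('post-')


# Hypothetical attribute values; None stands in for a tag with no id.
sample_ids = ['post-45', None, 'sidebar', 'post-334']

matching = [v for v in sample_ids if has_post_id(v)]
print(matching)

# Without the guard, the None entry reproduces the reported error.
try:
    None.startswith('post-')
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

A named function like this could be passed as the `id=` argument in place of the lambda, which may be easier to read once the condition grows.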
+1  A: 
soupHandler.findAll('div', id=re.compile("^post-$"))

looks right to me.

Auston
Why did you put the `$`? I don't think that will work as the OP intends.
Mark Byers
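A quick way to check the concern about `$` is to try both patterns with the `re` module alone. This is only a sketch: it assumes the pattern is applied to each id with `search`, and the sample ids (including the bare `'post-'` and `'repost-1'`) are invented for the comparison. The stricter `^post-\d+$` pattern is an extra suggestion that assumes the `#` in 'post-#' always stands for digits.

```python
import re

# Made-up id values to compare the patterns against.
sample_ids = ['post-45', 'post-334', 'post-', 'repost-1']

prefix = re.compile(r'^post-')       # id starts with 'post-'
exact = re.compile(r'^post-$')       # id is exactly 'post-'
numbered = re.compile(r'^post-\d+$')  # 'post-' followed only by digits

prefix_hits = [v for v in sample_ids if prefix.search(v)]
exact_hits = [v for v in sample_ids if exact.search(v)]
numbered_hits = [v for v in sample_ids if numbered.search(v)]

print(prefix_hits)    # ['post-45', 'post-334', 'post-']
print(exact_hits)     # ['post-']
print(numbered_hits)  # ['post-45', 'post-334']
```

With `$` placed right after the hyphen, neither `post-45` nor `post-334` matches, so the pattern would miss the divs from the question.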