I want to drill down into my HTML. Specifically, I want to get the first HTML table that comes AFTER a form that looks like:

<form method="POST" action="/parts.html">

..

<table ...>
...

</table>

..

</form>

This table has one <tr> for each product.

My ultimate goal here is to loop through each table row and extract the product name, price, image URL, etc.

What should my strategy be, and which Beautiful Soup methods should I be focusing on?

+1  A: 

Keep reading.

http://www.crummy.com/software/BeautifulSoup/documentation.html#Iterating%20over%20a%20Tag

http://www.crummy.com/software/BeautifulSoup/documentation.html#nextSibling%20and%20previousSibling

S.Lott
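For illustration, here is a minimal sketch of the strategy those two sections point at. It assumes the current bs4 package rather than the Beautiful Soup 3 release the links document, and the sample markup is hypothetical because the real page is elided in the question. find_next() walks forward in document order, so it should pick up the first table whether it sits inside the form or after it.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; the real markup is elided in the question.
html = """
<form method="POST" action="/parts.html">
  <p>Some other form content</p>
  <table>
    <tr><td>Widget</td><td>$9.99</td></tr>
    <tr><td>Gadget</td><td>$19.50</td></tr>
  </table>
</form>
"""

soup = BeautifulSoup(html, "html.parser")

# Locate the form by the attributes shown in the question.
form = soup.find("form", attrs={"method": "POST", "action": "/parts.html"})

# First <table> that follows the form's opening tag in document order.
table = form.find_next("table")

# One <tr> per product; iterate over the rows.
rows = table.find_all("tr")
for row in rows:
    print(len(row.find_all("td")), "cells in this row")
```

If the table is known to be nested inside the form, form.find("table") would be a stricter alternative, since it only searches the form's descendants.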
But to get the actual text from the table cells, I'll probably need some regex, correct?
Blankman
@Blankman: No. Keep reading. The text is an attribute of the node. Beautiful Soup does all the parsing for you. No regex required.
S.Lott
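A short sketch of what "the text is an attribute of the node" can look like in practice, again assuming the bs4 package. The column order (image, name, price) and the sample values are hypothetical, since the question does not show the actual cells.

```python
from bs4 import BeautifulSoup

# Hypothetical product row; the column layout is an assumption.
row_html = """
<table>
  <tr>
    <td><img src="/img/widget.png"></td>
    <td> Widget </td>
    <td>$9.99</td>
  </tr>
</table>
"""

soup = BeautifulSoup(row_html, "html.parser")
row = soup.find("tr")
cells = row.find_all("td")

# Tag attributes are read with dictionary-style access.
image_url = cells[0].find("img")["src"]

# get_text() returns the text inside a tag; strip=True trims whitespace.
name = cells[1].get_text(strip=True)
price = cells[2].get_text(strip=True)

product = {"name": name, "price": price, "image_url": image_url}
print(product)
```

No regular expressions are involved: the parser has already separated tags, attributes, and text.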