ansaurus

Question

Parsing an HTML file with selectorgadget.com

Answer 1

+1 A:

Inspecting the page, I can see that the specifications are placed in a div with the ID pcraSpecs:

<div id="pcraSpecs">
  <script type="text/javascript">...</script>
  <TABLE cellpadding="0" cellspacing="0" class="specification">
    <TR>
      <TD colspan="2" class="title">Model</TD>
    </TR>
    <TR>
      <TD class="name">Brand</TD>
      <TD class="desc"><script type="text/javascript">document.write(neg_specification_newline('Intel'));</script></TD>
    </TR>
    <TR>
      <TD class="name">Processors Type</TD>
      <TD class="desc"><script type="text/javascript">document.write(neg_specification_newline('Desktop'));</script></TD>    
    </TR>
    ...
  </TABLE>
</div>

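The label cells have the class name and the value cells have the class desc. Note that each desc cell holds an inline script calling document.write(...) rather than plain text, so the value has to be read out of the script source.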

What you want to do is extract the contents of this table.

soup.find(id="pcraSpecs").findAll("td") should get you started.
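As a rough illustration of the extraction step, here is a minimal sketch that pairs each name cell with the value quoted inside the following desc cell's script. It is not the BeautifulSoup call above: it uses only the standard library's html.parser and re so that it runs without extra dependencies. The names HTML, VALUE_RE and SpecParser are made up for this example, the markup is the snippet above trimmed to two rows, and the regex assumes single-quoted values with no embedded quotes. It also does not check that the table sits inside the pcraSpecs div.

```python
import re
from html.parser import HTMLParser

# Sample markup copied from the snippet above (trimmed to two rows).
HTML = """
<div id="pcraSpecs">
  <TABLE cellpadding="0" cellspacing="0" class="specification">
    <TR>
      <TD colspan="2" class="title">Model</TD>
    </TR>
    <TR>
      <TD class="name">Brand</TD>
      <TD class="desc"><script type="text/javascript">document.write(neg_specification_newline('Intel'));</script></TD>
    </TR>
    <TR>
      <TD class="name">Processors Type</TD>
      <TD class="desc"><script type="text/javascript">document.write(neg_specification_newline('Desktop'));</script></TD>
    </TR>
  </TABLE>
</div>
"""

# Captures the quoted argument of neg_specification_newline('...').
# Assumption: values are single-quoted and contain no quotes themselves.
VALUE_RE = re.compile(r"neg_specification_newline\('([^']*)'\)")


class SpecParser(HTMLParser):
    """Collects name -> value pairs from td.name / td.desc cells."""

    def __init__(self):
        super().__init__()
        self.cell_class = None    # class of the <td> currently open, if any
        self.pending_name = None  # last label seen, waiting for its value
        self.specs = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TD> arrives as "td".
        if tag == "td":
            self.cell_class = dict(attrs).get("class")

    def handle_endtag(self, tag):
        if tag == "td":
            self.cell_class = None

    def handle_data(self, data):
        if self.cell_class == "name" and data.strip():
            self.pending_name = data.strip()
        elif self.cell_class == "desc" and self.pending_name:
            # The cell's only content is script source; pull the value out of it.
            match = VALUE_RE.search(data)
            if match:
                self.specs[self.pending_name] = match.group(1)
                self.pending_name = None


parser = SpecParser()
parser.feed(HTML)
parser.close()
specs = parser.specs
print(specs)  # {'Brand': 'Intel', 'Processors Type': 'Desktop'}
```

With BeautifulSoup the same regex could presumably be applied to the text of each desc cell returned by findAll("td"); the parsing class here only stands in for that lookup.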

Can Berk Güder 2009-02-26 23:40:38

Answer 2

A:

Have you tried Feedity (http://feedity.com)? It creates a custom RSS feed from any webpage.

2009-02-27 02:58:36
