I'm using PHP Simple HTML DOM Parser to scrape some data from a webshop (running XAMPP 1.7.2 with PHP 5.3.0), and I'm running into problems with the <tbody> tag. The structure of the table is, essentially (details aren't really that important):

<table>
  <thead>
    <!--text here-->
  </thead>
  <tbody>
    <!--text here-->
  </tbody>
</table>

Now, I'm trying to get to the <tbody> section using this code:

$element = $html->find('tbody',0)->innertext;

It doesn't throw any errors, it just prints nothing when I try to echo it. I've tested the code on other elements, <thead>, <table>, even something like <span class="price">, and they all work fine (of course, removing ",0" breaks the code). They all return their correct sections. Ditto for outertext. But it all fails on <tbody>.

Now, I've skimmed through the Parser, but I'm not sure I can figure it out. I've noticed that <thead> isn't even mentioned, but it works fine. shrug

I guess I could try and do child navigation, but that seems to glitch as well. I've just tried running:

$el = $html->find('table',0);
$el2 = $el->children(2);
echo $el2->outertext;

and no dice. Tried replacing children with first_child and 2 with 1, and still no dice. Funny, though, if I try ->find instead of children, it works perfectly.

I'm pretty confident I could find a workaround for the whole thing, but this behaviour seems odd enough to post here. My curious mind is happy for all the help it can get.

A: 

Check whether your tbody is being generated by JavaScript. I was facing the same problem with a span tag. I later found that if any HTML is injected into the page via jQuery or any other JavaScript, simple_html_dom simply fails to find it, because it only sees the markup the server sends and does not execute scripts.

Prabhas Gupte
A: 

In the simple_html_dom.php file, comment out or remove line #396:

// if ($m[1]==='tbody') continue;
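
If editing the library file isn't an option, one possible alternative (not something either answer above suggests) is to read the <tbody> with PHP's built-in DOM extension instead of simple_html_dom. The snippet below is only a minimal sketch: the sample markup and the variable names are made up for illustration, and it uses saveXML() on each child node because passing a node to saveHTML() needs a newer PHP than 5.3.0.

```php
<?php
// Hypothetical sample markup that mirrors the table structure in the question.
$markup = '<table><thead><tr><th>Item</th></tr></thead>'
        . '<tbody><tr><td>Widget</td></tr></tbody></table>';

// Collect libxml warnings internally; real-world HTML is rarely valid.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();

// item(0) is the first <tbody> in the document, or null if there is none.
$tbody = $doc->getElementsByTagName('tbody')->item(0);

// Build the equivalent of "innertext" by serialising each child node.
$inner = '';
if ($tbody !== null) {
    foreach ($tbody->childNodes as $child) {
        $inner .= $doc->saveXML($child);
    }
}

echo $inner;
```

The same caveat as in the first answer applies: this only sees markup that is present in the HTML the server returns, not anything added later by JavaScript.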