ansaurus

Question

simplehtmldom php: How do you search for one thing or another

Answer 1

A:

You could load the DOM into a simplexml class and then use xpath, like so:

$xml = simplexml_import_dom($simple_html_dom);

$goodies = $xml->xpath('//*[@bgcolor = "#ffffff"] | //*[@bgcolor = "#cccccc"]');

You might even be able to put that OR syntax within the same set of brackets, but I'd need to double-check.
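To try both forms side by side, here is a minimal sketch that uses only PHP's bundled DOM and SimpleXML extensions rather than simplehtmldom. The table markup and the variable names ($markup, $union, $single) are made up for illustration; the two bgcolor values are the ones from the question.

```php
<?php
// Hypothetical sample markup; only the two bgcolor values come from the question.
$markup = '<table>'
    . '<tr bgcolor="#ffffff"><td>white row</td></tr>'
    . '<tr bgcolor="#cccccc"><td>grey row</td></tr>'
    . '<tr bgcolor="#ff0000"><td>red row</td></tr>'
    . '</table>';

// Parse with the DOM extension, then hand the root element to SimpleXML.
$dom = new DOMDocument();
$dom->loadHTML($markup);
$xml = simplexml_import_dom($dom->documentElement);

// Form 1: union of two paths. Note the * node test after //.
$union = $xml->xpath('//*[@bgcolor="#ffffff"] | //*[@bgcolor="#cccccc"]');

// Form 2: the same condition written with "or" inside a single predicate.
$single = $xml->xpath('//*[@bgcolor="#ffffff" or @bgcolor="#cccccc"]');

// Print the cell text of each matched row.
foreach ($single as $row) {
    echo $row->td, "\n";
}
```

Both queries should select the white and grey rows and skip the red one, so the "or" form inside one pair of brackets looks like a workable alternative to the union.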

Update:

Sorry, I thought you were talking about the DOM extension. I just looked up simplehtmldom, and it appears that its find feature is loosely based on XPath. Why not just do:

$goodies = $html->find('[bgcolor=#ffffff], [bgcolor=#cccccc]');

Anthony 2009-07-25 01:48:45

I don't understand. What is $simple_html_dom. When do I call the find method and what do I pass in.

2009-07-25 02:00:09

$simple_html_dom would be the variable you had your simplehtmldom object set to, so whatever you were calling the find method on originally. But now that I'm looking at the extension, I'm unsure whether it uses the DOM extension as its foundation, and thus whether my first answer would apply. And you wouldn't call the find method in my original answer; the xpath method does the finding. It passes all the results of the XPath query to the $goodies variable, which you could then traverse, importing each result back as XML or HTML (which I didn't mention, sorry). But I think ...

Anthony 2009-07-25 02:05:35

my second, more informed suggestion should do the trick, unless I'm misunderstanding how simplehtmldom works or what you are looking to do with it.

Anthony 2009-07-25 02:06:09

I am still not getting what I want. Is it possible to get the data between two tags ie <tr> </tr>? How would I do that?

2009-07-25 02:23:33

Do you mean it's not returning the descendants of the <tr> nodes? Try a quick experiment. Create an HTML file called test.html, and inside it put <ul id="mainlist"><li>stuff</li><li><ul id="sub_list"><li>sub-stuff</li><li>more-sub-stuff</li></ul></li></ul> Then do: $html->find('ul[id=mainlist] li'); It should catch the first li with no problem, but if it's not set up to include children in the find results, then it won't show you the contents of the second li, which is another ul. If that's the case, I'll tell you what else I find (still reading up on it).

Anthony 2009-07-25 02:46:08

Here is my code:

<?php
include_once 'simple_html_dom.php';
$url = "test.html";
$html = file_get_html($url);
foreach ($html->find('ul[id=mainlist] li') as $li) {
    echo $li->plaintext . "<br /> \n";
}
?>

Here is what I get:

stuffsub-stuffmore-sub-stuffsub-stuffmore-sub-stuff

2009-07-25 02:58:02

So it is returning the children in my example. Real quick, is all of the data you want inside <td> tags? Could you just use $html->find('[bgcolor=#ffffff] td')?

Anthony 2009-07-25 03:59:20

And while I think this is a neat add-on that you are using, you may want to consider looking into the DOM extension in PHP, which I think it's built on, or at how to use DOM with simplexml. You would get the results you wanted even if the syntax wasn't as clean. And do you want the data in the tr tags, or the HTML?
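For the plain DOM route, here is a rough sketch of pulling out either the text or the HTML that sits between <tr> and </tr>. It assumes the rows are picked by the same two bgcolor values. The sample markup and the $texts and $inners arrays are invented for illustration, and nothing here depends on simplehtmldom.

```php
<?php
// Hypothetical sample markup standing in for the page being scraped.
$markup = '<table>'
    . '<tr bgcolor="#ffffff"><td>first</td><td>second</td></tr>'
    . '<tr bgcolor="#cccccc"><td>third</td></tr>'
    . '</table>';

$dom = new DOMDocument();
$dom->loadHTML($markup);
$xpath = new DOMXPath($dom);

// Select rows carrying either background color.
$rows = $xpath->query('//tr[@bgcolor="#ffffff" or @bgcolor="#cccccc"]');

$texts = [];   // text-only content of each row
$inners = [];  // markup between <tr> and </tr> for each row

foreach ($rows as $tr) {
    // Text only: the concatenated text of all descendant nodes.
    $texts[] = $tr->textContent;

    // Markup: serialize each child of the row and join the pieces.
    $inner = '';
    foreach ($tr->childNodes as $child) {
        $inner .= $dom->saveHTML($child);
    }
    $inners[] = $inner;
}

print_r($texts);
print_r($inners);
```

textContent gives the data with the tags stripped, while serializing the child nodes keeps the <td> markup, so which one to use depends on the answer to the data-or-HTML question.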

Anthony 2009-07-25 04:02:12
