ansaurus

Question

PHP XPATH of HTML document omitting all tags. I want to keep them

Answer 1

A:

Yes, you are right: DOM parses the child elements as element nodes, not strings, so the correct way to get data from them is to iterate over all of them. That is not complicated to implement, though.
You may also want to try a different XPath expression. Instead of

//ul[@id='adPoint1']

try

//ul[@id='adPoint1']/li

which would select the li elements, each of which has an actual string value.
If you also give the expected result (for both the ul and the script), you may get more answers.
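
For illustration, here is a minimal sketch of the "iterate over the children" idea that keeps the tags by serializing each child node instead of reading its string value. The sample markup, the variable names and the innerMarkup() helper are hypothetical, since the original page is not shown, and passing a node to DOMDocument::saveHTML() assumes PHP 5.3.6 or later (on older versions, saveXML($node) is the closest equivalent).

```php
<?php
// Hypothetical sample markup; the real page from the question is not shown.
$html = <<<'HTML'
<html><body>
<ul id="adPoint1"><li><a href="/one">One</a></li><li><b>Two</b></li></ul>
<script language="javascript">var msg = "hello"; document.write(msg);</script>
</body></html>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Hypothetical helper: serialize every child of $node, tags included,
// and concatenate the results into one string.
function innerMarkup(DOMNode $node): string
{
    $out = '';
    foreach ($node->childNodes as $child) {
        $out .= $node->ownerDocument->saveHTML($child);
    }
    return $out;
}

// Markup inside the ul, with the li, a and b tags preserved.
$ul = $xpath->query("//ul[@id='adPoint1']")->item(0);
$ulMarkup = innerMarkup($ul);

// The script body is plain text, so textContent returns the code as written.
$script = $xpath->query("//script[@language='javascript']")->item(0);
$scriptCode = $script->textContent;

echo $ulMarkup, "\n";
echo $scriptCode, "\n";
```

Reading nodeValue or textContent on the ul would return only the concatenated text ("OneTwo" here), which is probably why the tags appeared to be missing.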

phunehehe 2009-10-13 06:39:45

phunehehe, yes, you are correct, but I am looking for a solution that keeps the tags within an element. What I am really trying to get is a string that contains the JavaScript code in its entirety.

m3mbran3 2009-10-13 06:55:21

Answer 2

A:

I decided XPath wasn't suited to what I wanted and am now using PHP Simple HTML DOM Parser, which is much better suited to the task.

It preserves the inner HTML of an element just fine.

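// $this->simpleDom holds the page parsed with PHP Simple HTML DOM Parser.
// find() takes a CSS-style selector; this one matches <script language="javascript">.
foreach ($this->simpleDom->find('script[language=javascript]') as $script) {
  // innertext() returns the element's contents as written; escape it for display.
  echo htmlentities($script->innertext());
}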

m3mbran3 2009-10-13 10:03:07
