ansaurus

Question

Get everything inside/between HTML tags

Answer 1

+1 A:

You might want to use the PHP Simple HTML DOM Parser.

codaddict 2010-02-27 13:13:19

excellent, works great! thx

qxxx 2010-02-27 14:09:12

Answer 2

A:

Use this function:

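// Intended as a class method; drop "public" to use it as a standalone function.
public function innerHTML($DOMnode) {
    // Serialize the node, then strip its own opening and closing tag.
    // An empty element serialized as <tag/> does not match and is returned unchanged.
    return preg_replace(
        '/^<(\w+)\b.*?>(.*)<\/\1>/s',
        '$2',
        $DOMnode->ownerDocument->saveXML($DOMnode)
    );
}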

stillstanding 2010-02-27 13:14:11

IA IA Cthulhu Fhtagn!!! http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html

Gordon 2010-02-27 13:19:06

if you studied the code better, you'd notice you're not parsing the entire HTML page, but only the contents of the DOM node!

stillstanding 2010-02-27 13:27:07

I did study it and found it horrible to convert the DOMNode to a string in order to be able to run a regex on it.

Gordon 2010-02-27 13:32:15

I see no reason why using strings would be less efficient than iterating over nodes and using appendXML and document fragments.

stillstanding 2010-02-27 13:39:04

Because it's like switching from a scalpel to a spoon during surgery. If you are already using the right toolset (DOM), why abandon it when you are halfway through for one that has no clue about nodes and attributes?

Gordon 2010-02-27 13:46:46
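
For comparison, the DOM-only route argued for in the comments above could look roughly like the sketch below. It is not code from the thread: the function name innerXml and the sample markup are made up for illustration, and it assumes the node belongs to a DOMDocument (so ownerDocument is not null). It serializes each child node in turn, so the node's own tag is never produced and no regex is needed.

```php
<?php
// Hypothetical helper (not from the thread): inner markup using DOM calls only.
// Assumes $node is attached to a DOMDocument, i.e. $node->ownerDocument is set.
function innerXml(DOMNode $node): string
{
    $inner = '';
    // Serialize every child; the wrapping tag of $node itself is never emitted.
    foreach ($node->childNodes as $child) {
        $inner .= $node->ownerDocument->saveXML($child);
    }
    return $inner;
}

// Sample input, made up for this sketch.
$doc = new DOMDocument();
$doc->loadXML('<div id="a"><p>Hello <b>world</b></p><br/></div>');

echo innerXml($doc->documentElement), "\n";
// prints: <p>Hello <b>world</b></p><br/>
```

Because each child is serialized on its own, an empty element simply yields an empty string instead of falling through an unmatched pattern.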
