I can't get the data between the tags into arrays:

// Load the HTML string from file and create a SimpleXMLElement
$html_string = file_get_contents("data/csr.html"); /*the string really is in $html_string*/
$root = new SimpleXMLElement($html_string);

The problem starts here, when I try to get the values between the div, h2 and span tags into arrays:

// Fetch all div, h2 and span values
$divArray = $hdlsArray = $dtlsArray = array();
foreach ($root->div as $div) {
    $divArray[] = $div;
    echo "".$div."<br />";
}
foreach ($root->h2 as $h2) {
    $hdlsArray[] = $h2;
    echo "".$h2."<br />";
}
foreach ($root->span as $span) {
    $dtlsArray[] = $span;
    echo "".$span."<br />";
}

The result is a blank page instead of the actual tag data.

+1  A: 

This page says (about SimpleXML) "the only problem with it is that it'll only load valid XML" but may provide a workaround for HTML.

The 'Related Questions' on StackOverflow include this one, but it describes HTML inside valid XML tags.
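
A minimal sketch of one common workaround, assuming the DOM extension works on your setup: let DOMDocument parse the HTML, since it tolerates markup that is not valid XML, then pass the tree to SimpleXML with simplexml_import_dom() and search it with XPath. The sample markup is a hypothetical stand-in for data/csr.html, and this is not necessarily what the linked page does.

```php
<?php
// Hypothetical sample markup standing in for the contents of data/csr.html.
$html_string = '<html><body><div>Intro<h2>Headline</h2><span>Detail</span></div></body></html>';

// DOMDocument accepts HTML that is not well-formed XML.
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html_string);
libxml_clear_errors();

// Convert the parsed tree so the rest of the code can stay SimpleXML-based.
$root = simplexml_import_dom($dom);

// $root->div only matches direct children of the root element (<html>),
// so use XPath with // to find the tags at any depth.
$divArray = $hdlsArray = $dtlsArray = array();
foreach ($root->xpath('//div') as $div) {
    // Casting to string yields the text directly inside the element.
    $divArray[] = trim((string) $div);
}
foreach ($root->xpath('//h2') as $h2) {
    $hdlsArray[] = trim((string) $h2);
}
foreach ($root->xpath('//span') as $span) {
    $dtlsArray[] = trim((string) $span);
}

print_r($divArray);
print_r($hdlsArray);
print_r($dtlsArray);
```

Note that the string cast returns only the text placed directly in each element, not the text of nested tags.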

pavium
This looks like the old code I had before trying SimpleXML. Furthermore, it returns errors (new DOMDocument). I know the errors come from a conflict with Zend extensions, which is why I am using SimpleXML instead. I just need to get the inner data between the tags into an array.
megatr0n
Unfortunately for me, that last link had very little relevance to what I was trying to accomplish here but I really like your spirit.
megatr0n
+2  A: 

As an alternative to SimpleXMLElement, I suggest Simple HTML DOM (online manual). I've used it before and am very satisfied with the results. It lets you use jQuery-like selectors, so fetching all div, h2 and span values is fairly simple.
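
A rough sketch of how that could look, assuming simple_html_dom.php from the Simple HTML DOM project has been downloaded and placed next to the script (the file location is an assumption, and the array names are taken from the question):

```php
<?php
// Assumption: the library file sits in the same directory as this script.
require_once 'simple_html_dom.php';

// Parse the HTML file; file_get_html() returns false if loading fails.
$html = file_get_html('data/csr.html');
if ($html === false) {
    die('Could not load data/csr.html');
}

$divArray = $hdlsArray = $dtlsArray = array();

// find() takes a CSS-style selector and returns all matching elements.
foreach ($html->find('div') as $div) {
    $divArray[] = $div->plaintext;   // text content with tags stripped
}
foreach ($html->find('h2') as $h2) {
    $hdlsArray[] = $h2->plaintext;
}
foreach ($html->find('span') as $span) {
    $dtlsArray[] = $span->plaintext;
}

print_r($divArray);
print_r($hdlsArray);
print_r($dtlsArray);
```

Use the innertext property instead of plaintext if you want to keep the markup nested inside each element.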

Salman A
I didn't really want to go third-party, but right now it seems to be the best alternative. Thanks.
megatr0n
It's open-source!
Salman A