ansaurus

Question

Create array from the contents of <div> tags in php

Answer 1

A:

You probably need to use preg_match_all():

$matches = array();
// The U modifier is dropped: it inverts greediness, which would turn the lazy (.*?) group greedy.
preg_match_all('`<div([^>]*)class="content"([^>]*)>(.*?)</div>`is', $html, $matches, PREG_SET_ORDER);
foreach ($matches as $m) {
  // $m[3] holds the content of <div class="content">
}

thephpdeveloper 2009-10-20 04:30:45

-1 Using regexes to process HTML server-side is an awful suggestion.

cletus 2009-10-20 04:44:15

What happens if the xml contains two spaces between `div` and `class`, or an extra `id` field? I find this solution rather brittle.

xtofl 2009-10-20 04:44:40

It's a good enough solution depending on the task. Converting HTML to XML also has its pitfalls.

serg 2009-10-20 05:58:45

Who said anything about converting HTML to XML? Dealing with an HTML DOM has **way fewer** pitfalls than regexes, which for this task are nothing more than a dirty hack.

cletus 2009-10-20 06:40:53

Answer 2

A:

There's not much you can do short of using string manipulation functions or regular expressions. You can load your HTML as XML using the DOM library and use that to traverse to your div, but that can become cumbersome if you're not careful or if the structure is complex.

http://ca3.php.net/manual/en/book.dom.php

Laurent Bourgault-Roy 2009-10-20 04:36:33

'could', 'cumbersome', ... think positive, man! There's a solution to every problem!

xtofl 2009-10-20 04:45:53

Answer 3

A:

It looks like Kalem13 beat me to it, but I agree. You could use the DOMDocument class. I haven't used it personally, but I think it would work for you. First you instantiate a DOMDocument object, then you load your $html variable using the loadHTML() function. Then you can use the getElementsByTagName() function.
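The steps described above could be sketched roughly as follows. This is only an illustration: the sample `$html` string and the `$contents` array name are made up here, and the `libxml_use_internal_errors()` call is an addition to keep parse warnings quiet on imperfect markup.

```php
<?php
// Hypothetical sample input; the asker's real $html is not shown in the thread.
$html = '<div class="content">First</div>'
      . '<div class="other">Skip</div>'
      . '<div class="content">Second</div>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

// Step 1: instantiate a DOMDocument. Step 2: load the HTML string.
$dom = new DOMDocument();
$dom->loadHTML($html);

// Step 3: walk every <div> and keep the text of those with class="content".
$contents = array();
foreach ($dom->getElementsByTagName('div') as $div) {
    if ($div->getAttribute('class') === 'content') {
        $contents[] = trim($div->textContent);
    }
}

print_r($contents);
```

After the loop, `$contents` is a plain PHP array holding one string per matching div.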

Abinadi 2009-10-20 04:38:12

Answer 4

+2 A:

Assuming this is just a simplified case in the OP and the real situation is more complicated, you'll want to use XPath.

If it's really complex, then you may want to use DOMDocument (with DOMXPath), but here's a simple example using SimpleXML:

$xml = new SimpleXMLElement($html);

$result = $xml->xpath('//div[@class="content"]');

// foreach replaces the original each() loop; each() was removed in PHP 8.
foreach ($result as $node) {
    echo $node, "\n";
}

Since you explicitly asked about creating an array for this, you could use:

$res_Arr = array();
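// foreach replaces the original each() loop; each() was removed in PHP 8.
foreach ($result as $node) {
    $res_Arr[] = (string) $node; // cast to store the text rather than a SimpleXMLElement
}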

and $res_Arr would be an array with the contents you're looking for.

See http://php.net/manual/en/simplexmlelement.xpath.php for PHP SimpleXML XPath info and http://www.w3.org/TR/xpath for the XPath specification.
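For HTML that is not well-formed XML, the DOMDocument (with DOMXPath) route mentioned above might look something like the sketch below. The sample `$html` is invented for illustration, and the XPath expression is a common idiom for matching one class name among several; it is not something taken from the original question.

```php
<?php
// Hypothetical input: an extra id, a second class, and a double space before "class".
$html = '<div id="a" class="content wide">One</div>'
      . '<div class="contents">No</div>'
      . '<div  class="content">Two</div>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

// Match "content" as a whole class token, so "contents" is not picked up.
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " content ")]';

$res_Arr = array();
foreach ($xpath->query($query) as $node) {
    $res_Arr[] = trim($node->textContent);
}

print_r($res_Arr);
```

Because the HTML parser normalises attribute order and whitespace, the extra `id` and the double space do not affect the match.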

Jonathan Fingland 2009-10-20 04:38:58

heck, you can even use an `XSLTransform` to get the output directly! But that, of course, lifts you out of PHP completely...

xtofl 2009-10-20 04:47:24

Answer 5

A:

PHP has several means of processing HTML, including DOMDocument and SimpleXML. See Parse HTML With PHP And DOM. Here is an example:

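$dom = new DOMDocument();
// preserveWhiteSpace must be set before loading to have any effect.
$dom->preserveWhiteSpace = false;
$dom->loadHTML($html);
$divs = $dom->getElementsByTagName('div');
foreach ($divs as $div) {
  $class = $div->getAttribute('class');
  if ($class == 'content') {
    // nodeValue is the text content of the div, with nested tags stripped
    echo $div->nodeValue . "\n";
  }
}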

Technically the class attribute could be multiple classes so you might want to use:

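// Split on any run of whitespace, since class names may be separated by more than one space.
$classes = preg_split('/\s+/', trim($class));
if (in_array('content', $classes)) {
  ...
}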

The SimpleXML/XPath approach is more concise, but if you don't want to go the XPath route (and learn another technology, at least enough to do these sorts of tasks), then the above is a programmatic alternative.

cletus 2009-10-20 04:47:19
