



This is my HTML:

    <h3>test 1</h3>
    <h4>subheading 1</h4>
    <h4>subheading 2</h4>
    <h3>test 2</h3>
    <h4>subheading 3</h4>
    <h3>test 3</h3>

I am trying to build an array of the h3 tags, with the h4 tags nested within them. An example of the array would look like:

    [test 1] => Array
            [0] => subheading 1
            [1] => subheading 2

    [test 2] => Array
            [0] => subheading 3

    [test 3] => Array


I'm happy to use preg_match or DOMDocument. Any ideas?

+5  A: 

With DOMDocument:

  • use XPath "//h3" to find all <h3> elements. These will be the first-level entries in your array
  • for each of them:
    • keep a counter $i (starting from 1!) as part of the loop
    • use XPath "./following::h4[count(preceding::h3) = $i]" to find any subordinate <h4> elements
    • these will be the second-level entries in your array

The XPath expression means "select all following <h4> whose number of preceding <h3> equals $i". For the first <h3> that count is 1, for the second it is 2, and so on.

Be sure to execute the XPath expression in the context of the respective <h3> nodes.
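
The steps above could look roughly like this in PHP. This is only a sketch under a few assumptions: the markup is the sample from the question, the heading texts are unique (they are used as array keys), and no <h3> is nested inside another heading. The variable names ($result, $key, $h4s) are made up for illustration. Since DOMXPath has no variable binding, the counter is inserted into the expression with sprintf().

```php
<?php
// Sample markup from the question (nowdoc, so nothing is interpolated).
$html = <<<'HTML'
<h3>test 1</h3>
<h4>subheading 1</h4>
<h4>subheading 2</h4>
<h3>test 2</h3>
<h4>subheading 3</h4>
<h3>test 3</h3>
HTML;

$doc = new DOMDocument();
// loadHTML() wraps the fragment in <html><body> on its own.
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$result = [];
$i = 0;
// "//h3" returns the <h3> elements in document order.
foreach ($xpath->query('//h3') as $h3) {
    $i++; // 1 for the first <h3>, 2 for the second, and so on

    // Assumption: heading texts are unique, otherwise keys would collide.
    $key = trim($h3->textContent);
    $result[$key] = [];

    // Passing $h3 as the second argument makes it the context node.
    $expr = sprintf('./following::h4[count(preceding::h3) = %d]', $i);
    $h4s = $xpath->query($expr, $h3);

    foreach ($h4s as $h4) {
        $result[$key][] = trim($h4->textContent);
    }
}

print_r($result);
```

With the sample markup, $result should have the keys "test 1", "test 2" and "test 3", where "test 3" maps to an empty array because no <h4> follows it.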
