Let's say I have some code that iterates through an XML file recursively, like this:

$xmlfile = new SimpleXMLElement('http://www.domain.com/file.xml',null,true);
xmlRecurse($xmlfile,0);
function xmlRecurse($xmlObj,$depth) {
  foreach($xmlObj->children() as $child) {
    // (string)$child yields the element's own text content
    echo str_repeat('-',$depth).">".$child->getName().": ".trim((string)$child)."\n";
    foreach($child->attributes() as $k=>$v){
        echo "Attrib".str_repeat('-',$depth).">".$k." = ".$v."\n";
    }
    xmlRecurse($child,$depth+1);
  }
}

How would I calculate the XPath of each node so that I can store it for mapping to other code?

+3  A: 

You can pass xmlRecurse a third parameter, $xpath, holding the XPath of the current node, and append each child's step to it on every iteration:

function xmlRecurse($xmlObj,$depth,$xpath) {
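  $i=0;
  foreach($xmlObj->children() as $child) {
    // XPath positions are 1-based. $i counts all children, so name[$i] is only
    // a valid step when every sibling shares the same name.
    $childPath = $xpath.'/'.$child->getName().'['.(++$i).']';
    echo str_repeat('-',$depth).">".$child->getName().": ".$childPath."\n";
    foreach($child->attributes() as $k=>$v){
        echo "Attrib".str_repeat('-',$depth).">".$k." = ".$v."\n";
    }
    xmlRecurse($child,$depth+1,$childPath);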
  }
}
Ololo
Of course, you can also build the current child's XPath representation from its attributes. In that case, though, you have to store all the XPath strings in an array to make sure you have not added duplicates.
Ololo
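To illustrate the attribute-based idea from the comment above, here is a minimal sketch. The helper names attributeStep and collectPaths are hypothetical, and it assumes attribute values contain no double quotes. It counts how often each generated path occurs, so paths that are not unique can be spotted:

```php
<?php
// Hypothetical helper: one path step built from the element name and its attributes.
function attributeStep(SimpleXMLElement $node): string
{
    $step = $node->getName();
    foreach ($node->attributes() as $k => $v) {
        // Assumption: attribute values contain no double quotes.
        $step .= '[@' . $k . '="' . $v . '"]';
    }
    return $step;
}

// Hypothetical helper: walks the tree and counts every generated path in $seen.
function collectPaths(SimpleXMLElement $node, string $xpath, array &$seen): void
{
    foreach ($node->children() as $child) {
        $path = $xpath . '/' . attributeStep($child);
        $seen[$path] = ($seen[$path] ?? 0) + 1;
        collectPaths($child, $path, $seen);
    }
}

$xml = new SimpleXMLElement('<root><item id="a"/><item id="b"/><item id="b"/></root>');
$seen = array();
collectPaths($xml, '/' . $xml->getName(), $seen);

// Paths generated more than once do not identify a single node.
$duplicates = array_keys(array_filter($seen, function ($n) { return $n > 1; }));
print_r($duplicates);
```

Any path that ends up in $duplicates would still need a positional index to be unambiguous.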
That's true. I wondered if there was something more direct that doesn't rely on passing variables through the recursive function, like a $child->current() or something along those lines.
seengee
+4  A: 

The obvious way to do it is to pass the XPath as a third parameter and build it as you dig deeper. You have to account for siblings having the same name, so while iterating you have to keep track of the number of preceding siblings with the same name as the current child.

Working example:

function xmlRecurse($xmlObj,$depth=0,$xpath=null) {
  if (!isset($xpath)) {
    $xpath='/'.$xmlObj->getName().'/';
  }
  $position = array();

  foreach($xmlObj->children() as $child) {

    $name = $child->getName();
    if(isset($position[$name])) {
      ++$position[$name];
    }
    else {
      $position[$name]=1;
    }
    $path=$xpath.$name.'['.$position[$name].']';

    echo str_repeat('-',$depth).">".$name.": $path\n";
    foreach($child->attributes() as $k=>$v){
        echo "Attrib".str_repeat('-',$depth).">".$k." = ".$v."\n";
    }

    xmlRecurse($child,$depth+1,$path.'/');
  }
}

Be aware, though, that the whole idea of mapping an entire document and storing XPaths along the way seems odd. You might actually be working on the wrong solution to a totally different problem.

Josh Davis
Interesting. What we're actually looking at doing is allowing a user to upload a template XML file, and we need to store a mapping of what each node/attribute maps to within our system. Can you recommend a better approach than XPath? What I really liked about it was the idea of a simple string storing the path.
seengee
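For what it's worth, a stored path string can be resolved again later with SimpleXMLElement::xpath(). The following is only a sketch of that round trip. The template, the $mapping array and the field names in it are made up for illustration:

```php
<?php
// Hypothetical uploaded template.
$template = new SimpleXMLElement(
    '<order><customer><name>Ann</name></customer><item sku="X1">Pen</item></order>'
);

// Hypothetical mapping: stored XPath string => field name in "our system".
$mapping = array(
    '/order/customer[1]/name[1]' => 'customer_name',
    '/order/item[1]/@sku'        => 'product_sku',
);

$values = array();
foreach ($mapping as $path => $field) {
    $nodes = $template->xpath($path);
    // An empty result (or an error) leaves the field as null.
    $values[$field] = !empty($nodes) ? (string) $nodes[0] : null;
}
print_r($values);
```

Both elements and attributes can be addressed this way, so one string column per mapped field may be enough.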
+2  A: 

With SimpleXML, I think you can only do it the way others have pointed out: by passing the node path down the recursion as a string argument.

With DOMDocument, you could use the $node->parentNode property to crawl back to the document element and construct it for an arbitrary node (for example if you had a reference to a node and wanted to discover where in the tree it was without prior knowledge of how you got to that node).
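As a side note, DOM also has a built-in DOMNode::getNodePath() method that returns an XPath location for a node, so the crawl does not necessarily have to be written by hand. A minimal sketch, using made-up sample XML (the exact format of the returned strings is decided by libxml):

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<root><item/><item><name>x</name></item></root>');
$xp = new DOMXPath($doc);

// Collect the path of every element in the document.
$paths = array();
foreach ($xp->query('//*') as $node) {
    $paths[] = $node->getNodePath();
}
print_r($paths);

// A SimpleXMLElement can be converted first, then asked for its path.
$sx = simplexml_load_string('<root><a/><a/></root>');
$second = dom_import_simplexml($sx->a[1]);
echo $second->getNodePath(), "\n";
```

Each returned string can be passed back to DOMXPath::query() to find the same node again.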

MightyE
+1  A: 

Following up on MightyE's idea about backtracking:

function whereami($node)
{
    if ($node instanceof SimpleXMLElement)
    {
        $node = dom_import_simplexml($node);
    }
    elseif (!$node instanceof DOMNode)
    {
        die('Not a node?');
    }

    $q     = new DOMXPath($node->ownerDocument);
    $xpath = '';

    do
    {
        $position = 1 + $q->query('preceding-sibling::*[name()="' . $node->nodeName . '"]', $node)->length;
        $xpath    = '/' . $node->nodeName . '[' . $position . ']' . $xpath;
        $node     = $node->parentNode;
    }
    while (!$node instanceof DOMDocument);

    return $xpath;
}

I wouldn't recommend it for the case at hand (mapping a whole document, as opposed to a single given node) but it might be useful for future reference.

Josh Davis