I have a nested simple XML structure that I load with PHP's simpleXML. Some elements of the structure contain "context" attributes.

<tab context="new_item, edit_item">
  <input type="text" context="new_item" />
  <input type="readonly" context="edit_item" />
  <tab context="new_item">
    ...
  </tab>
</tab>

After loading, I need to strip out of the structure all elements that do not belong to the current context.

I could of course traverse through each element but maybe somebody knows a quick, SimpleXML way - possibly with XPath - to filter the structure accordingly?

Note that "context" is a comma-separated list of values, however I could change that into a more parseable form:

context_new_item="yes" context_edit_item = "no"

if necessary.

I'm sifting through the simpleXML documentation myself now, it's just not the most expansive part of the PHP documentation...

Update: This post is hardly 13 minutes old, and already 2nd on Google for "simplexml filtering". Damn, I'm impressed.

+1  A: 

If you have the value of "context" in your PHP application, you could select:

$context = "new_item";
// translate() removes every space; normalize-space() would leave the one after each comma
$xpath = "//*[not(contains(concat(',', translate(@context, ' ', ''), ','), ',$context,'))]";

Now you have selected everything that is not in the desired context.
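As a minimal sketch of how such an expression could be run, SimpleXML's xpath() method returns the matching elements as an array. The sample document below is an assumption (the inputs are self-closed so it parses), and the expression strips spaces with translate(@context, ' ', ''), since a space left after a comma would prevent the comma-padded match.

```php
<?php
// Assumed sample document, modelled on the structure in the question.
$xml = simplexml_load_string(
    '<tab context="new_item, edit_item">
       <input type="text" context="new_item" />
       <input type="readonly" context="edit_item" />
       <tab context="new_item" />
     </tab>'
);

$context = 'new_item';

// Pad both sides with commas so only whole list entries can match.
$xpath = "//*[not(contains(concat(',', translate(@context, ' ', ''), ','), ',$context,'))]";

// xpath() returns an array of SimpleXMLElement objects.
$outOfContext = $xml->xpath($xpath);

foreach ($outOfContext as $node) {
    echo $node->getName(), ' [', $node['context'], "]\n";
}
// prints "input [edit_item]"
```

Note that this only selects the elements; it does not remove them from the tree.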

Now if you had this structure:

<tab context="new_item, edit_item">
  <context name="new_item" />
  <context name="edit_item" />
  <input type="text">
    <context name="new_item" />
  </input>
  <input type="readonly">
    <context name="edit_item" />
  </input>
  <tab>
    <context name="new_item" />
    ...
  </tab>
</tab>

You could do it more simply and more efficiently:

$context = "new_item";
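// not(self::context) keeps the <context> marker elements themselves out of the result
$xpath = "//*[not(self::context) and not(context[@name='$context'])]";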

You could also use dedicated attributes if the number of possible contexts is limited.

$context = "new_item";
$xpath = "//*[not(@context_$context = 'yes')]";
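A small sketch of the dedicated-attribute variant, assuming attribute names of the form context_new_item="yes" as floated in the question. The attribute is addressed with @ in XPath; the sample document is made up for illustration.

```php
<?php
// Assumed sample document using one attribute per context.
$xml = simplexml_load_string(
    '<tab context_new_item="yes" context_edit_item="yes">
       <input type="text" context_new_item="yes" />
       <input type="readonly" context_edit_item="yes" />
     </tab>'
);

$context = 'new_item';

// Braces make the end of the interpolated variable name explicit.
$xpath = "//*[not(@context_{$context} = 'yes')]";

// Elements without the attribute compare as false, so not() selects them.
$selected = $xml->xpath($xpath);

foreach ($selected as $node) {
    echo $node->getName(), ' type=', $node['type'], "\n";
}
// prints "input type=readonly"
```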
Tomalak
Wrt context, using children to enumerate contexts makes it easier to match and less risky to update. On the other hand it's more verbose, so you may prefer using namespaced attributes, e.g. `<input type="readonly" context:edit_item="1" />`
Josh Davis
Both great answers, thanks. I will now look at how to build them in.
Pekka
+1  A: 

If you have to filter through the whole document then XPath is the way to go. The problem is that SimpleXML can't remove arbitrary nodes like this, so you'd have to convert them to DOM, then use parentNode->removeChild().
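For reference, a minimal sketch of that DOM route without any external library might look like the following. It relies on dom_import_simplexml(), which returns the DOM node backing a SimpleXML element. The sample document and the choice to leave the root element in place are assumptions for illustration.

```php
<?php
// Assumed sample document; context values are comma-separated without spaces.
$xml = simplexml_load_string(
    '<tab context="new_item,edit_item">
       <input type="text" context="new_item" />
       <input type="readonly" context="edit_item" />
       <tab context="new_item" />
     </tab>'
);

$context = 'new_item';

// Elements that carry a context attribute which does not list $context.
$query = "//*[@context][not(contains(concat(',', @context, ','), ',$context,'))]";

foreach ($xml->xpath($query) as $node) {
    // The DOMElement shares the underlying document with the SimpleXML object.
    $dom = dom_import_simplexml($node);

    // Skip the root element: its parent is the DOMDocument, not an element.
    if ($dom->parentNode instanceof DOMElement) {
        $dom->parentNode->removeChild($dom);
    }
}

// The removal is visible through the original SimpleXML object.
echo $xml->asXML();
```

Because both APIs operate on the same tree, there is no need to convert the document back after removing nodes.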

I maintain a library that does this kind of thing, SimpleDOM. Here's how I'd do it:

include 'SimpleDOM.php';

$tab = simpledom_load_string(
    '<tab context="new_item,edit_item">
      <input type="text" context="new_item" />
      <input type="readonly" context="edit_item" />
        <tab context="new_item">
        ...
        </tab>
    </tab>'
);

$context = 'new_item';

// contains() matches ",new_item," within ",new_item,edit_item,";
// not() turns that around, so elements outside the current context are deleted
$tab->deleteNodes('//*[not(contains(concat(",", @context, ","), ",' . $context . ',"))]');

echo $tab->asXML();

Note that it will not delete the root node, as it would make the document invalid. If you don't want to depend on an external library, feel free to take a look at the source code and copy/paste what you need.

A note about the XPath expression: if the values are separated by commas, make sure there's nothing other than commas between them (no spaces), and wrap both the attribute's value and the value you're matching against in commas.
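To illustrate why the surrounding commas matter, here is a short sketch comparing a plain contains() test with the comma-wrapped one. The context name "renew_item" and the sample document are hypothetical, chosen only because that name happens to contain "new_item".

```php
<?php
// Hypothetical document: "renew_item" is an invented context name.
$xml = simplexml_load_string(
    '<form>
       <input id="a" context="new_item,edit_item" />
       <input id="b" context="renew_item" />
     </form>'
);

$context = 'new_item';

// Plain substring test: "renew_item" contains "new_item", so both inputs match.
$naive = $xml->xpath("//input[contains(@context, '$context')]");

// Comma-wrapped test: ",renew_item," does not contain ",new_item,".
$exact = $xml->xpath("//input[contains(concat(',', @context, ','), ',$context,')]");

echo count($naive), ' vs ', count($exact), "\n";
// prints "2 vs 1"
```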

Josh Davis
The xpath-only approach didn't work because it didn't preserve my tree structure. I am using simpledom now, works great - thanks.
Pekka