I've got some elements in an XML document that I want to delete, so I want to create another XML document without those elements.

Here is an example of what it looks like at the moment:

<entity id="1000070">
    <name>apple</name>
    <type>category</type>
    <entities>
        <entity id="7002870">
            <name>mac</name>
            <type>category</type>
            <entities>
                <entity id="7002907">
                    <name>leopard</name>
                    <type>sub-category</type>
                    <entities>
                        <entity id="7024080">
                            <name>safari</name>
                            <type>subject</type>
                        </entity>
                        <entity id="7024701">
                            <name>finder</name>
                            <type>subject</type>
                        </entity>
                    </entities>
                </entity>
            </entities>
        </entity>
        <entity id="7024080">
            <name>iphone</name>
            <type>category</type>
            <entities>
                <entity id="7024080">
                    <name>3g</name>
                    <type>sub-category</type>
                </entity>
                <entity id="7024701">
                    <name>3gs</name>
                    <type>sub-category</type>
                </entity>
            </entities>
        </entity>
        <entity id="7024080">
            <name>ipad</name>
            <type>category</type>
        </entity>
    </entities>
</entity>

I want to create another XML document without the sub-category and subject elements.

So the new one will look like this:

<entity id="1000070">
    <name>apple</name>
    <type>category</type>
    <entities>
        <entity id="7002870">
            <name>mac</name>
            <type>category</type>
        </entity>
        <entity id="7024080">
            <name>iphone</name>
            <type>category</type>
        </entity>
        <entity id="7024080">
            <name>ipad</name>
            <type>category</type>
        </entity>
    </entities>
</entity>

Should I use SimpleXML/PHP or XSLT to do this? Are there other ways?

Some code examples would be great. Thanks!

+2  A: 

I'd suggest using PHP's DOMDocument class and its related classes (if only because I've been using them forever; I don't really know whether SimpleXML is better or not).

You would do the following:

// $xml is assumed to hold the path to the XML file; use loadXML() for a string.
$doc = new DOMDocument();
$doc->load($xml);
$rootNode = $doc->documentElement;
$entitiesNode = $rootNode->getElementsByTagName('entities')->item(0);
$entityNodes = $entitiesNode->getElementsByTagName('entity');

for($i = 0; $i < $entityNodes->length; $i++)
{
   $entityNode = $entityNodes->item($i);
   $subEntitiesNode = $entityNode->getElementsByTagName('entities');
   if($subEntitiesNode->length)
   {
       // removeChild() must be called on the parent of the node being removed,
       // not on the node list.
       $child = $subEntitiesNode->item(0);
       $child->parentNode->removeChild($child);
   }
}

Note that I just wrote that off the top of my head, so please don't sue if it doesn't work, but it should be reasonably close.
Apart from that, to find the nodes to be deleted in a more elegant way, have a look at PHP's DOMXPath class.

Turbotoast
But they may not be children at a fixed level; they are everywhere. I want to delete ALL elements that have a type of "sub-category" or "subject", regardless of where they are located in the XML hierarchy. I will look at XPath; I think that's the solution in my case.
never_had_a_name
Yep, under these conditions you're probably right; XPath should be the way to go. :D
Turbotoast
+1  A: 

One way is to define the XPath expression that selects the nodes you want to remove then use DOM to grab each node's parent and remove said node. SimpleXML doesn't have an easy way to do that.

For this kind of complicated manipulation, I use SimpleDOM.

include 'SimpleDOM.php';
$entity = simpledom_load_file('/path/to/your/file.xml');

// either delete all "subject" and "sub-category"
$entity->deleteNodes('//entity[type="subject" or type="sub-category"]');

// or remove everything but "category"
$entity->deleteNodes('//entity[not(type="category")]');

// remove empty <entities/>
$entity->deleteNodes('//entities[count(child::*) = 0]');

echo $entity->asXML();
Josh Davis
It worked great! But how do I delete all nodes except type=category and type=sub-category? I've tried '//entity[not(type="category")] | //entity[not(type="sub-category")]' but it didn't work. Thanks!
never_had_a_name
If I'm not wrong, that should be //entity[not(type="category" or type="sub-category")]
Josh Davis
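To see why the union in the comment above did not work: every entity fails at least one of the two tests, so the union of the two negations matches everything. The following is a minimal sketch using PHP's built-in DOMXPath on a reduced, made-up sample (not the asker's file) that compares how many nodes each expression selects.

```php
<?php
// Hypothetical, reduced sample: one category, one sub-category, one subject.
$xml = <<<XML
<entity id="1">
    <name>apple</name>
    <type>category</type>
    <entities>
        <entity id="2">
            <name>leopard</name>
            <type>sub-category</type>
            <entities>
                <entity id="3">
                    <name>safari</name>
                    <type>subject</type>
                </entity>
            </entities>
        </entity>
    </entities>
</entity>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Union of two negations: each entity is "not category" or "not sub-category".
$union = $xpath->query('//entity[not(type="category")] | //entity[not(type="sub-category")]');

// Negation of an "or": only entities that are neither category nor sub-category.
$neither = $xpath->query('//entity[not(type="category" or type="sub-category")]');

$unionCount = $union->length;
$neitherCount = $neither->length;
echo "union: $unionCount, neither: $neitherCount\n";
```

On this sample the union selects all three entities, while the single negated "or" selects only the subject entity, which is the set that should be deleted.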
+2  A: 

Here are some useful functions I use:

/* ***** XML MANIPULATION FUNCTIONS ********* */
/**
Adds a new element to an XML list.
Inserts $newx before $x in $docm, or after $x when $mode is "a_".
*/
function XMLadd(DOMDocument $docm, DOMNode $x=null, DOMNode $newx=null, $mode=''){
    if($x!=null && $newx!= null){   
        if($mode === "a_") {
            if($x->nextSibling) {
                $x->parentNode->insertBefore( $docm->importNode($newx, true), $x->nextSibling);
            } else {
                $x->parentNode->appendChild( $docm->importNode($newx, true));
            }
        } else {
            $x->parentNode->insertBefore( $docm->importNode($newx, true), $x);
        }
    }
}

/**
Removes an element from an XML list.
Removes $x; $x must be a DOMNode in a DOMDocument.
*/
function XMLremove(DOMNode $x=null) {
    if($x!=null) {  
        //remove item
        $x->parentNode->removeChild( $x );
    }
}

/**
Replaces an element in an XML list.
Parameters: $x (DOMNode) will be replaced by $newx (DOMNode) in $docm (DOMDocument)
*/
function XMLreplace(DOMDocument $docm, DOMNode $x=null, DOMNode $newx=null) {
    if($x!=null && $newx!= null) {  
        //replace = add + remove
        //add new element
        XMLadd($docm, $x, $newx);
        //remove item
        XMLremove($x);
    }
}
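
As a side note, the replace case can probably also be handled in one call with DOMNode::replaceChild instead of add + remove. The sketch below assumes that approach; the function name XMLreplaceDirect and the sample data are made up for illustration and are not part of the functions above.

```php
<?php
/**
 * Hypothetical variant of XMLreplace: replaces $x with $newx in $docm
 * using DOMNode::replaceChild instead of insert + remove.
 */
function XMLreplaceDirect(DOMDocument $docm, ?DOMNode $x = null, ?DOMNode $newx = null) {
    if ($x !== null && $newx !== null && $x->parentNode !== null) {
        // importNode copies $newx into $docm so it can be attached there.
        $x->parentNode->replaceChild($docm->importNode($newx, true), $x);
    }
}

// Usage with made-up data: swap the first <item> for one from another document.
$doc = new DOMDocument();
$doc->loadXML('<list><item>old</item><item>keep</item></list>');

$other = new DOMDocument();
$other->loadXML('<item>new</item>');

$first = $doc->getElementsByTagName('item')->item(0);
XMLreplaceDirect($doc, $first, $other->documentElement);

$result = $doc->saveXML($doc->documentElement);
echo $result, "\n";
```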
+1  A: 

This works for your sample, but it may be a bit too much of a loose cannon.

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

foreach ($xpath->query('entities/entity/entities') as $elem) {
    $elem->parentNode->removeChild($elem);
}
chris
But they may not be children at a fixed level; they are everywhere. I want to delete ALL elements that have a type of "sub-category" or "subject", regardless of where they are located in the XML hierarchy.
never_had_a_name
Use the answer by Josh Davis then. You don't need the third-party SimpleDOM; it's easy enough to use the same XPath selectors and remove the nodes in a loop like I did.
chris
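A minimal sketch of that suggestion, using only PHP's built-in DOMDocument and DOMXPath with the selectors from Josh Davis's answer. The helper name removeByXPath and the reduced sample document are assumptions made for illustration; adapt the loading step to the real file.

```php
<?php
/**
 * Hypothetical helper: removes every node matched by $expression
 * and returns how many nodes were removed.
 */
function removeByXPath(DOMDocument $doc, string $expression): int {
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($expression);
    if ($nodes === false) {
        throw new InvalidArgumentException("Invalid XPath: $expression");
    }
    $removed = 0;
    // Copy the matches into an array first so removals cannot disturb iteration.
    foreach (iterator_to_array($nodes) as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
            $removed++;
        }
    }
    return $removed;
}

// Made-up sample: mac has a sub-category with a subject below it; ipad has none.
$xml = <<<XML
<entity id="1"><name>apple</name><type>category</type>
  <entities>
    <entity id="2"><name>mac</name><type>category</type>
      <entities>
        <entity id="3"><name>leopard</name><type>sub-category</type>
          <entities>
            <entity id="4"><name>safari</name><type>subject</type></entity>
          </entities>
        </entity>
      </entities>
    </entity>
    <entity id="5"><name>ipad</name><type>category</type></entity>
  </entities>
</entity>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

// Delete all "subject" and "sub-category" entities, wherever they are.
$removedEntities = removeByXPath($doc, '//entity[type="subject" or type="sub-category"]');

// Then drop any <entities> wrappers left without element children.
$removedEmpty = removeByXPath($doc, '//entities[count(child::*) = 0]');

echo $doc->saveXML();
```

Removing the empty wrappers is a second pass on purpose: a wrapper only becomes empty after its sub-category children are gone.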