Say I have a source document like this:

<element>
  <subelement xmlns:someprefix="mynamespace"/>
</element>

The xmlns:someprefix is obviously not needed here and doesn't do anything since that namespace is not being used anywhere in the document.

In PHP, after I've loaded this into a DOM tree with DOMDocument->loadXML(), I'd like to be able to detect that such a namespace declaration exists, and remove it.

I know that I can read it with hasAttribute() and even remove it with removeAttributeNS() (strangely) but only if I know its prefix. It doesn't appear in DOMNode->attributes at all, as the thing I'm trying to find is not considered an attribute. I cannot see any way of detecting that it is there without knowing the prefix, other than serialising it back to an XML string and running a regex or something.

How can I do it? Is there any way to query which namespaces (i.e. xmlns:something) have been declared on an element?

A: 

How to detect:

<?php
$d = new DOMDocument();
$d->loadXML('
<element>
  <subelement xmlns:someprefix="http://mynamespace/asd">
  </subelement>
</element>');
// Collect every namespace declared anywhere in the document (prefix => URI).
$sxe = simplexml_import_dom($d);
$namespaces = $sxe->getDocNamespaces(true);
$x = new DOMXpath($d);
foreach($namespaces as $prefix => $url){
        // Count elements that are in this namespace or carry an attribute in it.
        $count = $x->evaluate("count(//*[namespace-uri()='".$url."' or @*[namespace-uri()='".$url."']])");
        echo $prefix.' ( '.$url.' ): used '.$count.' times'.PHP_EOL;
}

How to remove: about the only option I know of is to use xml_parse_into_struct() (as this is not libxml2-reliant, afaik) and loop through the resulting array with the XMLWriter functions, skipping namespace declarations that are not used. Not a fun pastime, so I'll leave the implementation up to you. Another option could be XSL, according to this question, but I doubt it is of much use. My best effort seems to succeed, but it moves 'top-level'/root-node namespaces to children, resulting in even more clutter.

edit: this seems to work:

Given XML (added some namespace clutter):

<element xmlns:yetanotherprefix="http://mynamespace/yet">
  <subelement
        xmlns:someprefix="http://mynamespace/foo"
        xmlns:otherprefix="http://mynamespace/bar"
        foo="bar"
        yetanotherprefix:bax="foz">
        <otherprefix:bar>
                <yetanotherprefix:element/>
                <otherprefix:element/>
        </otherprefix:bar>
        <otherprefix:bar>
                <yetanotherprefix:element/>
                <otherprefix:element/>
        </otherprefix:bar>
        <yetanotherprefix:baz/>
  </subelement>
</element>

With XSL (the namespace declarations and the not() clause are based on the previous $used array, so you'll still need that, afaik):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
    xmlns:yetanotherprefix="http://mynamespace/yet"
    xmlns:otherprefix="http://mynamespace/bar">
    <xsl:template match="/">
        <xsl:apply-templates select="/*"/>
    </xsl:template>
    <xsl:template match="*">
        <xsl:element name="{name(.)}">
                <xsl:apply-templates select="./@*"/>
                <xsl:copy-of select="namespace::*[not(name()='someprefix')]"/>
                <xsl:apply-templates select="./node()"/>
        </xsl:element>
    </xsl:template>

    <xsl:template match="@*">
        <xsl:copy/>
    </xsl:template>
</xsl:stylesheet>

Results in:

<?xml version="1.0"?>
<element xmlns:yetanotherprefix="http://mynamespace/yet">
  <subelement xmlns:otherprefix="http://mynamespace/bar" foo="bar" yetanotherprefix:bax="foz">
        <otherprefix:bar>
                <yetanotherprefix:element/>
                <otherprefix:element/>
        </otherprefix:bar>
        <otherprefix:bar>
                <yetanotherprefix:element/>
                <otherprefix:element/>
        </otherprefix:bar>
        <yetanotherprefix:baz/>
  </subelement>
</element>
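
For reference, a stylesheet like the one above could be applied from PHP roughly as follows. This is only a sketch: it assumes the xsl extension (XSLTProcessor) is available, uses a cut-down document and stylesheet, and hard-codes the unused prefix 'someprefix' rather than generating the not() clause from the detection step.

```php
<?php
// Sketch only: assumes ext/xsl is installed; the sample input is hypothetical.
$xml = '<element><subelement xmlns:someprefix="http://mynamespace/foo" foo="bar"/></element>';

// Same idea as the stylesheet above: rebuild each element and copy every
// namespace node except the one whose prefix is known to be unused.
$xsl = <<<'XSL'
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:template match="*">
        <xsl:element name="{name(.)}">
            <xsl:apply-templates select="@*"/>
            <xsl:copy-of select="namespace::*[not(name()='someprefix')]"/>
            <xsl:apply-templates select="node()"/>
        </xsl:element>
    </xsl:template>
    <xsl:template match="@*">
        <xsl:copy/>
    </xsl:template>
</xsl:stylesheet>
XSL;

$source = new DOMDocument();
$source->loadXML($xml);

$style = new DOMDocument();
$style->loadXML($xsl);

$proc = new XSLTProcessor();
$proc->importStylesheet($style);

// transformToXml() returns the serialised result as a string.
$result = $proc->transformToXml($source);
echo $result;
```

If the input uses prefixed element names, the stylesheet itself has to declare those prefixes (as the full stylesheet above does), because xsl:element resolves the prefix in name="{name(.)}" against the stylesheet's namespaces.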
Wrikken
You can remove these namespaces and their associated prefixes using removeAttributeNS() - see the comment from "primaryspace" on this page: http://php.net/manual/en/domelement.removeattributens.php - it's really weird that it's possible but it works! The only part that I was stuck on was detecting the prefix and namespace URL which existed in the first place - but your first example appears to solve this - thanks!
thomasrutter
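
Putting the two pieces together (detection from the answer, removal via removeAttributeNS() from the comment), a minimal sketch might look like the following. The sample document and the 'used' prefix are made up for illustration, and the sketch assumes namespace URIs contain no single quotes, since they are interpolated directly into the XPath expression.

```php
<?php
// Hypothetical sample: 'someprefix' is declared but never used, 'used' is used.
$xml = <<<'XML'
<element>
  <subelement xmlns:someprefix="http://mynamespace/asd" xmlns:used="http://mynamespace/used">
    <used:child/>
  </subelement>
</element>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// prefix => URI for every namespace declared anywhere in the document.
$declared = simplexml_import_dom($doc)->getDocNamespaces(true);

foreach ($declared as $prefix => $uri) {
    // Number of elements in this namespace or carrying an attribute in it.
    $count = $xpath->evaluate(
        "count(//*[namespace-uri()='$uri' or @*[namespace-uri()='$uri']])"
    );
    if ($count > 0 || $prefix === '') {
        continue; // still in use, or a default namespace: leave it alone
    }
    // We don't know which element carries the declaration, so try them all;
    // the call is a no-op on elements that don't declare this prefix.
    foreach ($xpath->query('//*') as $el) {
        $el->removeAttributeNS($uri, $prefix);
    }
}

echo $doc->saveXML();
```

Note that getDocNamespaces() returns an array keyed by prefix, so a prefix that is bound to different URIs in different parts of the document will only be checked once.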