I have an XML document like this:

<Node1 attrib1="abc">
    <node1_1>
         <node1_1_1 attrib2 = "xyz" />
    </node1_1>
</Node1>

<Node2 />    

Here <Node2 /> is the node I want to remove, since it has neither child elements nor attributes.

+1  A: 

Something like this should do it:

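XmlNodeList nodes = xmlDocument.GetElementsByTagName("Node1");

// Iterate backwards: the list is live, so removing a node shifts the later entries.
for (int i = nodes.Count - 1; i >= 0; i--)
{
    XmlNode node = nodes[i];
    if (node.ChildNodes.Count == 0 && node.Attributes.Count == 0)
    {
        // RemoveAll() would only clear the node's contents; detach the node itself.
        node.ParentNode.RemoveChild(node);
        continue;
    }

    for (int j = node.ChildNodes.Count - 1; j >= 0; j--)
    {
        XmlNode n = node.ChildNodes[j];
        // Attributes is null for non-element nodes such as text or comments.
        if (n.InnerText == String.Empty && n.Attributes != null && n.Attributes.Count == 0)
            node.RemoveChild(n);
    }
}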
TheGeekYouNeed
The node names I mentioned are just to explain what I want; they are not the real node names. I want to do something generic. I believe XPath will be useful here, but I don't know how to use XPath yet. I am reading about it :). Thanks for the reply though.
mishal153
+2  A: 

Using an XPath expression, it is possible to find all elements that have neither attributes nor child elements. These can then be removed from the XML. As Sani points out, you might have to do this recursively, because node1_1 becomes empty once you remove its inner node.

var xmlDocument = new XmlDocument();
xmlDocument.LoadXml(
@"<Node1 attrib1=""abc"">
        <node1_1>
             <node1_1_1 />
        </node1_1>
    </Node1>
    ");

// select all nodes without attributes and without children
var nodes = xmlDocument.SelectNodes("//*[count(@*) = 0 and count(child::*) = 0]");

Console.WriteLine("Found {0} empty nodes", nodes.Count);

// now remove matched nodes from their parent
foreach(XmlNode node in nodes)
    node.ParentNode.RemoveChild(node);

Console.WriteLine(xmlDocument.OuterXml);
Console.ReadLine();
Thomas
Thanks, this is working fine for me :)
mishal153
Just want to add one more thing. I realized that I also need to cover the case where a node looks like `<node1> hello </node1>`. Here the node has no children and no attributes, but it has text, so I do not want it to be filtered out and removed. The correct solution for me was: `XmlNodeList list = document.SelectNodes("//*[count(@*) = 0 and count(child::*) = 0 and not(text())]");`
mishal153
You could simplify that XPath expression by using `node()` to cover both the `*` and `text()` tests, and a union `|` to count attributes and child nodes together: `//*[count(child::node() | @*) = 0]`. Note that `node()` also matches comments and processing instructions, so an element containing only a comment is kept.
Mads Hansen
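To tie the comments above together, here is a minimal sketch that applies the simplified XPath expression repeatedly, so that a parent which becomes empty after its children are removed is caught on the next pass. The class and method names (`EmptyElementPruner`, `Prune`) are made up for illustration and do not come from any answer in this thread. It also assumes the document was loaded with the default `PreserveWhitespace = false`, so indentation does not count as a text child.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class EmptyElementPruner
{
    // Hypothetical helper: removes every element that has no attributes and
    // no child nodes, repeating until a pass finds nothing left to remove.
    // Returns the number of elements removed.
    public static int Prune(XmlDocument document)
    {
        int removed = 0;
        while (true)
        {
            // Copy the matches into a list before modifying the tree.
            var matches = new List<XmlNode>();
            foreach (XmlNode node in document.SelectNodes("//*[count(child::node() | @*) = 0]"))
                matches.Add(node);

            if (matches.Count == 0)
                return removed;

            foreach (XmlNode node in matches)
            {
                node.ParentNode.RemoveChild(node);
                removed++;
            }
        }
    }
}
```

For `<Node1 attrib1="abc"><node1_1><node1_1_1 /></node1_1><keep>hello</keep></Node1>`, the first pass removes `node1_1_1`, the second removes the now-empty `node1_1`, and `keep` survives because it has a text child.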
+1  A: 

This should work if a parent node should also be removed once all of its child nodes have been removed:

static void Main(string[] args)
{
  const string strXml = "<Node1 attrib1=\"abc\"><node1_1><node1_1_1 /></node1_1></Node1>";
  var doc = new XmlDocument();
  doc.LoadXml(strXml);

  RemoveEmptyNodes(doc);
}

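public static bool RemoveEmptyNodes(XmlNode node)
{
  // Walk the children backwards so that removing one does not disturb the rest of the loop
  for (int i = node.ChildNodes.Count - 1; i >= 0; i--) {
    XmlNode child = node.ChildNodes[i];
    if (RemoveEmptyNodes(child))
      node.RemoveChild(child);
  }

  // Only elements are candidates; Attributes is null for text, comments and the document itself
  if (node.NodeType != XmlNodeType.Element) return false;

  return !node.HasChildNodes && node.Attributes.Count == 0;
}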
Sani Huttunen
Thanks Sani, I have checked and this also works fine. Thanks.
mishal153
A: 

This stylesheet uses an identity transform plus an empty template that matches elements without child nodes or attributes, which prevents them from being copied to the output:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <!--Identity transform copies all items by default -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!--Empty template matching elements without attributes or child nodes, which prevents them from being copied to the output -->
    <xsl:template match="*[not(child::node() | @*)]"/>

</xsl:stylesheet>
Mads Hansen
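Since the rest of the thread is in C#, here is a rough sketch of how a stylesheet like the one above could be applied from .NET using `XslCompiledTransform`. The `StylesheetRunner` class and its `Transform` method are hypothetical names chosen for this example, and both the XML and the stylesheet are assumed to be available as strings.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class StylesheetRunner
{
    // Hypothetical helper: applies an XSLT stylesheet (passed as a string)
    // to an XML string and returns the transformed document as a string.
    public static string Transform(string xml, string stylesheet)
    {
        var transform = new XslCompiledTransform();
        using (var stylesheetReader = XmlReader.Create(new StringReader(stylesheet)))
            transform.Load(stylesheetReader);

        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var writer = XmlWriter.Create(output, transform.OutputSettings))
            transform.Transform(input, writer);

        return output.ToString();
    }
}
```

Keep in mind that the templates are matched against the source tree in a single pass, so an element that only becomes empty because its children were dropped is still copied to the output. If that matters, the transform would have to be run again on its own result, or the match pattern extended.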