I want to use XPath to get a list of the names of all the elements that appear in an XML file. However, I don't want any names repeated, so an element with the same name as a preceding element should not be matched. So far, I've got:

*[not(local-name() = local-name(preceding::*))]

This executes alright but it spits out duplicates. Why does it spit out the duplicates and how can I eliminate them? (I'm using Firefox's XPath engine.)

+1  A: 

I would recommend first getting a list of all elements, then iterating through the list and adding each name to a dictionary to eliminate duplicates.

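For example, in JavaScript (a sketch, assuming doc is the parsed XML document):

// "//*" selects elements only; "//node()" would also return text and comment nodes.
var allElements = doc.evaluate("//*", doc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
var distinctElementTypes = {};
for (var i = 0; i < allElements.snapshotLength; i++) {
    var name = allElements.snapshotItem(i).localName;
    distinctElementTypes[name] = name;
}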

And now distinctElementTypes will be a dictionary of distinct element names.

Eilon
Thanks for the reply. I could take this approach but xpath only requires one line of code. Also, this is exactly the type of problem xpath is designed for. I'd like to know what is wrong with the example given, as I want to further my xpath education. As far as I can tell, it should work, but it doesn't.
mawrya
I'm not sure why 'preceding' isn't working. Could it be that it compares only to the preceding sibling nodes of the node in question as opposed to *all* preceding nodes?
Eilon
That would be the preceding-sibling:: axis. The W3C says: "the preceding axis contains all nodes in the same document as the context node that are before the context node in document order, excluding any ancestors and excluding attribute nodes and namespace nodes." So, I can see why I might get duplicates if elements with the same name are ancestors, but I'm getting duplicates from sibling elements too!
mawrya
+2  A: 

You are getting duplicates because your filter is not evaluating the way you think it is.

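When given a node-set, the local-name() function returns the local name of only the node that is first in document order, not of every node in the set.

So your predicate only rejects an element when its name matches that of the first element (in document order) on the preceding axis; a match against any other preceding element goes unnoticed.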
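To make that concrete, here is a minimal JavaScript sketch that mimics the two behaviours on a made-up list of names. It does not run any XPath; it assumes the elements are sibling leaf elements, so that the preceding axis of each one is simply the earlier entries in the list. The sample names and the function names buggyFilter and intendedFilter are hypothetical.

```javascript
// Hypothetical sample: local names of sibling leaf elements, in document order.
const names = ["a", "b", "a", "b"];

// Mimics *[not(local-name() = local-name(preceding::*))] under XPath 1.0 rules:
// local-name(node-set) looks only at the node that is first in document order,
// and yields "" when the node-set is empty.
function buggyFilter(list) {
  return list.filter((name, i) => {
    const preceding = list.slice(0, i);
    const firstName = preceding.length > 0 ? preceding[0] : "";
    return name !== firstName;
  });
}

// What the question intended: reject a name if ANY preceding element has it.
function intendedFilter(list) {
  return list.filter((name, i) => !list.slice(0, i).includes(name));
}

console.log(buggyFilter(names));    // ["a", "b", "b"]  (the second "b" slips through)
console.log(intendedFilter(names)); // ["a", "b"]
```

The second "a" is rejected only because "a" happens to be the first preceding element; the second "b" is compared against "a" as well, so it survives as a duplicate.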

I don't think you will be able to accomplish what you want with a pure XPath 1.0 solution. You could do it in XPath 2.0, but that would not work with Firefox.

In XSLT you can use the Muenchian method to accomplish what you want.

Below is an example. You didn't provide any sample XML, so I kept it very generic (e.g. //* matches all elements in the document):

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml"/>
<xsl:key name="names" match="//*" use="local-name(.)"/>
<xsl:template match="/">
    <xsl:for-each select="//*[generate-id(.) = generate-id(key('names', local-name(.)))]">
        <!--Do something with the unique list of elements-->
    </xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Mads Hansen
Thanks for the clue. You are right about local-name() returning the name of the first node only. I actually meant to mark this as the accepted answer but clicked the wrong check mark. However, I did end up doing the filtering in JavaScript, so both these answers are part of my solution. Thanks.
mawrya
A: 

Valid in XPath 2.0:

distinct-values(//*/name())

Patrick