Hi, I have some XML that looks something like this:

<Root>
    <Documents>
        <Document id="1"/>
    </Documents>
    <People>
        <Person id="1"/>
        <Person id="2"/>
    </People>
    <Links>
        <Link personId="1" documentId="1"/>
        <Link personId="1" documentId="1"/>
        <Link personId="2" documentId="1"/>
    </Links>
</Root>

And I am interested in getting only the 'Link' elements that have a unique combination of 'personId's and 'documentId's, so these two links:

<Root>
    <Links>
        <Link personId="1" documentId="1"/>
        <Link personId="2" documentId="1"/>
    </Links>
</Root>

How might I go about doing that? I have found this question, though I feel mine is slightly more complex and it may not apply... I presume I am going to need to use the key() function somewhere...

Thanks in advance.

+1  A: 

You can combine multiple attribute tests in a single XPath predicate; it doesn't have to be just a single attribute=value pair.

http://stackoverflow.com/questions/353843/find-through-multiple-attributes-in-xml
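
As a rough sketch of what that could look like against the XML in the question (the Matches wrapper element is a made-up name, used only for illustration), a single predicate can test both attributes with "and". Note that this only selects links by known values; it does not remove duplicates by itself, so on the sample input it copies both of the personId="1"/documentId="1" links.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <!-- "Matches" is a hypothetical wrapper element, not part of the question -->
        <Matches>
            <!-- One predicate, two attribute tests joined with "and" -->
            <xsl:copy-of select="/Root/Links/Link[@personId = '1' and @documentId = '1']"/>
        </Matches>
    </xsl:template>
</xsl:stylesheet>
```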

Marc B
A: 

You need to filter the <Link>s with something like this, where the current() function returns the <Link> you're checking for uniqueness.

self::Link[not(preceding-sibling::Link[@personId   = current()/@personId and
                                       @documentId = current()/@documentId])]

The preceding-sibling:: axis is used to find earlier <Link> elements and the part in square brackets checks for matching ID numbers. The not() wrapping the whole expression means the entire bracketed expression is true only if NO such preceding sibling matches, i.e. there is no prior <Link> with the same person and document IDs.

My XSLT knowledge is rusty so I'll leave that part to you. What I'm thinking is you first find all links with, say, //Link, and then filter them in a second step with the above XPath. I tried hard but couldn't think of any way to do it all in one step since this relies on the current() function to work.
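
For what it's worth, a minimal XSLT 1.0 sketch of that two-step idea might look something like the following. It assumes all the <Link> elements are siblings under /Root/Links, as in the question, and hard-coding the Root and Links wrapper elements in the output is just a choice made for illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/Root">
        <Root>
            <Links>
                <!-- Step 1: visit every Link; inside the loop, current() is that Link -->
                <xsl:for-each select="Links/Link">
                    <!-- Step 2: keep it only if no earlier sibling has the same pair of IDs -->
                    <xsl:if test="not(preceding-sibling::Link[@personId   = current()/@personId and
                                                              @documentId = current()/@documentId])">
                        <xsl:copy-of select="."/>
                    </xsl:if>
                </xsl:for-each>
            </Links>
        </Root>
    </xsl:template>
</xsl:stylesheet>
```

Since every <Link> rescans its preceding siblings, this is likely to get slow on large inputs; a key-based approach should scale better.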

John Kugelman
+1  A: 

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kDocAndPeoById" match="Document|Person" use="@id"/>
    <xsl:key name="kLinksByIds" match="Link" 
             use="concat(@personId,'++',@documentId)"/>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="Documents|People|
     Link[count(.|key('kLinksByIds',concat(@personId,'++',@documentId))[1])!=1
          or not(key('kDocAndPeoById',@personId)/self::Person)
          or not(key('kDocAndPeoById',@documentId)/self::Document)]"/>
</xsl:stylesheet>

Output:

<Root>
    <Links>
        <Link personId="1" documentId="1"></Link>
        <Link personId="2" documentId="1"></Link>
    </Links>
</Root>

If you don't need to check that the referenced Document and Person @id values actually exist, then this stylesheet is enough:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kLinksByIds" match="Link" 
              use="concat(@personId,'++',@documentId)"/>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="Documents|People|
  Link[count(.|key('kLinksByIds',concat(@personId,'++',@documentId))[1])!=1]"/>
</xsl:stylesheet>

Output:

<Root>
    <Links>
        <Link personId="1" documentId="1"></Link>
        <Link personId="2" documentId="1"></Link>
    </Links>
</Root>
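
The count(.|key(...)[1]) != 1 test is the usual Muenchian-style check for "this node is not the first one in its key group": the union of a node with itself has exactly one member, so the count is 1 only for the first <Link> with a given personId/documentId pair. If that reads as too cryptic, here is a sketch of what should be an equivalent variant of the simpler stylesheet, comparing generate-id() values instead. Only the predicate differs; the key name and the '++' separator are the same as above.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kLinksByIds" match="Link"
             use="concat(@personId,'++',@documentId)"/>
    <!-- Identity rule: copy everything that is not suppressed below -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <!-- Drop Documents, People, and any Link that is not the first in its key group -->
    <xsl:template match="Documents|People|
      Link[generate-id() !=
           generate-id(key('kLinksByIds',concat(@personId,'++',@documentId))[1])]"/>
</xsl:stylesheet>
```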
Alejandro
Good and elegant (+1). I came up with almost the same.
Dimitre Novatchev