I have an XPath expression which provides me a sequence of values like the one below:

1 2 2 3 4 5 5 6 7

It is easy to convert this to a set of unique values, "1 2 3 4 5 6 7", using the distinct-values function. However, what I want to extract is the list of duplicate values, "2 5". I can't think of an easy way to do this. Can anyone help?

A: 

Calculate the difference between your original sequence and the set of distinct values. What remains is the numbers that occur more than once. Note that the numbers in this result are not necessarily distinct: a number that occurs more than twice in the original sequence will still be repeated, so apply distinct-values again if that is required.
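As a rough illustration of the idea above, here is a small Python sketch (Python rather than XPath, so it can be run without an XPath 2.0 processor). It assumes "difference" means a multiset difference: one occurrence of each distinct value is removed and whatever is left over occurred more than once. The function name `leftover_after_removing_distinct` is made up for this example.

```python
from collections import Counter

def leftover_after_removing_distinct(seq):
    """Remove one occurrence of each distinct value and return what remains."""
    # Counter subtraction is a multiset difference; values left with a count of 0 are dropped.
    leftover = Counter(seq) - Counter(set(seq))
    return sorted(leftover.elements())

leftover = leftover_after_removing_distinct([1, 2, 2, 3, 4, 5, 5, 5, 6, 7])
print(leftover)               # still contains a repeated 5, because 5 occurred three times
print(sorted(set(leftover)))  # second "distinct" pass, as the answer suggests
```

The second pass is only needed when some value occurs more than twice.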

GerG
A: 

Yes, but the problem is how to calculate the difference between two sequences. You can compare sequences using the union / intersect / except keywords, but none of these will provide the 'difference' between the two sets of values.

http://www.dpawson.co.uk/xsl/sect2/muench.html#d10875e108 shows the set difference techniques.
DaveP
+2  A: 

What about:

distinct-values(
  for $item in $seq
  return if (count($seq[. eq $item]) > 1)
         then $item
         else ())

This iterates through the items in the sequence, and returns the item if the number of items in the sequence that are equal to that item is greater than one. You then have to use distinct-values() to remove the duplicates from that list.
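To make the two steps concrete, here is a Python sketch of the same logic (an illustration only, not the XPath itself). The list comprehension stands in for the for/if/count part and `dict.fromkeys` stands in for distinct-values(); the name `duplicates_by_count` is hypothetical. One difference to keep in mind: this sketch keeps first-seen order, whereas the order returned by distinct-values() is up to the implementation.

```python
def duplicates_by_count(seq):
    """Keep each item that occurs more than once, then drop the repeats."""
    # Mirrors: for $item in $seq return if (count($seq[. eq $item]) > 1) then $item else ()
    repeated = [item for item in seq if seq.count(item) > 1]
    # Mirrors the outer distinct-values() call, keeping first-seen order.
    return list(dict.fromkeys(repeated))

print(duplicates_by_count([1, 2, 2, 3, 4, 5, 5, 6, 7]))
```

Because every item triggers a full count over the sequence, the work grows quadratically with the length of the sequence, which is fine for short lists.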

JeniT
Hi Jeni, seems there is a simpler solution :) $vSeq[index-of($vSeq,.)[2]] Cheers, Dimitre
Dimitre Novatchev
A: 

What about XSLT? Is it applicable to your request?

 <xsl:for-each select="/r/a">
  <xsl:variable name="cur" select="." />
  <xsl:if test="count(./preceding-sibling::a[. = $cur]) > 0 and count(./following-sibling::a[. = $cur]) = 0">
   <xsl:value-of select="." />
  </xsl:if>
 </xsl:for-each>
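
Read literally, the test above keeps an element when an equal value appears somewhere before it and none appears after it, that is, the last occurrence of each repeated value. A Python sketch of that reading follows; the function name is hypothetical and the values are strings, as they would be when read from /r/a elements.

```python
def last_of_each_repeated(values):
    """Keep a value when it was seen earlier and is not seen again later."""
    result = []
    for i, cur in enumerate(values):
        seen_before = cur in values[:i]      # count(preceding-sibling::a[. = $cur]) > 0
        seen_after = cur in values[i + 1:]   # count(following-sibling::a[. = $cur]) = 0 fails when True
        if seen_before and not seen_after:
            result.append(cur)
    return result

print(last_of_each_repeated(["1", "2", "2", "3", "4", "5", "5", "5", "6", "7"]))
```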
michal kralik
A: 

Thanks to all who answered. JeniT gave just the kind of solution I was looking for, thanks!

What about the one-line solution I posted two days ago? Seems you do not log in too frequently. Hint: At least you could accept one of the answers.
Dimitre Novatchev
Hint: You can *accept* one of the proposed solutions.
Dimitre Novatchev
A: 

Given the following xml:

<a>
    <b>1</b>
    <b>2</b>
    <b>2</b>
    <b>3</b>
    <b>4</b>
    <b>5</b>
    <b>5</b>
    <b>5</b>
    <b>6</b>
    <b>7</b>
</a>

The following XPath will give you a list of repeating values (in this case 2, 5, 5):

/a/b[.=following-sibling::b]

However if you wanted a distinct list of repeating values (in this case 2, 5) then the following XPath should do the business for you:

/a/b[.=following-sibling::b][not(.=preceding-sibling::b)]
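
For readers who want to check the two results by hand, here is a Python sketch that parses the sample XML with the standard library and imitates each predicate with plain list slicing. It is a model of the XPath, not a call into an XPath engine, and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<a><b>1</b><b>2</b><b>2</b><b>3</b><b>4</b>"
    "<b>5</b><b>5</b><b>5</b><b>6</b><b>7</b></a>"
)
texts = [b.text for b in doc.findall("b")]

# [.=following-sibling::b]: keep a node when some later sibling has the same value.
repeating = [(i, t) for i, t in enumerate(texts) if t in texts[i + 1:]]

# [not(.=preceding-sibling::b)]: of those, keep a node only when no earlier sibling matches.
distinct_repeating = [t for i, t in repeating if t not in texts[:i]]

print([t for _, t in repeating])
print(distinct_repeating)
```

Positions are carried along with the values because the second predicate looks at siblings in the original list, not only within the filtered result.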
Wilfred Knievel
How this works: the first predicate returns the nodes that have a later sibling with the same value (2, 5, 5). It is worth noting that these are references to the nodes in the original list, not copies of the values. The second predicate then looks in the opposite direction along the original list and keeps only the nodes with no earlier sibling of the same value, so each repeated value is returned once.
Wilfred Knievel
This question was asked about a *sequence* of items, not a node-set. Your solution, on the other hand, works for node-sets only. Also, it is not very efficient. As a first step, using /a/b[.=following-sibling::b][1] may be more efficient. Cheers
Dimitre Novatchev
+4  A: 

Use this simple XPath 2.0 expression:

      $vSeq[index-of($vSeq,.)[2]]

where $vSeq is the sequence of values in which we want to find the duplicates.
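
One way to read the expression: index-of($vSeq, .) lists the 1-based positions at which the current item occurs, [2] picks the second of those positions (if there is one), and a numeric predicate keeps an item only when that number equals the item's own position. Each repeated value is therefore selected exactly once, at its second occurrence. The Python sketch below models that reading; `index_of` and `duplicates_by_index_of` are hypothetical helper names, not library functions.

```python
def index_of(seq, value):
    """1-based positions of value in seq, modelled on XPath 2.0 index-of()."""
    return [pos for pos, item in enumerate(seq, start=1) if item == value]

def duplicates_by_index_of(seq):
    """Model of $vSeq[index-of($vSeq, .)[2]]."""
    result = []
    for pos, item in enumerate(seq, start=1):
        positions = index_of(seq, item)
        # [2] is empty for items that occur once, so those are never selected.
        # Otherwise the item is kept only where its position is the second occurrence.
        if len(positions) >= 2 and positions[1] == pos:
            result.append(item)
    return result

print(duplicates_by_index_of([1, 2, 2, 3, 4, 5, 5, 5, 6, 7]))
```

No separate distinct-values() pass is needed here, since a value that occurs three or more times still has only one second occurrence.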

For an explanation of how this "works", see:

      http://dnovatchev.spaces.live.com/Blog/cns!44B0A32C2CCF7488!904.entry

Cheers,

Dimitre Novatchev

Dimitre Novatchev
Very nice. I constantly overlook the index-of() function.
JeniT