This is one of the toughest challenges I have faced so far in XSLT.

How can I copy only the unique characters of a given string?
The test XML is:

<root>
<string>aaeerstrst11232434</string>
</root>

The output I am expecting is:

<string>aerst1234</string>
+3  A: 

Use the following XPath one-liner:

codepoints-to-string(distinct-values(string-to-codepoints(.)))

A complete transformation using this is below:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>       
    <xsl:output method="text"/>

    <xsl:template match="string">
      <xsl:value-of select=
      "codepoints-to-string(distinct-values(string-to-codepoints(.)))
      "/>
    </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the originally provided XML document:

<root>
    <string>aaeerstrst11232434</string>
</root>

the wanted result is produced:

aerst1234
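
One caveat about the one-liner: the XPath 2.0 specification leaves the order of the values returned by distinct-values() implementation-dependent, so the first-occurrence order seen here is not guaranteed on every processor. Below is a minimal sketch of a variant that does not rely on that order. It keeps a codepoint only when its position equals the first position at which that codepoint occurs. The variable name `cp` and the added xsl:strip-space declaration are assumptions for illustration and are not part of the answer above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Assumption: drop whitespace-only text nodes so only the result is output -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="string">
    <!-- The string as a sequence of integer codepoints -->
    <xsl:variable name="cp" select="string-to-codepoints(.)"/>
    <!-- Keep an item only if it sits at the first index where its value occurs -->
    <xsl:value-of select=
      "codepoints-to-string($cp[index-of($cp, .)[1] eq position()])"/>
  </xsl:template>
</xsl:stylesheet>
```

This variant is quadratic in the string length, because index-of() rescans the sequence for every item, so it trades speed for a defined output order.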

If you need an XSLT 1.0 solution, please say so and I'll provide it.

Dimitre Novatchev
Yes, I need XSLT 1.0. Please let me know the solution if you know it.
infant programmer
Posted as promised. Not only does XSLT support recursion, it is a true functional programming language (even XSLT 1.0). Read about FXSL.
Dimitre Novatchev
+3  A: 

Here is an XSLT 1.0 solution:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:strip-space elements="*"/>
  <xsl:output method="text"/>

  <xsl:template match="string">
    <xsl:call-template name="unique">
      <xsl:with-param name="input" select="."/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="unique">
    <xsl:param name="input"/>
    <xsl:param name="output" select="''"/>
    <xsl:variable name="c" select="substring($input, 1, 1)"/>
    <xsl:choose>
      <xsl:when test="not($input)">
        <xsl:value-of select="$output"/>
      </xsl:when>
      <xsl:when test="contains($output, $c)">
        <xsl:call-template name="unique">
          <xsl:with-param name="input" select="substring($input, 2)"/>
          <xsl:with-param name="output" select="$output"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="unique">
          <xsl:with-param name="input" select="substring($input, 2)"/>
          <xsl:with-param name="output" select="concat($output, $c)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
Martin Honnen
Thank you very much. I've got something new to learn. :-)
infant programmer
I had wrongly assumed that templates cannot be used like functions, but you have proved that wrong. I am now learning (not too late) about recursive call-template. Glad about it. :-)
infant programmer
+1  A: 

Here is an XSLT 1.0 solution, shorter than the currently selected answer and easier to write as it uses the str-foldl template of FXSL.

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:f="http://fxsl.sf.net/"
 exclude-result-prefixes="f">

 <xsl:import href="str-foldl.xsl"/>
 <xsl:output method="text"/>

 <f:addUnique/>

 <xsl:variable name="vFunAddunique" select=
  "document('')/*/f:addUnique[1]
  "/>

    <xsl:template match="string">
      <xsl:call-template name="str-foldl">
        <xsl:with-param name="pFunc" select="$vFunAddunique"/>
        <xsl:with-param name="pA0" select="''"/>
        <xsl:with-param name="pStr" select="."/>
      </xsl:call-template>
    </xsl:template>

    <xsl:template match="f:addUnique" mode="f:FXSL">
      <xsl:param name="arg1"/>
      <xsl:param name="arg2"/>

      <xsl:value-of select="$arg1"/>
      <xsl:if test="not(contains($arg1, $arg2))">
       <xsl:value-of select="$arg2"/>
      </xsl:if>
    </xsl:template>
</xsl:stylesheet>

When the above transformation is applied to the originally provided source XML document:

<root>
    <string>aaeerstrst11232434</string>
</root>

the wanted result is produced:

aerst1234

Read more about FXSL 1.x (for XSLT 1.0) here, and about FXSL 2.x (for XSLT 2.0) here.

Dimitre Novatchev
OK, thanks for the reply. The solution works fine. :-)
infant programmer
And thanks for the cool links too. They are really helpful. :-)
infant programmer
Yes, the problem was in the .NET code (it is ready-made, read-only code written by my seniors, so I never thought to change it). It's working fine now. Thank you :-)
infant programmer
By default, the XslCompiledTransform class disables support for the XSLT document() function and embedded scripting. These features can be enabled by creating an XsltSettings object that has the features enabled and passing it to the Load method. I had to modify the .NET code to use the document() function.
infant programmer
If security is important (and it almost always should be), it is a good idea in this case to provide your own XmlResolver, so that arbitrary URLs, such as absolute file paths of XML files containing sensitive data, cannot be used.
Dimitre Novatchev
A: 

When I tried a more complicated XML document, I ran into many problems with Martin Honnen's solution; it does not work with the XML below. So I prepared my own solution, referring to this answer by Dimitre. I would also call it a more efficient solution:

Here is the input XML:

    <root>
      <string>aabcdbcd1abcdefghijklmanopqrstuvwxyzabcdefgh0123456789ijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz12312489796453134049446798421230156489413210315487804210313264046040489789789745648974321231564648971232344</string>
      <string2>oejrinsjfojofjweofj24798273492jfakjflsdjljk</string2>
    </root>

And here is the working XSLT code:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="text()">
    <xsl:call-template name="unique_chars">
      <xsl:with-param name="input" select="."/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="unique_chars">
    <xsl:param name="input"/>
    <xsl:variable name="c">
      <xsl:value-of select="substring($input, 1, 1)"/>
    </xsl:variable>
    <xsl:choose>
      <xsl:when test="not($input)"/>
      <xsl:otherwise>
        <xsl:choose>
          <xsl:when test="contains(substring($input, 2), $c)">
            <xsl:call-template name="unique_chars">
              <xsl:with-param name="input" select="substring($input, 2)"/>
            </xsl:call-template>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="$c"/>
            <xsl:call-template name="unique_chars">
              <xsl:with-param name="input" select="substring($input, 2)"/>
            </xsl:call-template>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
infant programmer
A few questions: 1. Why don't you simply use the solution that I posted? 2. Why do you treat the space (' ') in a special way? And a statement: when it is known that the number of unique characters is significantly smaller than the total number of characters in the string, a much more efficient algorithm exists than the ones shown on this page. I leave finding and implementing this algorithm as an exercise for you :)
Dimitre Novatchev
(1) I use .NET code to trigger the transformation, and it doesn't allow the document() function :-| (2) Treating the space character specially was not the goal; I had used it only for testing and posted it unintentionally [edited to avoid confusion]. (3) I am happy to do the homework; I'll try my best to come up with more efficient code :-)
infant programmer
Hmmm... Where do I use the `document()` function? Aside from this, I have been using .NET XslCompiledTransform and XslTransform for many years and never had any problems with the document() function -- unless it is explicitly forbidden by settings and/or a URI-Resolver.
Dimitre Novatchev
@dimitre, it works fine after changing the .NET code. I've accepted your solution. Thanks for pointing out the flaw. :-)
infant programmer
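
The comment above about a more efficient algorithm for strings with few distinct characters does not say which algorithm is meant. One candidate that fits the description, offered here as an assumption, outputs the first character and then deletes every occurrence of it with translate() before recursing. The recursion depth then equals the number of distinct characters rather than the length of the string. A minimal XSLT 1.0 sketch, with the hypothetical template name `unique-by-translate`:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Drop whitespace-only text nodes so only the result is output -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="string">
    <xsl:call-template name="unique-by-translate">
      <xsl:with-param name="input" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: one recursive call per distinct character -->
  <xsl:template name="unique-by-translate">
    <xsl:param name="input"/>
    <!-- A non-empty string is true; recursion stops on the empty string -->
    <xsl:if test="$input">
      <xsl:variable name="c" select="substring($input, 1, 1)"/>
      <!-- Emit the character at its first occurrence -->
      <xsl:value-of select="$c"/>
      <xsl:call-template name="unique-by-translate">
        <!-- translate() with an empty replacement removes all copies of $c -->
        <xsl:with-param name="input" select="translate($input, $c, '')"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

For the original test document this walks through a, e, r, s, t, 1, 2, 3, 4 in nine calls and outputs aerst1234, keeping characters in first-occurrence order.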