Given a document containing these tags, how can I tell whether " " < "H" < "z" in the document's encoding? I'm trying to do this in XPath 1.0.

<text>H</text>
<range>
  <from>&#x20;</from>
  <to>z</to>
</range>

I might even be able to get away with using contains(), but then how would I create a string containing the characters from " " through "z" to test against?

A: 

I don't think it's possible with XPath 1.0. It is possible with XPath 2.0, though, using the fn:compare function: http://www.w3.org/2005/xpath-functions/#compare

I'm not able to try it out but I guess the XPath would be:

fn:compare(text, range/from) > 0 and fn:compare(text, range/to) < 0
Patrice
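
A note on the XPath 2.0 route: with two arguments, fn:compare uses the default collation, so the ordering can depend on how the processor is configured. A minimal sketch that avoids this by comparing code points directly with the standard string-to-codepoints function is shown below. It assumes an XSLT 2.0 processor and the document layout from the question, wrapped in a single top element; the variable names are illustrative only.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="/*">
  <!-- Code point of the first character of each value (names are illustrative) -->
  <xsl:variable name="vChar" select="string-to-codepoints(text)[1]"/>
  <xsl:variable name="vFrom" select="string-to-codepoints(range/from)[1]"/>
  <xsl:variable name="vTo" select="string-to-codepoints(range/to)[1]"/>

  <!-- Strict comparison, matching the " " < "H" < "z" wording of the question -->
  <xsl:value-of select="$vFrom lt $vChar and $vChar lt $vTo"/>
 </xsl:template>
</xsl:stylesheet>
```

Replacing lt with le on both sides would make the range inclusive.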
A: 

This transformation determines whether an ASCII character lies in the (inclusive) range between two given ASCII characters:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:variable name="vAscii"> !"#$%&amp;'()*+,-./0123456789:;&lt;=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~</xsl:variable>

 <xsl:template match="/*">
  <xsl:call-template name="isInRange">
    <xsl:with-param name="pChar" select="text"/>
    <xsl:with-param name="pStarting" select="range/from"/>
    <xsl:with-param name="pEnding" select="range/to"/>
  </xsl:call-template>
 </xsl:template>

 <xsl:template name="isInRange">
  <xsl:param name="pChar"/>
  <xsl:param name="pStarting"/>
  <xsl:param name="pEnding"/>

  <xsl:value-of select=
   "contains($vAscii, $pChar[1])

   and

    string-length(substring-before($vAscii, $pChar[1]))
   >=
    string-length(substring-before($vAscii, $pStarting))

   and

    string-length(substring-before($vAscii, $pEnding))
   >=
    string-length(substring-before($vAscii, $pChar[1]))

   "/>
 </xsl:template>
</xsl:stylesheet>

when applied to the following XML document (which contains exactly the provided XML fragment):

<t>
    <text>H</text>
    <range>
        <from>&#x20;</from>
        <to>z</to>
    </range>
</t>

produces the wanted result:

true

When applied to this XML document:

<t>
    <text>H</text>
    <range>
        <from>A</from>
        <to>G</to>
    </range>
</t>

again the correct result is produced:

false
Dimitre Novatchev
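
The question asks for a strict comparison (" " < "H" < "z"), whereas the transformation above tests an inclusive range and only verifies that the tested character occurs in the lookup string. The following is a sketch of a strict variant, under the same assumptions (XSLT 1.0, printable ASCII only, same document layout). It also checks the range endpoints, because substring-before() returns an empty string for a character that is absent from the lookup string, which would otherwise look like position 0. The variable names other than vAscii are illustrative.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <!-- Same lookup string as in the answer: printable ASCII in code-point order -->
 <xsl:variable name="vAscii"> !"#$%&amp;'()*+,-./0123456789:;&lt;=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~</xsl:variable>

 <xsl:template match="/*">
  <!-- Reduce each value to its first character -->
  <xsl:variable name="vChar" select="substring(text, 1, 1)"/>
  <xsl:variable name="vFrom" select="substring(range/from, 1, 1)"/>
  <xsl:variable name="vTo" select="substring(range/to, 1, 1)"/>

  <!-- Zero-based position of each character within $vAscii -->
  <xsl:variable name="vCharPos"
   select="string-length(substring-before($vAscii, $vChar))"/>
  <xsl:variable name="vFromPos"
   select="string-length(substring-before($vAscii, $vFrom))"/>
  <xsl:variable name="vToPos"
   select="string-length(substring-before($vAscii, $vTo))"/>

  <!-- contains($vAscii, '') is true in XPath 1.0, so empty values are rejected first -->
  <xsl:value-of select="
     $vChar != '' and $vFrom != '' and $vTo != ''
     and contains($vAscii, $vChar)
     and contains($vAscii, $vFrom)
     and contains($vAscii, $vTo)
     and $vFromPos &lt; $vCharPos
     and $vCharPos &lt; $vToPos"/>
 </xsl:template>
</xsl:stylesheet>
```

With this variant, a document whose text equals one of the range endpoints yields false, where the inclusive version yields true.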
Hello Dimitre, that certainly does work, but does it scale to Unicode? If I build a variable like: <xsl:variable name="vUnicode">x;x;x;...FFFFx;</xsl:variable> will I run into any maxlen constraints?
Telejester
@Telejester: There are no "maxlen constraints". Of course, this variable will take a lot of space... :) Seriously, I thought you needed this just for a fixed subrange of characters -- this solution is exactly for this case. Otherwise in XSLT 1.0 there isn't a suitable way of doing this. You may want to write an extension function that returns the code-point value (integer) for a given character.
Dimitre Novatchev
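
As an illustration of the extension-function idea, here is a rough sketch that assumes a Microsoft processor (MSXML, or .NET's XslCompiledTransform with scripting enabled), where msxsl:script can embed JScript. It will not run on other XSLT 1.0 processors, which have their own extension mechanisms. The my prefix, its namespace URI, and the codePoint function are hypothetical names chosen for this example. Note that charCodeAt returns a UTF-16 code unit, so the sketch only covers characters up to FFFF.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:msxsl="urn:schemas-microsoft-com:xslt"
 xmlns:my="urn:example:codepoint"
 exclude-result-prefixes="msxsl my">
 <xsl:output method="text"/>

 <!-- Hypothetical helper: UTF-16 code unit of the first character of s -->
 <msxsl:script language="JScript" implements-prefix="my">
  function codePoint(s) { return s.charCodeAt(0); }
 </msxsl:script>

 <xsl:template match="/*">
  <!-- string() ensures the script receives a string rather than a node list -->
  <xsl:variable name="vChar" select="my:codePoint(string(text))"/>
  <xsl:variable name="vFrom" select="my:codePoint(string(range/from))"/>
  <xsl:variable name="vTo" select="my:codePoint(string(range/to))"/>

  <!-- Inclusive range test on the numeric values -->
  <xsl:value-of select="$vChar >= $vFrom and $vTo >= $vChar"/>
 </xsl:template>
</xsl:stylesheet>
```

An empty value makes charCodeAt return NaN, and any comparison with NaN is false, so missing input falls through to false rather than raising an error.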