Given the following XML:

<table>
  <col width="12pt"/>
  <col width="24pt"/>
  <col width="12pt"/>
  <col width="48pt"/>
</table>

How can I convert the width attributes to numeric values that can be used in mathematical expressions? So far, I have used substring-before to do this. Here is an example template (XSLT 2.0 only) that shows how to sum the values:

<xsl:template match="table">
    <xsl:text>Col sum: </xsl:text>
    <xsl:value-of select="sum(
        for $w 
        in col/@width 
        return number(substring-before($w, 'pt'))
     )"/>
</xsl:template>

Now my questions:

  • Is there a more efficient way to do the conversion than substring-before?
  • What if I don't know the text after the numbers? Any way to do it without using regular expressions?
+2  A: 

This is horrible, but depending on just how much you know about the potential set of non-numeric characters, you could strip them with translate():

translate("12jfksjkdfjskdfj", "abcdefghijklmnopqrstuvwxyz", "")

returns

"12"

which you can then pass to number() as currently.

(I said it was horrible. Note that translate() is case-sensitive, too, so any uppercase letters in the input would have to be listed in the second argument as well.)
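As a rough sketch of how this could slot into the template from the question, assuming the unit suffixes consist only of ASCII letters: the variable name `letters` is made up for illustration, and both cases are listed because translate() is case-sensitive.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical helper variable: every character we expect in a unit suffix.
       Assumption: suffixes are plain ASCII letters, upper or lower case. -->
  <xsl:variable name="letters"
      select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <xsl:template match="table">
    <xsl:text>Col sum: </xsl:text>
    <!-- translate() with an empty third argument deletes the listed characters,
         leaving only the numeric part for number() -->
    <xsl:value-of select="sum(
        for $w
        in col/@width
        return number(translate($w, $letters, ''))
     )"/>
  </xsl:template>
</xsl:stylesheet>
```

For the sample table in the question this should give a column sum of 96. Any suffix character missing from the list (a percent sign, say) would survive and make number() return NaN.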

AakashM
+1  A: 

If you are using XSLT 2.0, is there a reason why you want to avoid using regex?

The simplest solution would probably be to use the replace function with a regex pattern that matches any non-numeric character and replaces it with an empty string:

replace($w,'[^0-9]','')
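One caveat worth sketching: the pattern '[^0-9]' also removes a decimal point, so a value such as '12.5pt' would become '125'. If fractional widths are possible (an assumption, since the sample data only has whole numbers), the dot can be added to the character class. A minimal variant of the sum expression from the question:

```xslt
<!-- Assumption: widths may contain a decimal point, e.g. width="12.5pt".
     Inside a character class the dot is a literal, so it is kept. -->
<xsl:value-of select="sum(
    for $w
    in col/@width
    return number(replace($w, '[^0-9.]', ''))
 )"/>
```

This still does not handle signs or exponents; those would need a stricter pattern.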
Mads Hansen
This is likely to work. +1
infant programmer
@infant programmer - How come you didn't suggest Dimitre's answer to your (similar) question? http://stackoverflow.com/questions/2242725/extracting-number-int-decimal-from-a-string-with-xslt-1-0
Mads Hansen
+1  A: 

I found this answer from Dimitre Novatchev, which provides a very clever XPath solution that doesn't use regex:

translate(., translate(.,'0123456789', ''), '')

The nested translate strips all the digits from the string, leaving only the non-digit characters. The outer translate then uses those characters as its list of characters to delete, so only the digits remain.
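To make the two steps concrete, here is the evaluation traced by hand for the attribute value '12pt'. Each line is a separate expression, with the context item replaced by the literal string for illustration:

```xpath
(: inner call: deletes the digits, leaving the unwanted characters :)
translate('12pt', '0123456789', '')    (: yields 'pt' :)

(: outer call: deletes every character found by the inner call :)
translate('12pt', 'pt', '')            (: yields '12' :)
```

As with the other approaches, a decimal point counts as a non-digit here and is removed; adding '.' to the digit list in the inner call would keep it.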

Applied to your template:

<xsl:template match="table">
    <xsl:text>Col sum: </xsl:text>
    <xsl:value-of select="sum(
        for $w 
        in col/@width 
        return number(translate($w, translate($w,'0123456789', ''), ''))
     )"/>
</xsl:template>
Mads Hansen