Hi, I have an XML document that looks something like this:

<para>
   text text text
   <b>text</b> text text <i>text</i>
</para>

The objective is to convert this to MediaWiki formatting, with ''' for bold, '' for italics, and so on.

When I write a transformation for this, the template match ignores all the text inside the <para> tag, and only the <b> and <i> elements are converted. I need help.

Update: here is what I have tried so far:

<xsl:template match="para">
<xsl:apply-templates select="*"/>
</xsl:template>

<xsl:template match="b">
<xsl:text>'''</xsl:text><xsl:value-of select="replace(replace(.,'\s+$',''),'^\s+','')" disable-output-escaping="no"/><xsl:text>'''</xsl:text>
</xsl:template>

<xsl:template match="i">
<xsl:text>''</xsl:text><xsl:value-of select="replace(replace(.,'\s+$',''),'^\s+','')" disable-output-escaping="no"/><xsl:text>''</xsl:text>
</xsl:template>

This is what I used when I tried matching on the text() node test:

<xsl:template match="text()">
<xsl:value-of select="." disable-output-escaping="no"/>
</xsl:template>

Update: so as not to lose the spaces before and after each text block and around the bold and italic markers, we can also check for spaces before and after the text.

<xsl:template match="text()">
    <xsl:variable name="originalText" select="."/>
    <xsl:if test="starts-with($originalText,' ')">
        <xsl:text> </xsl:text>
    </xsl:if>
    <xsl:value-of select="normalize-space(.)" disable-output-escaping="no"/>
    <xsl:if test="ends-with($originalText,' ')">
        <xsl:text> </xsl:text>
    </xsl:if>
</xsl:template>
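
The update above shows only the text() template. A minimal sketch of how the matching bold and italic templates might look is given below; it is an assumption about the approach, not the code actually used. It trims the content with normalize-space() and re-emits a single space outside the quote markers when the original content began or ended with whitespace. matches() with \s is used here instead of starts-with()/ends-with() so that tabs and newlines are caught as well as plain spaces.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <!-- Bold: wrap the trimmed content in ''' and move edge whitespace outside the markers.
         Assumes the element has some non-whitespace content. -->
    <xsl:template match="b">
        <xsl:if test="matches(., '^\s')"><xsl:text> </xsl:text></xsl:if>
        <xsl:text>'''</xsl:text>
        <xsl:value-of select="normalize-space(.)"/>
        <xsl:text>'''</xsl:text>
        <xsl:if test="matches(., '\s$')"><xsl:text> </xsl:text></xsl:if>
    </xsl:template>

    <!-- Italic: same idea with '' markers -->
    <xsl:template match="i">
        <xsl:if test="matches(., '^\s')"><xsl:text> </xsl:text></xsl:if>
        <xsl:text>''</xsl:text>
        <xsl:value-of select="normalize-space(.)"/>
        <xsl:text>''</xsl:text>
        <xsl:if test="matches(., '\s$')"><xsl:text> </xsl:text></xsl:if>
    </xsl:template>
</xsl:stylesheet>
```

With this sketch, an input such as `<para><b>text </b>text</para>` comes out as `'''text''' text`, so the trailing space inside the bold element is kept but ends up after the closing marker.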
+1  A: 

When I write a transformation for this, the template match ignores all the text inside the <para> tag, and only the <b> and <i> elements are converted. I need help.

Update: here is what I have tried so far:

<xsl:template match="para">  
  <xsl:apply-templates select="*"/>  
</xsl:template>

Your problem is exactly in this template. Your code ignores everything except element children: the * abbreviation stands for child::element(), so the text-node children of para aren't processed at all.

Solution:

Simply remove the above template. The built-in XSLT template for element nodes will then be selected to process the para element. All it does is <xsl:apply-templates/>, which is an abbreviation for <xsl:apply-templates select="child::node()"/>. This applies templates to every child node (of any type, including text nodes). When the text-node children of para are processed, the built-in XSLT template for text nodes is selected; all it does is copy the text as is.
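
For reference, the built-in rules described above behave roughly as if the following templates were present in every stylesheet. This is a simplified sketch for the default mode only; the handling of named modes, comments, and processing instructions is left out.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <!-- Built-in rule for the document node and for elements:
         recurse into all child nodes, whatever their type -->
    <xsl:template match="*|/">
        <xsl:apply-templates/>
    </xsl:template>

    <!-- Built-in rule for text and attribute nodes:
         output the string value unchanged -->
    <xsl:template match="text()|@*">
        <xsl:value-of select="."/>
    </xsl:template>
</xsl:stylesheet>
```

Any template you write yourself, such as one matching b or i, takes precedence over these rules for the nodes it matches.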

The transformation now becomes:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

    <xsl:template match="b">
        <xsl:text>'''</xsl:text>
        <xsl:value-of select="replace(replace(.,'\s+$',''),'^\s+','')"/>
        <xsl:text>'''</xsl:text>
    </xsl:template>

    <xsl:template match="i">
        <xsl:text>''</xsl:text>
        <xsl:value-of select="replace(replace(.,'\s+$',''),'^\s+','')" />
        <xsl:text>''</xsl:text>
    </xsl:template>
</xsl:stylesheet>

When applied to the provided XML document:

<para>
   text text text
   <b>text</b> text text <i>text</i>
</para>

the result now includes the text node children of para:

   text text text
   '''text''' text text ''text''
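
If a template for para is still needed, for example to emit something before or after each paragraph, an alternative is to keep it and select all child nodes instead of only element children. A minimal sketch, which should give the same output as the stylesheet above for the sample document:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>

    <!-- node() selects text nodes as well as elements, unlike * -->
    <xsl:template match="para">
        <xsl:apply-templates select="node()"/>
    </xsl:template>

    <xsl:template match="b">
        <xsl:text>'''</xsl:text>
        <xsl:value-of select="replace(replace(.,'\s+$',''),'^\s+','')"/>
        <xsl:text>'''</xsl:text>
    </xsl:template>

    <xsl:template match="i">
        <xsl:text>''</xsl:text>
        <xsl:value-of select="replace(replace(.,'\s+$',''),'^\s+','')"/>
        <xsl:text>''</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```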
Dimitre Novatchev
thanks this works perfectly.
Prithwin
@Dimitre: +1 Good solution. But I think there would be a better pattern for those `fn:replace` calls (same as in the question, of course). Maybe `replace(.,'^\s*(.+?)\s*$','$1')`. Or, if there are no new lines to preserve, just `normalize-space(.)`
Alejandro
@Alejandro: You may be right, I didn't delve deeper into the code -- just explained and solved the *main problem* -- and the OP is happy.
Dimitre Novatchev
@Alejandro: Good observation. It might also be a good idea to check for a space before/after `<b>` or `<i>` to account for something like `<b>text </b>text` or `text<i> text</i>`. This has nothing to do with the original question, but it's still fun to think about.
DevNull
That's a great idea, guys. I used normalize-space(.) and checked for spaces before and after in the text match and in the bold and italics templates; the transformations are even more accurate now. Editing the post with the updated code might be useful for others.
Prithwin
@Prithwin: Glad you found this answer great. At SO, gratitude and appreciation are shown by accepting the answer (just click on the check mark to the left of the answer) :)
Dimitre Novatchev
I did do that earlier, but I guess I must have missed the check mark :) Doing it again.
Prithwin