I'm transforming my XSLT stylesheets into documentation, and I want a rich experience within the comment nodes for each code chunk. Therefore I want to convert the following comment and output it as XHTML:

String:

# This is a title with __bold__ text and *italic* #
This is just a normal line

- list point with some __bold__
- list point with a "link"[http://www.stackoverflow.com]

Wanted output:

<h1> This is a title with <strong>bold</strong> text and <span>italic</span> </h1>
<p>This is just a normal line</p>

<ul>
  <li>list point with some <strong>bold</strong></li>
  <li>list point with a <a href="http://www.stackoverflow.com">link</a></li>
</ul>

I tried a recursive function that applies xsl:analyze-string repeatedly from a rule set, but I can't seem to find a solution that works really well.

Has anyone done this lately, or are there frameworks out there with functions for this?

Thanks in advance! :)

Edit: Added one dirty example:

<!-- Output comments -->
<xsl:template match="comment()" mode="COMMENT">
    <xsl:copy-of select="ips:groupReplace(normalize-space(.), 
      '
      (.*)(\n|\r)(.*),
      (.*)\*(.*)\*(.*),
      (.*)\*\*(.*)\*\*(.*),
      (.*)__(.*)__(.*), 
      (.*)#(.*)#(.*),
      (.*)-(.*)
      ',
      '
      br,
      span.italic,
      span.bold,
      strong,
      h1,
      li
      ')" />
</xsl:template>

<!-- Initializing the iterateRegex function -->
<xsl:function name="ips:groupReplace">
  <xsl:param name="string" as="xs:string" />
  <xsl:param name="search" />
  <xsl:param name="replace" />
  <xsl:variable name="regex" select="tokenize($search, ',')" />
  <xsl:variable name="replacements" select="tokenize($replace, ',')" />
  <xsl:copy-of select="ips:iterateRegex(count($replacements), $string, $regex, $replacements)" />
</xsl:function>

<!-- Iterate each regex -->
<xsl:function name="ips:iterateRegex">
  <xsl:param name="counter" />
  <xsl:param name="string" />
  <xsl:param name="list_regex" />
  <xsl:param name="list_replace" />
  <xsl:variable name="newStr">
    <xsl:analyze-string select="$string" regex="{normalize-space($list_regex[$counter])}" flags="xm">
      <xsl:matching-substring>
            <xsl:variable name="cc" select="contains($list_replace[$counter], '.')" />
            <xsl:variable name="tag" select="normalize-space(if ($cc) then (substring-before($list_replace[$counter], '.')) else ($list_replace[$counter]))" />
            <xsl:copy-of select="regex-group(1)" />
            <xsl:choose>
              <xsl:when test="normalize-space(regex-group(2)) = ''">
                <xsl:element name="{$tag}" />
              </xsl:when>
              <xsl:otherwise>
                <xsl:element name="{$tag}" >
                  <xsl:if test="$cc">
                    <xsl:attribute name="class" select="substring-after($list_replace[$counter],'.')" />  
                  </xsl:if>
                  <xsl:copy-of select="regex-group(2)" />
                </xsl:element>
              </xsl:otherwise>
            </xsl:choose>
            <xsl:copy-of select="regex-group(3)" />
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:copy-of select="." />
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:variable>
  <xsl:variable name="count" select="number($counter) - 1" />
  <xsl:choose>
    <xsl:when test="$count &gt; 0">
      <xsl:copy-of select="ips:iterateRegex($count, $newStr, $list_regex, $list_replace)" />      
    </xsl:when>
    <xsl:otherwise>
      <xsl:copy-of select="$newStr" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>
+2  A: 

I think you would need a parser. So this stylesheet implements a verbose one:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="text" name="block">
        <xsl:param name="pString" select="."/>
        <xsl:if test="$pString != ''">
            <xsl:choose>
                <xsl:when test="starts-with($pString,'#')">
                    <xsl:call-template name="header">
                        <xsl:with-param name="pString"
                        select="substring($pString,2)"/>
                    </xsl:call-template>
                </xsl:when>
                <xsl:when test="starts-with($pString,'&#xA;')">
                    <xsl:call-template name="list">
                        <xsl:with-param name="pString"
                        select="substring($pString,2)"/>
                    </xsl:call-template>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:call-template name="paragraph">
                        <xsl:with-param name="pString"
                                              select="$pString"/>
                    </xsl:call-template>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:if>
    </xsl:template>
    <xsl:template name="header">
        <xsl:param name="pString"/>
        <xsl:variable name="vInside"
        select="substring-before($pString,'#&#xA;')"/>
        <xsl:choose>
            <xsl:when test="$vInside != ''">
                <h1>
                    <xsl:call-template name="inline">
                        <xsl:with-param name="pString" select="$vInside"/>
                    </xsl:call-template>
                </h1>
                <xsl:call-template name="block">
                    <xsl:with-param name="pString"
                    select="substring-after($pString,'#&#xA;')"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:call-template name="paragraph">
                    <xsl:with-param name="pString" 
                                     select="concat('#',$pString)"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    <xsl:template name="list">
        <xsl:param name="pString"/>
        <xsl:variable name="vCheckList" select="starts-with($pString,'- ')"/>
        <xsl:choose>
            <xsl:when test="$vCheckList">
                <ul>
                    <xsl:call-template name="listItem">
                        <xsl:with-param name="pString" select="$pString"/>
                    </xsl:call-template>
                </ul>
                <xsl:call-template name="block">
                    <xsl:with-param name="pString">
                        <xsl:call-template name="afterlist">
                            <xsl:with-param name="pString" select="$pString"/>
                        </xsl:call-template>
                    </xsl:with-param>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:call-template name="block">
                    <xsl:with-param name="pString" select="$pString"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    <xsl:template name="paragraph">
        <xsl:param name="pString"/>
        <xsl:choose>
            <xsl:when test="contains($pString,'&#xA;')">
                <p>
                    <xsl:value-of select="substring-before($pString,'&#xA;')"/>
                </p>
            </xsl:when>
            <xsl:otherwise>
                <p>
                    <xsl:value-of select="$pString"/>
                </p>
            </xsl:otherwise>
        </xsl:choose>
        <xsl:call-template name="block">
            <xsl:with-param name="pString"
            select="substring-after($pString,'&#xA;')"/>
        </xsl:call-template>
    </xsl:template>
    <xsl:template name="afterlist">
        <xsl:param name="pString"/>
        <xsl:choose>
            <xsl:when test="starts-with($pString,'- ')">
                <xsl:call-template name="afterlist">
                    <xsl:with-param name="pString"
                    select="substring-after($pString,'&#xA;')"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="$pString"/>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    <xsl:template name="listItem">
        <xsl:param name="pString"/>
        <xsl:if test="starts-with($pString,'- ')">
            <li>
                <xsl:call-template name="inline">
                    <xsl:with-param name="pString"
                    select="substring-before(substring($pString,3),'&#xA;')"/>
                </xsl:call-template>
            </li>
            <xsl:call-template name="listItem">
                <xsl:with-param name="pString"
                select="substring-after($pString,'&#xA;')"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>
    <xsl:template name="inline">
        <xsl:param name="pString" select="."/>
        <xsl:if test="$pString != ''">
            <xsl:choose>
                <xsl:when test="starts-with($pString,'__')">
                    <xsl:call-template name="strong">
                        <xsl:with-param name="pString"
                        select="substring($pString,3)"/>
                    </xsl:call-template>
                </xsl:when>
                <xsl:when test="starts-with($pString,'*')">
                    <xsl:call-template name="span">
                        <xsl:with-param name="pString"
                        select="substring($pString,2)"/>
                    </xsl:call-template>
                </xsl:when>
                <xsl:when test="starts-with($pString,'&quot;')">
                    <xsl:call-template name="link">
                        <xsl:with-param name="pString"
                        select="substring($pString,2)"/>
                    </xsl:call-template>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:value-of select="substring($pString,1,1)"/>
                    <xsl:call-template name="inline">
                        <xsl:with-param name="pString"
                        select="substring($pString,2)"/>
                    </xsl:call-template>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:if>
    </xsl:template>
    <xsl:template name="strong">
        <xsl:param name="pString"/>
        <xsl:variable name="vInside" select="substring-before($pString,'__')"/>
        <xsl:choose>
            <xsl:when test="$vInside != ''">
                <strong>
                    <xsl:value-of select="$vInside"/>
                </strong>
                <xsl:call-template name="inline">
                    <xsl:with-param name="pString"
                    select="substring-after($pString,'__')"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="'__'"/>
                <xsl:call-template name="inline">
                    <xsl:with-param name="pString" select="$pString"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    <xsl:template name="span">
        <xsl:param name="pString"/>
        <xsl:variable name="vInside" select="substring-before($pString,'*')"/>
        <xsl:choose>
            <xsl:when test="$vInside != ''">
                <span>
                    <xsl:value-of select="$vInside"/>
                </span>
                <xsl:call-template name="inline">
                    <xsl:with-param name="pString"
                    select="substring-after($pString,'*')"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="'*'"/>
                <xsl:call-template name="inline">
                    <xsl:with-param name="pString" select="$pString"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    <xsl:template name="link">
        <xsl:param name="pString"/>
        <xsl:variable name="vInside" 
               select="substring-before($pString,'&quot;')"/>
        <xsl:choose>
            <xsl:when test="$vInside != ''">
                <xsl:call-template name="href">
                    <xsl:with-param name="pString"
                    select="substring-after($pString,'&quot;')"/>
                    <xsl:with-param name="pInside" select="$vInside"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="'&quot;'"/>
                <xsl:call-template name="inline">
                    <xsl:with-param name="pString" select="$pString"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
    <xsl:template name="href">
        <xsl:param name="pString"/>
        <xsl:param name="pInside"/>
        <xsl:variable name="vHref"
        select="substring-before(substring($pString,2),']')"/>
        <xsl:choose>
            <xsl:when test="starts-with($pString,'[') and $vHref != ''">
                <a href="{$vHref}">
                    <xsl:value-of select="$pInside"/>
                </a>
                <xsl:call-template name="inline">
                    <xsl:with-param name="pString"
                    select="substring-after($pString,']')"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="concat('&quot;',$pInside,'&quot;')"/>
                <xsl:call-template name="inline">
                    <xsl:with-param name="pString" select="$pString"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>

With this input:

<text>
# This is a title with __bold__ text and *italic* #
This is just a normal line

- list point with some __bold__
- list point with a "link"[http://www.stackoverflow.com]
</text>

Output:

<h1> This is a title with 
    <strong>bold</strong> text and 
    <span>italic</span>
</h1>
<p>This is just a normal line</p>
<ul>
    <li>list point with some 
        <strong>bold</strong>
    </li>
    <li>list point with a 
        <a href="http://www.stackoverflow.com">link</a>
    </li>
</ul>

Note: Look how many templates are similar (they follow a pattern), so these could be parameterized. I didn't do that in this case because there seem to be more questions which need some sort of parser, so by the end of the week I will repost an answer implementing a functional parser and the parser combinators pattern, which makes it very easy to write parsers (just by writing the grammar rules).
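
As a rough illustration of that parameterization (a sketch only, not part of the stylesheet above): the `strong` and `span` templates differ only in their delimiter and in the element they emit, so they could be folded into one template. The template name `delimited` and the parameters `pDelim` and `pTag` are hypothetical names introduced here; the `inline` template is trimmed to the two delimiters for brevity.

```xslt
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" omit-xml-declaration="yes"/>

    <!-- Entry point for this sketch: treat the whole text as one paragraph -->
    <xsl:template match="text">
        <p>
            <xsl:call-template name="inline">
                <xsl:with-param name="pString" select="normalize-space(.)"/>
            </xsl:call-template>
        </p>
    </xsl:template>

    <!-- Same dispatch as the original "inline", but both branches
         call the single generic template with different parameters -->
    <xsl:template name="inline">
        <xsl:param name="pString"/>
        <xsl:if test="$pString != ''">
            <xsl:choose>
                <xsl:when test="starts-with($pString,'__')">
                    <xsl:call-template name="delimited">
                        <xsl:with-param name="pString" select="substring($pString,3)"/>
                        <xsl:with-param name="pDelim" select="'__'"/>
                        <xsl:with-param name="pTag" select="'strong'"/>
                    </xsl:call-template>
                </xsl:when>
                <xsl:when test="starts-with($pString,'*')">
                    <xsl:call-template name="delimited">
                        <xsl:with-param name="pString" select="substring($pString,2)"/>
                        <xsl:with-param name="pDelim" select="'*'"/>
                        <xsl:with-param name="pTag" select="'span'"/>
                    </xsl:call-template>
                </xsl:when>
                <xsl:otherwise>
                    <!-- Plain character: copy it and continue with the rest -->
                    <xsl:value-of select="substring($pString,1,1)"/>
                    <xsl:call-template name="inline">
                        <xsl:with-param name="pString" select="substring($pString,2)"/>
                    </xsl:call-template>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:if>
    </xsl:template>

    <!-- Hypothetical generic replacement for "strong" and "span".
         pString is the text after the opening delimiter. -->
    <xsl:template name="delimited">
        <xsl:param name="pString"/>
        <xsl:param name="pDelim"/>
        <xsl:param name="pTag"/>
        <xsl:variable name="vInside" select="substring-before($pString,$pDelim)"/>
        <xsl:choose>
            <xsl:when test="$vInside != ''">
                <xsl:element name="{$pTag}">
                    <xsl:value-of select="$vInside"/>
                </xsl:element>
                <xsl:call-template name="inline">
                    <xsl:with-param name="pString"
                                    select="substring-after($pString,$pDelim)"/>
                </xsl:call-template>
            </xsl:when>
            <xsl:otherwise>
                <!-- No closing delimiter: emit the opener as literal text -->
                <xsl:value-of select="$pDelim"/>
                <xsl:call-template name="inline">
                    <xsl:with-param name="pString" select="$pString"/>
                </xsl:call-template>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

Applied to `<text>some __bold__ and *italic* text</text>`, this should give `<p>some <strong>bold</strong> and <span>italic</span> text</p>`. The link syntax does not fit the same shape (it has two delimited parts), so it would probably stay a separate template.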

Edit: XSLT 2.0 solution. This stylesheet:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="text">
        <xsl:param name="pString" select="."/>
        <xsl:analyze-string select="$pString" 
                                        regex="(#(.*)#&#xA;)|((- (.*)&#xA;)+)">
            <xsl:matching-substring>
                <xsl:choose>
                    <xsl:when test="regex-group(1)">
                        <h1>
                            <xsl:call-template name="inline">
                                <xsl:with-param name="pString" 
                                      select="regex-group(2)"/>
                            </xsl:call-template>
                        </h1>
                    </xsl:when>
                    <xsl:when test="regex-group(3)">
                        <ul>
                            <xsl:call-template name="list">
                                <xsl:with-param name="pString" 
                                      select="regex-group(3)"/>
                            </xsl:call-template>
                        </ul>
                    </xsl:when>
                </xsl:choose>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:if test=".!='&#xA;'">
                    <p>
                        <xsl:call-template name="inline">
                            <xsl:with-param name="pString" 
                                      select="normalize-space(.)"/>
                        </xsl:call-template>
                    </p>
                </xsl:if>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>
    <xsl:template name="list">
        <xsl:param name="pString"/>
        <xsl:analyze-string select="$pString" regex="- (.*)&#xA;">
            <xsl:matching-substring>
                <li>
                    <xsl:call-template name="inline">
                        <xsl:with-param name="pString" 
                                  select="regex-group(1)"/>
                    </xsl:call-template>
                </li>
            </xsl:matching-substring>
        </xsl:analyze-string>
    </xsl:template>
    <xsl:template name="inline">
        <xsl:param name="pString" select="."/>
        <xsl:analyze-string select="$pString" 
                 regex="(__(.*)__)|(\*(.*)\*)|(&quot;(.*)&quot;\[(.*)\])">
            <xsl:matching-substring>
                <xsl:choose>
                    <xsl:when test="regex-group(1)">
                        <strong>
                            <xsl:value-of select="regex-group(2)"/>
                        </strong>
                    </xsl:when>
                    <xsl:when test="regex-group(3)">
                        <span>
                            <xsl:value-of select="regex-group(4)"/>
                        </span>
                    </xsl:when>
                    <xsl:when test="regex-group(5)">
                        <a href="{regex-group(7)}">
                            <xsl:value-of select="regex-group(6)"/>
                        </a>
                    </xsl:when>
                </xsl:choose>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>
</xsl:stylesheet>

Output:

<h1> This is a title with 
    <strong>bold</strong> text and 
    <span>italic</span>
</h1>
<p>This is just a normal line</p>
<ul>
    <li>list point with some 
        <strong>bold</strong>
    </li>
    <li>list point with a 
        <a href="http://www.stackoverflow.com">link</a>
    </li>
</ul>
Alejandro
+1 for a complex example (and a lot of work!) in XSLT 1.0 (even though the asker only needs 2.0, which would make it easier and much smaller... ;)
Abel
@Abel: Only a few things could be simplified with XSLT 2.0. Regular expressions are not the way to solve this problem as a whole, because it needs parser functionality. The big improvement would be to parameterize all the similar templates. But that would be halfway to implementing a functional parser.
Alejandro
@Alejandro: possibly. But regular expressions weren't the only improvement in XSLT 2.0... but unless I come up with a solution, I shouldn't try to compare, really.
Abel
@Alejandro, @Abel: This is a gigantic effort; however, it doesn't work with the example from my answer. Also, my solution uses just regular expressions and nothing too complex.
Dimitre Novatchev
@Alejandro: I see you edited with a new XSLT 2.0 solution; quite some work you put into this one. I wish I could +1 you again ;)
Abel
@Dimitre: If there is going to be nested inline markup, the only conceptual modification should be to replace `value-of` with a recursive call.
Alejandro
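
For readers following the thread, here is a minimal sketch of what that modification to the XSLT 2.0 `inline` template might look like; it is an illustration, not code posted by either answerer. Beyond swapping `value-of` for a recursive `call-template`, it assumes reluctant quantifiers (`.*?`), as in the other answer, so that two marked-up spans on one line are not merged into a single match. The `text` template here is a simplified stand-in that wraps everything in one paragraph.

```xslt
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" omit-xml-declaration="yes"/>

    <!-- Simplified entry point for this sketch only -->
    <xsl:template match="text">
        <p>
            <xsl:call-template name="inline">
                <xsl:with-param name="pString" select="normalize-space(.)"/>
            </xsl:call-template>
        </p>
    </xsl:template>

    <xsl:template name="inline">
        <xsl:param name="pString" select="."/>
        <!-- Assumption: reluctant quantifiers instead of the greedy .* -->
        <xsl:analyze-string select="$pString"
                 regex="(__(.*?)__)|(\*(.*?)\*)|(&quot;(.*?)&quot;\[(.*?)\])">
            <xsl:matching-substring>
                <xsl:choose>
                    <xsl:when test="regex-group(1)">
                        <strong>
                            <!-- Recursive call instead of value-of -->
                            <xsl:call-template name="inline">
                                <xsl:with-param name="pString"
                                                select="regex-group(2)"/>
                            </xsl:call-template>
                        </strong>
                    </xsl:when>
                    <xsl:when test="regex-group(3)">
                        <span>
                            <xsl:call-template name="inline">
                                <xsl:with-param name="pString"
                                                select="regex-group(4)"/>
                            </xsl:call-template>
                        </span>
                    </xsl:when>
                    <xsl:when test="regex-group(5)">
                        <a href="{regex-group(7)}">
                            <!-- Link text may itself carry inline markup -->
                            <xsl:call-template name="inline">
                                <xsl:with-param name="pString"
                                                select="regex-group(6)"/>
                            </xsl:call-template>
                        </a>
                    </xsl:when>
                </xsl:choose>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>
</xsl:stylesheet>
```

With `<text>a __*b*__ c</text>` as input, this should produce `<p>a <strong><span>b</span></strong> c</p>`. It still cannot handle the same delimiter nested inside itself, which is the limitation discussed in the comments on the other answer.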
+3  A: 

This transformation (111 lines):

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema"
 xmlns:my="my:my"
 exclude-result-prefixes="xml xsl xs my">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:variable name="vLines" select="tokenize(., '\n')"/>

  <xsl:sequence select="my:parse-lines($vLines)"/>
 </xsl:template>

 <xsl:function name="my:parse-lines" as="element()*">
  <xsl:param name="pLines" as="xs:string*"/>

  <xsl:sequence select=
       "my:parse-line($pLines, 1, count($pLines))"/>
 </xsl:function>

 <xsl:function name="my:parse-line" as="element()*">
  <xsl:param name="pLines" as="xs:string*"/>
  <xsl:param name="pLineNum" as="xs:integer"/>
  <xsl:param name="pTotalLines" as="xs:integer"/>

  <xsl:if test="not($pLineNum gt $pTotalLines)">
    <xsl:variable name="vLine" select="$pLines[$pLineNum]"/>
    <xsl:variable name="vLineLength"
         select="string-length($vLine)"/>
      <xsl:choose>
       <xsl:when test=
        "starts-with($vLine, '#')
        and
         ends-with($vLine, '#')
        ">
        <xsl:variable name="vInnerString"
         select="substring($vLine, 2, $vLineLength -2)"/>
        <h1>
         <xsl:sequence select="my:parse-string($vInnerString)"/>
        </h1>
        <xsl:sequence select=
        "my:parse-line($pLines, $pLineNum+1, $pTotalLines)"/>
       </xsl:when>
       <xsl:when test=
        "starts-with($vLine, '- ')
       and
         not(starts-with($pLines[$pLineNum -1], '- '))
        ">
        <ul>
          <li>
            <xsl:sequence select="my:parse-string(substring($vLine, 2))"/>
          </li>
          <xsl:sequence select=
           "my:parse-line($pLines, $pLineNum+1, $pTotalLines)"/>
        </ul>
       </xsl:when>
       <xsl:when test="starts-with($vLine, '- ')">
          <li>
            <xsl:sequence select="my:parse-string(substring($vLine, 2))"/>
          </li>
          <xsl:sequence select=
           "my:parse-line($pLines, $pLineNum+1, $pTotalLines)"/>
       </xsl:when>
       <xsl:otherwise>
        <p>
          <xsl:sequence select="my:parse-string($vLine)"/>
        </p>
        <xsl:sequence select=
           "my:parse-line($pLines, $pLineNum+1, $pTotalLines)"/>
       </xsl:otherwise>
      </xsl:choose>
  </xsl:if>
 </xsl:function>

 <xsl:function name="my:parse-string" as="node()*">
  <xsl:param name="pS" as="xs:string"/>

  <xsl:analyze-string select="$pS" flags="x" regex=
  '(__(.*?)__)
  |
   (\*(.*?)\*)
  |
   ("(.*?)"\[(.*?)\])

  '>
   <xsl:matching-substring>
    <xsl:choose>
     <xsl:when test="regex-group(1)">
        <strong>
          <xsl:sequence select="my:parse-string(regex-group(2))"/>
        </strong>
     </xsl:when>
     <xsl:when test="regex-group(3)">
        <span>
          <xsl:sequence select="my:parse-string(regex-group(4))"/>
        </span>
     </xsl:when>
     <xsl:when test="regex-group(5)">
      <a href="{regex-group(7)}">
       <xsl:sequence select="regex-group(6)"/>
      </a>
     </xsl:when>
    </xsl:choose>
   </xsl:matching-substring>

   <xsl:non-matching-substring>
    <xsl:value-of select="."/>
   </xsl:non-matching-substring>
  </xsl:analyze-string>
 </xsl:function>
</xsl:stylesheet>

when applied on this XML document (the provided text complicated with nested constructs and wrapped in an element):

<t># This is a title with __bold__ text and *italic* #
This is just a normal line

- list point with some __bold__
- list point with a __*"link"[http://www.stackoverflow.com]*__</t>

produces the wanted, correct output:

<h1> This is a title with <strong>bold</strong> text and <span>italic</span> 
</h1>
<p>This is just a normal line</p>
<p/>
<ul>
   <li> list point with some <strong>bold</strong>
   </li>
   <li> list point with a <strong>
         <span>
            <a href="http://www.stackoverflow.com">link</a>
         </span>
      </strong>
   </li>
</ul>

Do note: The RegEx mechanism of XPath 2.0 and XSLT 2.0 is adequate for solving this problem.

Dimitre Novatchev
+1 Nice! I knew it shouldn't be too hard to create a basic template!
Abel
Thank you again, Dimitre. I'll open-source the documentation generation on GitHub when it's done, so it's usable for other people :)
Sveisvei
@Dimitre: +1 The use of `function` instead of `template` certainly makes the stylesheet more compact. But I think that your `parse-line` and `parse-string` (as well as my own `block` and `inline`) function as grammar productions (so, parsing), which is the basic concept missing in the OP's attempt.
Alejandro
@Alejandro: I don't think this should be called "parsing". I can easily substitute the line-by-line processing with a few more RegExes. We would need a parser if it were allowed and required to have *the same tag* nest into itself with an unlimited level of depth.
Dimitre Novatchev
@Dimitre: I'm not talking about line-by-line processing, but about the relationship of `parse-line` and `parse-string` to grammar productions. Do you think you could easily modify the answer to use only one function instead?
Alejandro
@Alejandro: What determines whether you have a lexer or a parser is not the number of functions you use, but whether a finite automaton can or cannot be used. In the latter case one has to use a stack (which is potentially infinitely deep). This isn't the case with the current problem.
Dimitre Novatchev