I'm trying to transform the index of a CHM (Microsoft Compiled HTML Help) file with XSL. The index holds structure information in very arbitrary HTML lists (see the first code snippet; the actual index file's structure is a bit different, but the important parts are there). I've looked at the indexes of several CHM files, and the ul / li tag structures are never the same. Only one thing is constant: there are param tags that hold the chapter/section/whatever titles and the links to their HTML pages.

Because of this, I'm trying to rely only on the depth of the param tags to convert the list into an XML structure (primarily a DocBook structure; see the second code snippet).

    <ul>
        <li>
            <param attr="value" />
            <ul>
                <li>
                    <param attr="value" />
                    <ul>
                        <li>
                            <param attr="value" />
                        </li>
                    </ul>
                </li>
                <li>
                    <param attr="value" />
                    <ul>
                        <li>
                            <param attr="value" />
                        </li>
                        <li>
                            <param attr="value" />
                        </li>
                    </ul>
                </li>
                <li>
                    <param attr="value" />
                    <ul>
                        <li>
                            <param attr="value" />
                        </li>
                    </ul>
                </li>
            </ul>
        </li>
    </ul>

I've managed to transform some indexes (similar to the previous code snippet) to a DocBook structure, but the problem is that my XSL stylesheet is not generic enough. Does anyone have an idea how to transform a similar HTML list into a DocBook structure using only the depth information of the param tags?

So, for example, param tags with a depth of x would be transformed to a book element, param tags with a depth of x + 1 would be transformed to chapter, and so on, always properly nested, of course.

    <book>
        <title>value1</title>
        <chapter>
            <title>value2</title>
            <section>
                <title>value3</title>
            </section>
        </chapter>
        <chapter>
            <title>value4</title>
            <section>
                <title>value5</title>
            </section>
            <section>
                <title>value6</title>
            </section>
        </chapter>
        <chapter>
            <title>value7</title>
            <section>
                <title>value8</title>
            </section>
        </chapter>
    </book>
+2  A: 

If your main problem is how to produce a different element depending on the depth in the input tree, then the following stylesheet demonstrates one way of doing it: first, figuring out the nesting level by count()ing elements on the ancestor-or-self axis, and then using xsl:element to create an element with a dynamically determined name.

    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:template match="li">
        <xsl:variable name="depth" select="count(ancestor-or-self::li)"/>
        <xsl:variable name="tag">
          <xsl:choose>
            <xsl:when test="$depth = 1">book</xsl:when>
            <xsl:when test="$depth = 2">chapter</xsl:when>
            <xsl:otherwise>section</xsl:otherwise>
          </xsl:choose>
        </xsl:variable>
        <xsl:element name="{$tag}">
          <xsl:apply-templates/>
        </xsl:element>
      </xsl:template>

      <xsl:template match="param">
        <title>
          <xsl:value-of select="@attr"/>
        </title>
      </xsl:template>

    </xsl:stylesheet>
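
If you want to sanity-check the depth-to-element mapping without an XSLT processor, the same idea can be sketched in a few lines of Python. This is only an illustration, not part of the stylesheet: the `SAMPLE` input and the helper names `tag_for_depth`, `convert` and `to_docbook` are made up for this example, and it assumes the titles live in the `attr` attribute as in the question's snippet.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input, shaped like the list in the question.
SAMPLE = """
<ul>
  <li><param attr="value1"/>
    <ul>
      <li><param attr="value2"/>
        <ul><li><param attr="value3"/></li></ul>
      </li>
    </ul>
  </li>
</ul>
"""


def tag_for_depth(depth):
    """Mirror the xsl:choose: 1 -> book, 2 -> chapter, deeper -> section."""
    if depth == 1:
        return "book"
    if depth == 2:
        return "chapter"
    return "section"


def convert(node, parent, depth=0):
    """Copy the list under `parent`; `depth` is the number of enclosing li elements."""
    for child in node:
        if child.tag == "li":
            # Same value as count(ancestor-or-self::li) for this li.
            out = ET.SubElement(parent, tag_for_depth(depth + 1))
            convert(child, out, depth + 1)
        elif child.tag == "param":
            ET.SubElement(parent, "title").text = child.get("attr")
        else:
            # ul (or any other wrapper): descend without changing the depth.
            convert(child, parent, depth)


def to_docbook(source):
    """Parse the HTML-like list and return the first converted element."""
    holder = ET.Element("holder")
    convert(ET.fromstring(source.strip()), holder)
    return holder[0]


if __name__ == "__main__":
    print(ET.tostring(to_docbook(SAMPLE), encoding="unicode"))
```

Like the stylesheet, the sketch ignores how the ul wrappers are arranged and looks only at how many li elements enclose each param.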

Edit. And here's another way of doing the same thing. This uses different templates to match the source element at different depths. It may be a bit easier to read, as it eliminates the need to create a dynamically named element.

    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:template match="li[count(ancestor-or-self::li) = 1]">
        <book>
          <xsl:apply-templates/>
        </book>
      </xsl:template>

      <xsl:template match="li[count(ancestor-or-self::li) = 2]">
        <chapter>
          <xsl:apply-templates/>
        </chapter>
      </xsl:template>

      <xsl:template match="li[count(ancestor-or-self::li) &gt; 2]">
        <section>
          <xsl:apply-templates/>
        </section>
      </xsl:template>

      <xsl:template match="param">
        <title>
          <xsl:value-of select="@attr"/>
        </title>
      </xsl:template>

    </xsl:stylesheet>
Jukka Matilainen
I've just tried out your stylesheets, and both of them work like a charm in most cases. My only remaining problem is that there are some other CHM indexes which cannot be transformed by relying on depth information alone, because a single depth level holds both chapter and section information. Nonetheless, thank you very much for your help, your answer was exhaustive! You pointed out some handy XSL tricks for me as well. :)
Psycho_Dad
@Jukka Matilainen: +1 for pattern matching solution
Alejandro