I'm trying to use XSL to transform the index of a CHM (Microsoft Compiled HTML Help) file, which holds its structure information in fairly arbitrary HTML lists (see the first code snippet; the actual index file's structure is a bit different, but the important parts are there). I've checked the indexes of several CHM files, and the ul / li tag structures are never the same. Only one thing is constant: there are param tags that hold the chapter/section/whatever titles and the links to their HTML pages.
Because of this, I'm trying to rely only on the depth of those param tags to convert the list into an XML structure (primarily a DocBook structure; see the second code snippet).
<ul>
  <li>
    <param attr="value" />
    <ul>
      <li>
        <param attr="value" />
        <ul>
          <li>
            <param attr="value" />
          </li>
        </ul>
      </li>
      <li>
        <param attr="value" />
        <ul>
          <li>
            <param attr="value" />
          </li>
          <li>
            <param attr="value" />
          </li>
        </ul>
      </li>
      <li>
        <param attr="value" />
        <ul>
          <li>
            <param attr="value" />
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>
I've managed to transform some indexes (similar to the previous code snippet) into a DocBook structure, but the problem is that my XSL stylesheet is not generic enough. Does anyone have an idea how to transform an HTML list like this into a DocBook structure using only the depth information of the param tags?
So, for example, param tags with a depth of x would be transformed to a book element, params with a depth of x + 1 would be transformed to chapter elements, etc., always properly nested, of course.
<book>
  <title>value1</title>
  <chapter>
    <title>value2</title>
    <section>
      <title>value3</title>
    </section>
  </chapter>
  <chapter>
    <title>value4</title>
    <section>
      <title>value5</title>
    </section>
    <section>
      <title>value6</title>
    </section>
  </chapter>
  <chapter>
    <title>value7</title>
    <section>
      <title>value8</title>
    </section>
  </chapter>
</book>
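To make the depth rule concrete, here is a minimal sketch of the mapping I have in mind, written in Python rather than XSL purely for illustration. It rests on several assumptions: the index is already well-formed XML (real CHM index files usually are not), the title lives in an attribute called attr as in the simplified snippet, there is exactly one top-level param, and params meant for the same level always sit at the same depth. The names LEVELS, params_with_depth and to_docbook are made up for this example. The distinct param depths are ranked (shallowest becomes book, next chapter, anything deeper section), and a stack keeps the output properly nested.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from depth rank to DocBook element name;
# anything deeper than the last entry becomes a nested section.
LEVELS = ["book", "chapter", "section"]


def params_with_depth(elem, depth=0):
    """Yield (depth, param) pairs in document order, ignoring tag names
    other than param, so the ul / li layout itself does not matter."""
    for child in elem:
        if child.tag == "param":
            yield depth + 1, child
        else:
            yield from params_with_depth(child, depth + 1)


def to_docbook(root, title_attr="attr"):
    """Build a nested DocBook-like tree from param depths alone.

    title_attr is a hypothetical attribute name taken from the
    simplified snippet, not from a real CHM index.
    """
    found = list(params_with_depth(root))
    # Rank the distinct depths: shallowest -> 0, next -> 1, and so on.
    ranks = {d: i for i, d in enumerate(sorted({d for d, _ in found}))}

    stack = []  # (rank, element) for every currently open element
    top = None
    for depth, param in found:
        rank = ranks[depth]
        # Close elements at the same or a deeper level than this one.
        while stack and stack[-1][0] >= rank:
            stack.pop()
        name = LEVELS[min(rank, len(LEVELS) - 1)]
        if stack:
            node = ET.SubElement(stack[-1][1], name)
        elif top is None:
            node = top = ET.Element(name)
        else:
            raise ValueError("more than one top-level param")
        ET.SubElement(node, "title").text = param.get(title_attr, "")
        stack.append((rank, node))
    return top


if __name__ == "__main__":
    sample = ('<ul><li><param attr="value1"/><ul>'
              '<li><param attr="value2"/></li></ul></li></ul>')
    print(ET.tostring(to_docbook(ET.fromstring(sample)), encoding="unicode"))
```

What I'm looking for is essentially the same logic expressed as an XSL stylesheet, so that it does not depend on any particular arrangement of ul and li tags.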