I am using the XSLT 2.0 code below to find the IDs of the text nodes that contain the character indices I give as input. The code works correctly, but its performance is poor on huge files. Even on huge files, if the index values are small the result comes back in a few milliseconds. I am using the Saxon 9 HE Java processor to execute the XSL.
<xsl:variable name="insert-data" as="element(data)*">
<xsl:for-each-group
select="doc($insert-file)/insert-data/data"
group-by="xsd:integer(@index)">
<xsl:sort select="current-grouping-key()"/>
<data
index="{current-grouping-key()}"
text-id="{generate-id(
$main-root/descendant::text()[
sum((preceding::text(), .)/string-length(.)) ge current-grouping-key()
][1]
)}">
<xsl:copy-of select="current-group()/node()"/>
</data>
</xsl:for-each-group>
</xsl:variable>
In the above solution, if an index value is large, say 270962, the stylesheet takes 83427 ms to execute. On huge files with large index values, say 4605415 or 4605431, it takes several minutes. The computation of the variable "insert-data" seems to be where the time goes, even though it is a global variable and is computed only once. Should the XSLT be changed, or the processor? How can I improve the performance of the XSL?
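My understanding of why it is slow: the predicate sum((preceding::text(), .)/string-length(.)) is re-evaluated for every candidate text node, and each evaluation walks all preceding text nodes again, so the work grows roughly quadratically with the index value, which would match the timings above. One direction I am considering is to precompute each text node's cumulative end position in a single pass and look the indices up against that. Here is a minimal sketch, assuming XSLT 3.0 (version="3.0" on the stylesheet, supported by Saxon 9 HE from 9.8 on) and an illustrative element name text-pos; $main-root and the xsd prefix are the same as in my code above:

<xsl:variable name="text-positions" as="element(text-pos)*">
  <!-- One pass over all text nodes, carrying the running character count -->
  <xsl:iterate select="$main-root/descendant::text()">
    <xsl:param name="running" select="0" as="xsd:integer"/>
    <xsl:variable name="end" select="$running + string-length(.)"/>
    <!-- Record this text node's id and where it ends in the character stream -->
    <text-pos id="{generate-id()}" end="{$end}"/>
    <xsl:next-iteration>
      <xsl:with-param name="running" select="$end"/>
    </xsl:next-iteration>
  </xsl:iterate>
</xsl:variable>

The lookup inside the for-each-group would then become a cheap attribute comparison instead of a nested sum:

text-id="{$text-positions[xsd:integer(@end) ge current-grouping-key()][1]/@id}"

Is this the right direction, or is there a better way, ideally one that keeps the code on XSLT 2.0?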