I am very new to XSLT, and the first thing that I need to do is process a 300 MB file (and that's on the small end). The XSLT is not that complex for the moment; it just removes nodes that match certain criteria. I have two problems:
- It's too slow. It takes 50 seconds to process 500,000 records, and that's not fast enough.
- It consumes 500 MB of memory, so this will only get worse as the files get bigger.
Is there anything I can do natively in .NET to make it perform better?
I know I can look into SAX-based parsing, or STX (which is mentioned in another post), but I would prefer to stay within the .NET boundaries.
Thank you!
EDIT: Here's my XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:test="http://schemas....">
    <xsl:output omit-xml-declaration="yes"/>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="test:QueryRow[test:Columns/test:QueryColumn[test:Name='hit_count' and test:Value>200]]"/>
</xsl:stylesheet>
Here's the code I use to do the transform:
// Enable embedded script support in the stylesheet settings.
XsltSettings settings = new XsltSettings();
settings.EnableScript = true;
XslCompiledTransform compiledTransform = new XslCompiledTransform();
compiledTransform.Load("format.xslt", settings, null);
// Dispose the reader and writer so the output is flushed and both files are closed.
using (XmlReader xmlReader = XmlReader.Create("in.xml"))
using (XmlWriter xmlWriter = XmlWriter.Create("out.xml"))
{
    compiledTransform.Transform(xmlReader, xmlWriter); // this is what takes a long time
}
At the moment I am just trying to read the file in and write it back out, but the transform seems to read the whole file into memory, so I am looking for a way to read it incrementally (line by line) instead.
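For reference, here is a minimal sketch of the plain pass-through copy I have in mind, without any XSLT involved. It assumes the same in.xml and out.xml file names as above, and the class name StreamCopy is just a placeholder. As far as I understand, XmlWriter.WriteNode pulls nodes from the reader one at a time (node by node rather than line by line), so memory use should not grow with the file size, but it does not do any of the filtering yet.

```csharp
using System.Xml;

// Hypothetical stand-alone program: copies in.xml to out.xml in a streaming fashion.
class StreamCopy
{
    static void Main()
    {
        using (XmlReader reader = XmlReader.Create("in.xml"))
        using (XmlWriter writer = XmlWriter.Create("out.xml"))
        {
            // With the reader still in its initial state, WriteNode copies
            // everything up to the end of the input, one node at a time.
            // The second argument asks it to also copy default attributes.
            writer.WriteNode(reader, true);
        }
    }
}
```

This only covers the read-in, write-out step; dropping the matching test:QueryRow elements would still need extra logic on top of it.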