Hi I have an xml file with about 500 mb and i'm using LINQ with c# to query that file, but it's very slow, because it loads everything into memory. Is there anyway that i can query that file without loading all into memory?
Thanks
Hi I have an xml file with about 500 mb and i'm using LINQ with c# to query that file, but it's very slow, because it loads everything into memory. Is there anyway that i can query that file without loading all into memory?
Thanks
Hi.
No, its not possible when using Linq. Linq loads a model of the full xml into memory so you can have access using the tree structure.
If you want fast access without loading the file into memory you could use XmlReader class.
This class gives you a fast forward-only xml parser that has only the current node in memory.
Here is some help on that: http://support.microsoft.com/kb/307548
Edit: Sorry, didn't know that its possible to combine xmlreader with linq.
You can use the technique described on MSDN's page about XNode.ReadFrom
to generate an IEnumerable of XNode
s (in the example they provide, XElement
s) from an XmlReader
.
Note that when you read an XElement
from a Stream
or XmlReader
, the entire contents of that element must be read too - so you'll still need a little bit of custom logic in the IEnumerator logic to ensure that the right XElements get returned - for instance, if you return the root element, you might as well just parse the entire document right away since the root element contains almost everthing anyhow. The XNode.ReadFrom
example contains such logic too.
This article should get you up and running. Take a look at the SimpleStreamAxis
method, which is very handy for finding nodes in large XML files. I've successfully used a variant of this method on 5GB XML files without loading the file into memory.
Thanks for the help i'll see what's the best way from those put here and i'll put here the results.