tags:

views: 498

answers: 3

Hi folks,

I am currently trying to load a fairly large XML file into a DataSet. The file is about 700 MB, and every time I try to read it, it takes a long time and eventually throws an "out of memory" exception.

DataSet ds = new DataSet();
ds.ReadXml(pathtofile);

The main problem is that I have to use these DataSets (I use them to import the data from the XML file into a Sybase database: foreach table, foreach row, foreach column) and that I have no schema file.
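For reference, the import pattern described in the parentheses looks roughly like this. The tiny hand-built DataSet and the Console.WriteLine are stand-ins for the real XML load and the Sybase INSERT commands, which are not shown:

```csharp
using System;
using System.Data;

class ImportLoopSketch
{
    static void Main()
    {
        // Tiny stand-in for the DataSet that ReadXml would produce.
        DataSet ds = new DataSet();
        DataTable t = ds.Tables.Add("Item");
        t.Columns.Add("Id");
        t.Columns.Add("Name");
        t.Rows.Add("1", "first");

        // The triple foreach from the question: one INSERT per row,
        // one parameter per column. The actual Sybase connection and
        // command objects are assumed to exist and are replaced by
        // console output here.
        foreach (DataTable table in ds.Tables)
        {
            foreach (DataRow row in table.Rows)
            {
                foreach (DataColumn column in table.Columns)
                {
                    Console.WriteLine("{0}.{1} = {2}",
                        table.TableName, column.ColumnName, row[column]);
                }
            }
        }
    }
}
```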

I already googled for a while, but I only found solutions that won't work for me.

Additional information: I use a Sybase (ASA 9) database, but my C# application crashes before it ever touches the DB. The error occurs after I read the XML into the DataSet and start working with it. I have already read that this is a known problem when using DataSets with large content. I need the data in a DataSet at least once, because I need to import it into the DB.

A: 

We'll need a little more information than that, I think. What programs are you using? What database? Does C# crash, or the database? Or your browser?

The main solution would be to give the part that's throwing the out-of-memory exception (I guess that's your C# application) more memory via a startup parameter. At least that's what I would do if it were a Java program.

rbottel
I added additional information :)
A: 

You need to find a way to 'lazily' read the XML file instead of bringing it all into memory at once.

This KB article shows how to read an XML file element by element: http://support.microsoft.com/kb/307548

I would suggest taking that example and modifying it to perform your task.
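A minimal sketch of that element-by-element approach, using XmlReader (the KB article uses the older XmlTextReader, but the idea is the same). A small in-memory document stands in for the 700 MB file here; in the real program you would pass the file path to XmlReader.Create:

```csharp
using System;
using System.IO;
using System.Xml;

class ElementByElement
{
    static void Main()
    {
        // Stand-in for the large file; use XmlReader.Create(pathtofile)
        // in the real program.
        string xml = "<root><item>1</item><item>2</item></root>";
        var settings = new XmlReaderSettings { IgnoreWhitespace = true };
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                // Only the current node is held in memory, which is what
                // keeps the working set flat regardless of file size.
                if (reader.NodeType == XmlNodeType.Element)
                    Console.WriteLine("Element: " + reader.Name);
                else if (reader.NodeType == XmlNodeType.Text)
                    Console.WriteLine("Value: " + reader.Value);
            }
        }
    }
}
```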

luke
+2  A: 

You may be able to get past this by using an overload of the ReadXml method: pass in a buffered stream instead, and see if that speeds things up for you.

Here is the code:

DataSet ds = new DataSet();
using (FileStream filestream = File.OpenRead(pathtofile))
using (BufferedStream buffered = new BufferedStream(filestream))
{
    ds.ReadXml(buffered);
}

With the amount of data you are talking about, the DataSet itself may become memory constrained. Part of the problem with XML is that it can turn 500 KB of data into 500 MB simply through a poor choice of element names and nesting depth. Since you are lacking a schema, you may be able to work around the memory constraint by reading the file as above and simply replacing the element names with shorter versions (e.g. replace <Version></Version> with <V></V> for a byte reduction of more than 60%).
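A minimal sketch of that renaming pass, streaming the document through an XmlReader/XmlWriter pair so the rewrite itself never holds the whole file in memory. The Version-to-V mapping is illustrative, and attributes, comments, and CDATA are not copied in this sketch:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

class ShortenNames
{
    // Illustrative mapping; in practice, list the long, frequently
    // repeated element names from your own file.
    static readonly Dictionary<string, string> Map =
        new Dictionary<string, string> { { "Version", "V" } };

    static string Short(string name)
    {
        string s;
        return Map.TryGetValue(name, out s) ? s : name;
    }

    static void Main()
    {
        // Small in-memory input; use file-based reader/writer for real data.
        string input = "<Versions><Version>1.0</Version><Version>2.0</Version></Versions>";
        var sb = new StringWriter();
        var ws = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlReader reader = XmlReader.Create(new StringReader(input)))
        using (XmlWriter writer = XmlWriter.Create(sb, ws))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        // Write the element back out under its short name.
                        writer.WriteStartElement(Short(reader.Name));
                        if (reader.IsEmptyElement)
                            writer.WriteEndElement();
                        break;
                    case XmlNodeType.Text:
                        writer.WriteString(reader.Value);
                        break;
                    case XmlNodeType.EndElement:
                        writer.WriteEndElement();
                        break;
                }
            }
        }
        Console.WriteLine(sb.ToString());
    }
}
```

The shortened file can then be fed to ReadXml as before; the DataSet's table names will just be the short ones.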

Good luck, and I hope this helps!

Audie
Will test it this evening, but it sounds good!