I need to read and parse large spreadsheet files (20-50 MB) using the OpenXML libraries, and there doesn't seem to be a way to stream the rows one at a time for parsing.

I'm consistently getting out-of-memory exceptions: it seems that as soon as I attempt to access a row (or iterate), the contents of every row are loaded (100K+ rows).

Every call I've tried, whether Elements().Where(...) with a query or Descendants(), seems to load the entire rowset.

Is there a way to stream the rows, or to read just one row at a time?

Thanks

A: 

Do the OpenXML libraries use a DOM or a SAX model? With DOM you usually have to hold the entire document in memory at once, but with SAX you can stream the events as they come.
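
To illustrate the streaming side of that distinction, here is a minimal sketch using the .NET `XmlReader`. Note that `XmlReader` is a forward-only pull reader rather than a true SAX push parser, but it has the same memory profile: only the current node is held. The class name `StreamingDemo`, the method `CountRows`, and the element name `row` are made up for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;

static class StreamingDemo
{
    // Counts <row> elements by walking the XML node by node.
    // No document tree is built, so memory use does not grow with input size.
    public static int CountRows(TextReader input)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                // "row" is an assumed element name for this illustration.
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "row")
                {
                    count++;
                }
            }
        }
        return count;
    }
}
```

A DOM-style equivalent would load the whole input first (for example with `XDocument.Load`) and only then query it, which is where the memory cost comes from.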

darren
A: 

I found an answer. If you create an OpenXmlReader on the worksheet part, you can iterate through it and effectively lazy-load the elements you come across.

OpenXmlReader oxr = OpenXmlReader.Create(worksheetPart); 

Call oxr.Read() in a loop and look for

ElementType == typeof(SheetData) 

Then step into its child Row elements and load each row (lazily) as you reach it:

Row row = (Row)oxr.LoadCurrentElement();
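
Put together, the approach might look like the sketch below. It is an outline under assumptions, not a tested implementation: the class name `RowStreamer` and the method `StreamRows` are invented here, and it walks every worksheet part in the workbook. Rows are children of SheetData, so the loop checks for Row start elements directly and loads only the row under the cursor.

```csharp
using System;
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class RowStreamer
{
    // Yields rows one at a time. Only the row currently under the reader
    // is materialised as a Row object; the rest of the sheet is never loaded.
    public static IEnumerable<Row> StreamRows(string path)
    {
        // Open read-only (second argument: isEditable = false).
        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(path, false))
        {
            // Assumption: every worksheet in the workbook should be read.
            foreach (WorksheetPart worksheetPart in doc.WorkbookPart.WorksheetParts)
            {
                using (OpenXmlReader oxr = OpenXmlReader.Create(worksheetPart))
                {
                    while (oxr.Read())
                    {
                        // React only to the opening tag of a <row> element.
                        if (oxr.ElementType == typeof(Row) && oxr.IsStartElement)
                        {
                            yield return (Row)oxr.LoadCurrentElement();
                        }
                    }
                }
            }
        }
    }
}
```

A caller could then write `foreach (Row row in RowStreamer.StreamRows("big.xlsx")) { ... }`, where `big.xlsx` is a placeholder file name. Cells that hold shared strings store an index rather than the text itself, so reading their values still means a lookup in the workbook's shared string table.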
Craig