tags:
views: 2934
answers: 6

I'm using C# and I get a System.OutOfMemoryException after I read in 50,000 records. What is the best practice for handling such large datasets? Will paging help?

Thanks

A: 

If you're using XML, just read a few nodes at a time. If you're using some other format, just read a few lines (or whatever) at a time. Don't load the entire thing into memory before you start working on it.
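
For example, with XmlReader you can stream one element at a time instead of loading the whole document; a minimal sketch (the file name and element name are just placeholders):

    using System;
    using System.Xml;

    class StreamXmlExample
    {
        static void Main()
        {
            // XmlReader is forward-only and keeps only the current node in memory.
            using (XmlReader reader = XmlReader.Create("records.xml"))
            {
                while (reader.Read())
                {
                    // Handle each <record> element as it is encountered.
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "record")
                    {
                        Console.WriteLine("Processing record {0}", reader.GetAttribute("id"));
                    }
                }
            }
        }
    }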

Esteban Araya
A: 

Hi, I need to read all the data so it can be written to an MDB file which hasn't been created at the time the data is read in. Should I cache the data locally?

Thanks

+2  A: 

You still shouldn't read everything in at once. Read in chunks, then write the chunk out to the MDB file, then read another chunk and add that to the file. Reading in 50,000 records at once is just asking for trouble.
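
Something along these lines, assuming the records come from a data reader and the MDB file and its table already exist (the connection string, table, and column names are made up for illustration); committing a transaction every N rows keeps memory flat:

    using System;
    using System.Data;
    using System.Data.OleDb;

    class ChunkedTransferExample
    {
        const int ChunkSize = 1000;

        // 'source' is whatever reader exposes the 50,000 records.
        static void Transfer(IDataReader source)
        {
            string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=output.mdb";
            using (OleDbConnection conn = new OleDbConnection(connStr))
            {
                conn.Open();
                OleDbTransaction tx = conn.BeginTransaction();
                int rowsInChunk = 0;

                while (source.Read())
                {
                    using (OleDbCommand cmd = conn.CreateCommand())
                    {
                        cmd.Transaction = tx;
                        cmd.CommandText = "INSERT INTO Records (Id, Name) VALUES (?, ?)";
                        cmd.Parameters.AddWithValue("@id", source["Id"]);
                        cmd.Parameters.AddWithValue("@name", source["Name"]);
                        cmd.ExecuteNonQuery();
                    }

                    // Flush a chunk out to the file, then start the next one.
                    if (++rowsInChunk == ChunkSize)
                    {
                        tx.Commit();
                        tx = conn.BeginTransaction();
                        rowsInChunk = 0;
                    }
                }

                tx.Commit();
            }
        }
    }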

Matthew Scharley
+1  A: 

Obviously, you can't read all the data into memory before creating the MDB file, otherwise you wouldn't be getting an out-of-memory exception. :-)

You have two options:

- partitioning: read the data in smaller chunks using filtering
- virtualizing: split the data into pages and load only the current page (see the sketch below)

In any case, you have to create the MDB file and transfer the data after that in chunks.
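
A rough sketch of the virtualizing/paging option, assuming the source is a SQL Server table with an increasing key (connection string, table, and column names are placeholders):

    using System;
    using System.Data.SqlClient;

    class PagedReadExample
    {
        const int PageSize = 1000;

        static void ReadInPages(string sourceConnStr)
        {
            int lastId = 0;
            bool more = true;

            while (more)
            {
                more = false;
                using (SqlConnection conn = new SqlConnection(sourceConnStr))
                using (SqlCommand cmd = new SqlCommand(
                    "SELECT TOP (@pageSize) Id, Name FROM Records " +
                    "WHERE Id > @lastId ORDER BY Id", conn))
                {
                    cmd.Parameters.AddWithValue("@pageSize", PageSize);
                    cmd.Parameters.AddWithValue("@lastId", lastId);
                    conn.Open();

                    using (SqlDataReader reader = cmd.ExecuteReader())
                    {
                        while (reader.Read())
                        {
                            more = true;
                            lastId = (int)reader["Id"];
                            // Write this row to the MDB file here, then move on to
                            // the next page; only one page is ever in memory.
                        }
                    }
                }
            }
        }
    }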

Franci Penov
+3  A: 

I might recommend creating the MDB file and using a DataReader to stream the records into the MDB rather than trying to read in and cache the entire set of data locally. With a DataReader, the process is more manual, but you only get one record at a time so you won't fill up your memory.
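
A minimal sketch of that approach end to end with OleDb (connection strings, SQL, and column names are just placeholders):

    using System;
    using System.Data.OleDb;

    class DataReaderStreamExample
    {
        static void Main()
        {
            using (OleDbConnection source = new OleDbConnection("<source connection string>"))
            using (OleDbConnection target = new OleDbConnection(
                "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=output.mdb"))
            {
                source.Open();
                target.Open();

                OleDbCommand select = new OleDbCommand("SELECT Id, Name FROM Records", source);
                OleDbCommand insert = new OleDbCommand(
                    "INSERT INTO Records (Id, Name) VALUES (?, ?)", target);
                insert.Parameters.Add("@id", OleDbType.Integer);
                insert.Parameters.Add("@name", OleDbType.VarWChar, 255);

                // The reader holds only the current record, so memory use stays
                // flat no matter how many rows the source returns.
                using (OleDbDataReader reader = select.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        insert.Parameters[0].Value = reader["Id"];
                        insert.Parameters[1].Value = reader["Name"];
                        insert.ExecuteNonQuery();
                    }
                }
            }
        }
    }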

Travis Illig
A: 

I would suggest using a generator:

"...instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator."

The Wikipedia article also has a few good examples.
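
In C# the equivalent is an iterator method using yield return; a minimal sketch (the file name and record format are made up):

    using System;
    using System.Collections.Generic;
    using System.IO;

    class GeneratorExample
    {
        // Yields one record at a time instead of building a list of all of
        // them, so only the current line is ever held in memory.
        static IEnumerable<string[]> ReadRecords(string path)
        {
            using (StreamReader reader = new StreamReader(path))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    yield return line.Split(',');
                }
            }
        }

        static void Main()
        {
            // The caller can start processing the first record immediately.
            foreach (string[] record in ReadRecords("records.csv"))
            {
                Console.WriteLine(record[0]);
            }
        }
    }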

Richard Nienaber