I have a very large graph stored in a single-dimensional array (about 1.1 GB), which fits in memory on my machine running Windows XP with 2 GB of RAM and 2 GB of virtual memory. I am able to generate the entire data set in memory; however, when I try to serialize it to disk using the BinaryFormatter, the file grows to about 50 MB and then I get an out-of-memory exception. The code I am using to write it is the same code I use for all of my smaller problems:
    StateInformation[] diskReady = GenerateStateGraph();
    BinaryFormatter bf = new BinaryFormatter();
    using (Stream file = File.OpenWrite(@"C:\temp\states.dat"))
    {
        bf.Serialize(file, diskReady);
    }
The search algorithm is very lightweight, and I am able to perform searches on this graph with no problems once it is in memory.
I really have 3 questions:
Is there a more reliable way to write a large data set to disk? By "large" I mean a data set whose size approaches the amount of available memory, though I am not sure how accurate that definition is.
Should I move to a more database-centric approach?
Can anyone point me to some literature on reading portions of a large data set from a disk file in C#?
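To make the first and third questions concrete, here is a minimal sketch of the kind of approach I am asking about: writing the array one element at a time with a BinaryWriter, then seeking to read only a slice of it back. It is an illustration under assumptions, not a tested solution for the real graph. StateRecord, its three fields, and the StateFile helper are hypothetical stand-ins, since the actual layout of StateInformation is not shown here, and the sketch only works if every record has the same size on disk.

```csharp
using System;
using System.IO;

// Hypothetical fixed-size record; the real StateInformation layout is an assumption.
struct StateRecord
{
    public int Id;
    public int ParentId;
    public double Cost;

    // Bytes written per record: two ints plus one double.
    public const int SizeInBytes = 4 + 4 + 8;
}

static class StateFile
{
    // Writes records one at a time, so no second in-memory copy of the array is needed.
    public static void WriteAll(string path, StateRecord[] states)
    {
        // File.Create truncates an existing file; File.OpenWrite does not.
        using (FileStream file = File.Create(path))
        using (BinaryWriter writer = new BinaryWriter(file))
        {
            writer.Write(states.Length); // 4-byte header holding the record count
            foreach (StateRecord s in states)
            {
                writer.Write(s.Id);
                writer.Write(s.ParentId);
                writer.Write(s.Cost);
            }
        }
    }

    // Reads `count` records starting at record index `start`, skipping the rest of the file.
    public static StateRecord[] ReadRange(string path, int start, int count)
    {
        using (FileStream file = File.OpenRead(path))
        using (BinaryReader reader = new BinaryReader(file))
        {
            int total = reader.ReadInt32();
            if (start < 0 || count < 0 || start > total - count)
                throw new ArgumentOutOfRangeException("start");

            // Jump past the header and the records before `start`.
            file.Seek(sizeof(int) + (long)start * StateRecord.SizeInBytes, SeekOrigin.Begin);

            StateRecord[] result = new StateRecord[count];
            for (int i = 0; i < count; i++)
            {
                result[i].Id = reader.ReadInt32();
                result[i].ParentId = reader.ReadInt32();
                result[i].Cost = reader.ReadDouble();
            }
            return result;
        }
    }
}
```

With something like this, StateFile.ReadRange(path, 1000, 50) would load only records 1000 through 1049. Whether this fits depends on whether the real state type can be flattened to a fixed size; variable-length records would need an offset index instead.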