Hi,
I am developing a C# application that needs to process approximately 4,000,000 English sentences. All of these sentences are stored in a tree, where each node is a class with these fields:
    class TreeNode
    {
        protected string word;
        protected Dictionary<string, TreeNode> children;
    }
My problem is that the application uses up all the RAM (I have 2 GB) by the time it reaches the 2,000,000th sentence. So it only manages to process half of the sentences before it slows down drastically.
What can I do to try and reduce the memory footprint of the application?
EDIT: Let me explain my application a bit more. I have approximately 300,000 English sentences, and from each sentence I generate further sub-sentences like this:
Example sentence: Football is a very popular sport
Sub-sentences I need:
- Football is a very popular sport
- is a very popular sport
- a very popular sport
- very popular sport
- popular sport
- sport
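To make the expansion step concrete, here is a minimal sketch of how the sub-sentences could be generated. It assumes words are separated by single spaces; the `SubSentences` class and `Generate` method are names made up for illustration, not part of the actual application.

```csharp
using System;
using System.Collections.Generic;

static class SubSentences
{
    // Returns every suffix of the sentence that starts on a word boundary,
    // longest first (assumption: words are separated by spaces).
    public static List<string> Generate(string sentence)
    {
        string[] words = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var result = new List<string>(words.Length);
        for (int i = 0; i < words.Length; i++)
        {
            // Join words[i..] back into one string.
            result.Add(string.Join(" ", words, i, words.Length - i));
        }
        return result;
    }
}
```

Calling `SubSentences.Generate("Football is a very popular sport")` yields the six sub-sentences listed above, so a sentence of n words contributes n entries to the tree.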
Each sentence is stored in the tree word by word. Considering the example above, I have a TreeNode whose word field is "Football" and whose children dictionary contains the TreeNode for the word "is". The child of the "is" node is the "a" node, and the child of the "a" node is the "very" node. I need to store the sentences word by word because I need to be able to search for all sentences starting with a given prefix, for example "Football is".
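The structure described above is essentially a word-level trie. A simplified sketch of the insert and prefix lookup might look like the following. The `SentenceTree` class, the `Add` and `StartingWith` methods, and the `EndsSentence` flag are assumptions added for illustration (the real TreeNode only has the two fields shown earlier), and the fields are public here only to keep the sketch short.

```csharp
using System;
using System.Collections.Generic;

class TreeNode
{
    public string Word;
    public Dictionary<string, TreeNode> Children = new Dictionary<string, TreeNode>();
    public bool EndsSentence; // hypothetical flag: a stored sentence ends at this node
}

class SentenceTree
{
    private static readonly char[] Space = { ' ' };
    private readonly TreeNode root = new TreeNode();

    // Walks down the tree one word at a time, creating missing nodes.
    public void Add(string sentence)
    {
        TreeNode node = root;
        foreach (string w in sentence.Split(Space, StringSplitOptions.RemoveEmptyEntries))
        {
            TreeNode child;
            if (!node.Children.TryGetValue(w, out child))
            {
                child = new TreeNode { Word = w };
                node.Children[w] = child;
            }
            node = child;
        }
        node.EndsSentence = true;
    }

    // Returns every stored sentence that begins with the given words.
    public List<string> StartingWith(string prefix)
    {
        var results = new List<string>();
        var words = new List<string>();
        TreeNode node = root;
        foreach (string w in prefix.Split(Space, StringSplitOptions.RemoveEmptyEntries))
        {
            TreeNode next;
            if (!node.Children.TryGetValue(w, out next)) return results; // no match
            words.Add(w);
            node = next;
        }
        Collect(node, words, results);
        return results;
    }

    // Depth-first walk that rebuilds sentences from the path of words.
    private static void Collect(TreeNode node, List<string> words, List<string> results)
    {
        if (node.EndsSentence) results.Add(string.Join(" ", words.ToArray()));
        foreach (KeyValuePair<string, TreeNode> kv in node.Children)
        {
            words.Add(kv.Key);
            Collect(kv.Value, words, results);
            words.RemoveAt(words.Count - 1);
        }
    }
}
```

In this sketch every node allocates its own Dictionary, including leaf nodes that never get a child, which is the kind of per-node overhead I suspect adds up over millions of nodes.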
So basically, for each word in a sentence I am creating a new sub-sentence, which is why I ultimately end up with about 4,000,000 different sentences. Storing the data in a database is not an option, since the app needs to work on the whole structure at once, and constantly writing all the data to a database would slow the process down further.
Thanks