views:

923

answers:

4

I have a large number objects which are tree like structures. I have a problem that the amount of memory the application uses starts to approach >1GB which means performance on the machine drops off and that there are out of memory instructions.

i managed to solve this by using sqlite to put the objects to tables and thus effectively manage the data but this is no longer a possible solution (for reasons i wont go into here).

What suggestions do you have for managing this sort of thing ? I have no (real) databases to use so i am thinking that a solution would be to somehow mimick the sqlite procedure and use some file back storage - is there anything that exists in pure dot net already or is it a complicated roll your own situation ?

+3  A: 

Sounds like you need to investigate Memory Mapped Files. I haven't used this CodeProject article but it seems like a good starting point.

Essentially you will be using Win32 to implement a memory mapped file.

Mitch Wheat
+1  A: 

A few obvious question right of the bet: 1. Why do you need to store such large such a large tree structures all at once? 2. Where's this data coming from in the first place?

Whatever the answer you might have for these questions, chances are you don't need to store all of the data in memory all at once. Persists larger amounts of data to the disk and read only the bits you need at the time. It looks like it's the direction you were going anyway. There are many possible places in .NET where you could store data besides RDMS; Flat file, xml file, isolated storage, etc. Or maybe you could have whatever system is feeding you this data (question #2) to give it to smaller bits at a time? If you absolutely need to store large tree structure in memory. Research some tree structure algorithm, or maybe even data compression?

WebMatrix
The data comes from a unqueryable data source - its the entire source element or nothing. The whole structure is kept because the end display is dependent upon the number of results inside the tree.
simonjpascoe
A: 

SQL Server Express is free.

John Saunders
I cannot use this either :(
simonjpascoe
May I ask why not?
John Saunders
i have restrictions on the solution set by a 3rd party (my company :) )
simonjpascoe
+3  A: 

When you start storing a very large number of objects, object allocation overhead becomes a real issue. For example, with .NET running on a 32-bit system, allocating any object requires a minimum of 16 bytes. On a 64-bit system, you're talking 24 bytes minimum per object. If your individual objects are small, that allocation overhead is a huge price to pay.

You said that you have a "tree like structure." Without more detail about your application, I can't say for certain that this applies, but most tree structures require pointers to child nodes and (sometimes) pointers back to the parent nodes. As useful as trees are, they sometimes incur very large overhead. It's not uncommon for the parent and child links to require 50% or more of the total memory used by your tree.

In C#, you can alleviate the allocation overhead by using structs rather than objects, because structs have essentially no allocation overhead. The drawback, of course, is that you have to deal with the sometimes very inconvenient value type semantics.

It's possible, too, to collapse many tree structures into arrays, eliminating child and parent links and thereby saving tremendous amounts of memory. This usually comes at the cost of more complicated code and some loss in runtime efficiency.

In my work, I regularly have to keep very large collections (hundreds of millions of nodes) in memory. When you have 250 million records in memory, every four bytes in your node requires another gigabyte of RAM. Even on a 16-gigabyte machine, maintaining such data structures requires very careful thought about how memory is being used.

If you must keep the entire thing in memory, then I would suggest that you make your tree nodes structs if at all possible. You should also consider alternate ways to store your tree--ways that eliminate explicit links to parents or children. Without more more information about your particular application, I can't make more specific recommendations.

Jim Mischel
in my workflow i've added a new step to persist the data into a dynamically defined DataSet, and forced the GC - so far its working ok. thanks for tips/hints!
simonjpascoe