views: 362
answers: 6
I have 2 GB of RAM. We have an application that performs export/import operations, with a recursive function that has a local variable of type Set which gets populated on every iteration. This Set keeps growing, and at some point we run out of memory.

Is there an alternative data structure that would use memory more efficiently?

Here's the rough code:

GetObjectsForExportImpl(long lExportOptions, __int64 numIdProject, XExportSets
     &exportSets, long lClientId, CComPtr<IEPIPDServer> ptrIPDServer,FILE *fp)
{
    XExportSets exportLocal;   //That's the structure containing the Set
    QueryObjectsForExport(lExportOptions, numIdProject, exportLocal,
         lClientId, ptrIPDServer);
    SetIDs::iterator it = exportLocal.setShared.begin();

    for (; it != exportLocal.setShared.end(); ++it)
    {
         //recursive call
         pExportObject->GetObjectsForExportImpl(lExportOptions,
             numIdProject, exportSets, lClientId, ptrIPDServer,fp);
    }
}
+6  A: 

An alternative structure wouldn't help much. Say you switched to a class that uses half the memory: you're still only delaying the inevitable.

A structure that grows to 2 GB usually indicates you need to switch to a disk-based structure: a database or a memory-mapped hash table.
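For example, here is a minimal sketch of a memory-mapped scratch region using the Win32 API (an assumption on my part; I'm guessing Windows because the question uses CComPtr). The file name and size are made up:

#include <windows.h>

int main()
{
    // Back a large scratch buffer with a file instead of physical RAM.
    HANDLE hFile = CreateFileA("export_scratch.bin",
                               GENERIC_READ | GENERIC_WRITE, 0, NULL,
                               CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return 1;

    const DWORD kSize = 64 * 1024 * 1024;   // 64 MB scratch region
    HANDLE hMap = CreateFileMappingA(hFile, NULL, PAGE_READWRITE, 0, kSize, NULL);
    if (hMap == NULL) return 1;

    char *view = (char *)MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS, 0, 0, kSize);
    if (view == NULL) return 1;

    // The OS pages this region to and from disk on demand, so the working
    // set no longer has to fit in the 2 GB of physical RAM.
    view[0] = 'x';

    UnmapViewOfFile(view);
    CloseHandle(hMap);
    CloseHandle(hFile);
    return 0;
}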

ssg
Yes, I have implemented the disk-based algorithm. It did help a little, but at some point the memory still runs out.
sameer karjatkar
How can memory run out when the data isn't in it in the first place?
ssg
Sorry for the confusion. I need to keep some reference data in memory.
sameer karjatkar
Everything can be stored on the disk, including your references.
ssg
+1  A: 

Process the data piece by piece rather than all at once.

That is:

while (not end-of-file) {
   data = read_some_information();
   export_some_information(data);
}

Unless you are in the very specific case where you need all the data at once to be able to do the export (which is highly unlikely).
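A minimal C++ sketch of that loop, assuming each record can be exported independently (the record layout and both helpers are hypothetical stand-ins for the real read/export routines):

#include <cstdio>

struct Record { long long id; };    // hypothetical record layout

// Hypothetical stand-ins for the real read/export routines.
static bool ReadNextRecord(FILE *in, Record &rec)
{
    return std::fread(&rec, sizeof rec, 1, in) == 1;
}

static void ExportRecord(const Record &rec, FILE *out)
{
    std::fprintf(out, "%lld\n", rec.id);
}

int main()
{
    FILE *in  = std::fopen("objects.bin", "rb");
    FILE *out = std::fopen("export.txt", "w");
    if (in == NULL || out == NULL) return 1;

    Record rec;
    while (ReadNextRecord(in, rec))  // only one record in memory at a time
        ExportRecord(rec, out);

    std::fclose(in);
    std::fclose(out);
    return 0;
}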

Julian Aubourg
That's the problem. The elements in the Set are interdependent, so I can't clear them out.
sameer karjatkar
Are the dependencies isolated, or may any element really depend on any other element?
sharptooth
What's really important: do you need all the data of the previous elements, or just some sort of accumulated value (like a sum or a mean)? If you actually need to query data from every previous set, then by all means fill a database with them, then query it to create the export.
Julian Aubourg
A: 

If your recursion is eating up 2GB of memory, you're probably going to have issues with any type of in-memory data structure. Can you post some of your code so we can get a better idea of what you're doing?

One idea might be to use a database table to store the Set as you're building it. You could insert a new record for each recursion, and the record could have a FK reference to the parent record in the table. That way you can walk up and down the recursion by following the parent references in the records.
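For example, a rough sketch with SQLite (my choice; the question doesn't name a database), where the table and column names are made up:

#include <sqlite3.h>

int main()
{
    sqlite3 *db = NULL;
    if (sqlite3_open("export_objects.db", &db) != SQLITE_OK) return 1;

    // One row per visited object; parent_id points at the row whose
    // recursion discovered it (NULL for the root).
    sqlite3_exec(db,
        "CREATE TABLE IF NOT EXISTS export_object ("
        "  id        INTEGER PRIMARY KEY,"
        "  parent_id INTEGER REFERENCES export_object(id),"
        "  object_id INTEGER NOT NULL)",
        NULL, NULL, NULL);

    sqlite3_exec(db,
        "INSERT INTO export_object (parent_id, object_id) VALUES (NULL, 42)",
        NULL, NULL, NULL);

    // Walking 'up' the recursion is then: SELECT parent_id FROM
    // export_object WHERE id = ?, repeated until parent_id is NULL.
    sqlite3_close(db);
    return 0;
}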

Andy White
+1  A: 

For a moment, compare your original method call:

GetObjectsForExportImpl(
   long                  lExportOptions, 
   __int64               numIdProject, 
   XExportSets           &exportSets, 
   long                  lClientId, 
   CComPtr<IEPIPDServer> ptrIPDServer,
   FILE                  *fp
   )

to your subsequent recursive call:

hr = pExportObject->GetObjectsForExportImpl(
                         lExportOptions,
                         numIdProject,
                         exportSets,
                         lClientId,
                         ptrIPDServer,
                         fp);

Unless something magic is happening between those, you are simply re-calling the method with its own set of arguments. I suspect that you meant to put in 'exportLocal' instead of 'exportSets' there, because otherwise, what was the point of the exportLocal.setShared.begin()? You'll just keep recreating a new exportLocal, telling it to begin, recursing, etc.

In short, I think the problem is a coding mistake, not a problem with the recursion.

As a side note: can you think of a way to make it a loop rather than a recursion? Loops are almost always faster, simpler, easier to understand and quicker to debug.
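For what it's worth, here is a minimal sketch of such a loop using an explicit work list, assuming the goal is to visit each shared ID exactly once (QueryShared and its ID arithmetic are dummy stand-ins for the real query):

#include <set>
#include <stack>
#include <cstdio>

typedef long long IdType;           // stands in for __int64
typedef std::set<IdType> SetIDs;    // assumed from the question's code

// Dummy stand-in for the real query: fills 'out' with IDs shared by 'id'.
static void QueryShared(IdType id, SetIDs &out)
{
    if (id > 1) { out.insert(id / 2); out.insert(id / 3); }
}

static void CollectForExport(IdType rootId, SetIDs &exportSet)
{
    std::stack<IdType> work;
    work.push(rootId);

    while (!work.empty()) {
        IdType id = work.top();
        work.pop();

        // insert() reports whether the ID was new; already-seen IDs are
        // skipped, so the loop terminates even with interdependent sets.
        if (!exportSet.insert(id).second)
            continue;

        SetIDs shared;
        QueryShared(id, shared);
        for (SetIDs::iterator it = shared.begin(); it != shared.end(); ++it)
            work.push(*it);
    }
}

int main()
{
    SetIDs exportSet;
    CollectForExport(1000, exportSet);
    std::printf("%u objects collected\n", (unsigned)exportSet.size());
    return 0;
}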

Phil H
Actually, we are copying the local export sets, i.e. exportLocal, to exportSets. I missed one statement inside the for loop: exportSets.setShared.insert(numId); Yes, I need to think of an iterative way.
sameer karjatkar
Would it be worth putting that line into the question? That way we can see the relationship between exportLocal and exportSets.
Phil H
A: 

Sorry - I don't really know C++, but I can possibly help a bit. If you can use subscript notation to access elements, and you have pointers to the parent elements, you can use a stack to do a depth-first traversal and avoid the not-insignificant cost of recursion. Here's some C# code that you should be able to translate:

Stack<int> childIndex = new Stack<int>();
childIndex.Push(0);

TreeNode<Folder> workingFolder = GetNodeById(folderId);
TreeNode<Folder> returnFolder = ShallowClone(workingFolder);

while (childIndex.Count > 0) {
    int idx = childIndex.Peek();
    if (idx < workingFolder.Children.Count) {
        visit(workingFolder.Children[idx]);

        // increment the child index for this level
        childIndex.Pop();
        childIndex.Push(idx + 1);

        // descend: replace the current working folder and push a new index
        // (the original snippet assigned an undefined 'f' here; this assumes
        // it was meant to be a shallow clone of the new working folder)
        workingFolder = workingFolder.Children[idx];
        TreeNode<Folder> f = ShallowClone(workingFolder);
        returnFolder.Children.Add(f);   // assumes Add wires up f.Parent
        returnFolder = f;
        childIndex.Push(0);
    } else {
        // no (more) children: climb back up one level
        workingFolder = (workingFolder.Parent == null ? workingFolder : workingFolder.Parent);
        returnFolder = (returnFolder.Parent == null ? returnFolder : returnFolder.Parent);
        childIndex.Pop();
    }
}
Travis
A: 

I don't think the recursion has much to do with your problem. The problem is merely that the data you're creating is big. Moving to an iterative solution won't help with that. As has been pointed out already, write output as you traverse, rather than storing it up in an in-memory structure.

You can, however, minimise what goes on the stack by passing pointers in your recursive calls rather than whole structs.
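For instance (a toy sketch; XExportSets is reduced to an assumed shape here):

#include <set>
#include <cstdio>

// Stand-in for the real structure, reduced to its Set member.
struct XExportSets {
    std::set<long long> setShared;
};

// 'sets' is a reference, so each recursive frame adds only a pointer-sized
// alias to the stack; passing by value would copy the whole set per call.
static void Walk(const XExportSets &sets, int depth)
{
    if (depth == 0) return;
    Walk(sets, depth - 1);
}

int main()
{
    XExportSets sets;
    sets.setShared.insert(1);
    Walk(sets, 1000);
    std::printf("done\n");
    return 0;
}

Note that in the question's code exportSets is already a reference, but ptrIPDServer is a CComPtr passed by value, which copies the smart pointer (with an AddRef/Release) on every call; a const reference would avoid that.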

slim
With the disk-based approach I was able to export to a file of up to 13 GB, but it fails beyond that limit.
sameer karjatkar