I have a large 2GB file with 1.5 million listings to process. I am running a console app that performs some string manipulation then uploads each listing to the database.

  1. I created a LINQ object and clear the object by assigning it to a new LinqObject() for each listing (loop).

  2. When the object is complete, I add it to a list.

  3. When the list reaches 100 objects, I submitAll on the entire list, clear the list, then repeat.

My memory usage continues to grow as the program runs. Is there anything I should be doing to keep memory usage down? I tried GC.Collect(). I think I want to use Dispose...

Thanks in advance for looking.

A: 

Do you need your memory usage to stay low? Absent an actual functional problem, high memory usage in and of itself is not an issue.

jasonh
A: 

How large is the memory usage growing? It may be that .NET is just "settling" effectively.

It's not really clear exactly how you're doing this, but the general principle sounds okay. I suggest you take the database work out of the equation - just comment out whichever line would actually submit to the database. See how much memory that uses. Other than the StreamReader (or whatever) you shouldn't have anything else that needs disposing if you're not touching the database - just building batches of transformed objects and throwing them away.
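A minimal sketch of that experiment, assuming the file holds one listing per line. `Listing`, `Transform`, and `MemoryExperiment` are hypothetical stand-ins for the asker's code, and the database submit is only marked by a comment:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for the LINQ entity from the question.
class Listing
{
    public string Text { get; set; }
}

static class MemoryExperiment
{
    // Hypothetical string manipulation; replace with the real transformation.
    static Listing Transform(string line)
    {
        return new Listing { Text = line.Trim().ToUpperInvariant() };
    }

    // Reads listings, batches them, and discards each full batch
    // without touching the database. Returns the number processed.
    public static int Run(TextReader reader, int batchSize)
    {
        var batch = new List<Listing>(batchSize);
        int processed = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            batch.Add(Transform(line));
            processed++;
            if (batch.Count == batchSize)
            {
                // The database submit would go here; it is deliberately left out.
                batch = new List<Listing>(batchSize); // drop the old batch entirely

                if (processed % 100000 == 0)
                {
                    // Diagnostic only: report the managed heap size
                    // without forcing a collection.
                    Console.WriteLine("{0} listings, {1:N0} bytes",
                        processed, GC.GetTotalMemory(false));
                }
            }
        }
        return processed;
    }
}
```

It could be called as `using (var reader = new StreamReader(path)) { MemoryExperiment.Run(reader, 100); }`. If the reported figure stays roughly flat here but climbs once the submit is restored, the growth is coming from the database side rather than from the batching loop.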

Jon Skeet
Yes, I am using StreamReader. But am I actually throwing the objects away? I call .Clear() on the list after I submit. I declare the LINQ object once and then just overwrite it for each listing by assigning new SomeLinqObject();
Bryan
Thanks for the suggestion. Memory allocation grows to 1GB after 500k listings without submitting to the database.
Bryan
I would actually create a new list rather than clearing the old one, but it should be okay as it is. I'm surprised about the memory allocation then... could you post some sample code?
Jon Skeet
+2  A: 

It's normal for a program's memory usage to increase while it's working. You should not force the garbage collector to reduce memory usage in an attempt to save resources; this will most likely waste resources instead.

Contrary to one's first reaction, high memory usage is not a performance problem as long as there is any free memory left at all. Having a lot of unused memory doesn't improve performance at all. If you try to reduce memory usage just to keep it down, you are wasting CPU time on cleanup that is not needed.

If you are running out of free memory or if some other application needs it, the garbage collector will do the appropriate cleanup. In almost every situation the garbage collector will know much more about the current memory situation than you can possibly anticipate when writing the code.

If you are using objects that implement the IDisposable interface, you should call the Dispose method to free unmanaged resources, but all other objects are handled by the garbage collector. Managed objects normally don't leak memory at all.
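As an illustration, a `using` block is the usual way to make sure Dispose gets called. This is only a sketch: `DisposeExample` and `CountLines` are made-up names, and StreamReader is used because it is the disposable object already mentioned in this thread:

```csharp
using System;
using System.IO;

static class DisposeExample
{
    // Counts the lines in a file. The using block calls reader.Dispose()
    // when the block exits, even if an exception is thrown inside it.
    public static int CountLines(string path)
    {
        int count = 0;
        using (var reader = new StreamReader(path))
        {
            while (reader.ReadLine() != null)
            {
                count++;
            }
        }
        return count;
    }
}
```

The same pattern applies to any other IDisposable object the program creates, which in a LINQ to SQL program would include the DataContext.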

Guffa