I am trying to isolate the source of a "memory leak" in my C# application. This application copies a large number of potentially large files into records in a database, using the image column type in SQL Server. I am using LINQ to SQL and its associated objects for all database access.

The main loop iterates over a list of files and inserts. After removing much boilerplate and error handling, it looks like this:

foreach (Document doc in ImportDocs) {
    using (var dc = new DocumentClassesDataContext(connection)) {
        byte[] contents = File.ReadAllBytes(doc.FileName);

        DocumentSubmission submission = new DocumentSubmission() {
            Content = contents,
            // other fields
        };

        dc.DocumentSubmissions.InsertOnSubmit(submission);  // (A)
        dc.SubmitChanges();                                 // (B)
    }
}

Running this program over the entire input results in an eventual OutOfMemoryException. CLR Profiler reveals that 99% of the heap consists of large byte[] objects corresponding to the sizes of the files.

If I comment out both lines A and B, the leak goes away. If I uncomment only line A, the leak comes back. I don't understand how this is possible, as dc is disposed on every iteration of the loop.

Has anyone encountered this before? I suspect directly calling stored procedures or doing inserts will avoid this leak, but I'd like to understand this before trying something else. What is going on?
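For reference, the "doing inserts directly" alternative mentioned above could look roughly like the sketch below, using plain ADO.NET. The table and column names are hypothetical (the real schema is not shown in this question), and this only illustrates the idea rather than a tested replacement for the loop.

```csharp
using System.Data;
using System.Data.SqlClient;

static class DirectInsert
{
    // Builds a parameterized INSERT for one file's contents.
    // "DocumentSubmission" and "Content" are assumed names, not taken from a real schema.
    public static SqlCommand BuildInsert(SqlConnection connection, byte[] contents)
    {
        var cmd = new SqlCommand(
            "INSERT INTO DocumentSubmission (Content) VALUES (@content)", connection);

        // SqlDbType.Image matches the image column type described in the question.
        cmd.Parameters.Add("@content", SqlDbType.Image, contents.Length).Value = contents;
        return cmd;
    }
}
```

With an open connection, each iteration would build the command inside a using block and call ExecuteNonQuery() on it, so no entity object is ever created for the file.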

Update

Including GC.Collect(); after line (B) appears to make no significant change to any case. This does not surprise me much, as CLR Profiler was showing a good number of GC events without explicitly inducing them.
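A minimal sketch of why an explicit collection would not help if something still references the arrays: the garbage collector never reclaims a reachable object, no matter how often it runs. The Retained list here is a hypothetical stand-in for whatever is holding on to the buffers in the real program.

```csharp
using System;
using System.Collections.Generic;

static class GcReachabilityDemo
{
    // Hypothetical long-lived collection standing in for the unknown holder of the buffers.
    static readonly List<byte[]> Retained = new List<byte[]>();

    public static bool IsAliveAfterCollect()
    {
        var buffer = new byte[1000000];   // large array, similar in spirit to a file's contents
        Retained.Add(buffer);             // a strong reference survives in the static list

        var weak = new WeakReference(buffer);
        buffer = null;                    // drop the local reference

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // Still alive: the array is reachable through Retained, so it cannot be collected.
        return weak.IsAlive;
    }
}
```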

A: 

Which operating system are you running this on? Your problem may not be related to Linq2Sql, but to how the operating system manages large memory allocations. For instance, Windows Server 2008 is much better at managing large objects in memory than XP. I have had cases where code working with large files leaked on XP but ran fine on Windows Server 2008.

HTH

unclepaul84
This is occurring on XP Pro.
recursive
A: 

I don't entirely understand why, but making a copy of the iteration variable inside the loop fixed it. As near as I can tell, LinqToSql was somehow making a copy of the DocumentSubmission inside each Document.

foreach (Document doc in ImportDocs) {
    // make copy of doc that lives inside loop scope
    Document copydoc = new Document() {
        field1 = doc.field1,
        field2 = doc.field2,
        // complete copy
    };

    using (var dc = new DocumentClassesDataContext(connection)) {
        byte[] contents = File.ReadAllBytes(copydoc.FileName);

        DocumentSubmission submission = new DocumentSubmission() {
            Content = contents,
            // other fields
        };

        dc.DocumentSubmissions.InsertOnSubmit(submission);  // (A)
        dc.SubmitChanges();                                 // (B)
    }
}
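One plausible mechanism, which is an assumption and not something confirmed in this thread: if DocumentSubmission has an association to Document, setting that association on the child can also add the child to a collection on the parent. Since the Document objects live in ImportDocs for the whole run, every submission and its byte[] would stay reachable. The sketch below uses hypothetical plain classes (ParentDoc, ChildSubmission), not the real generated entities, to show the effect.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the generated entity classes; not the real LINQ to SQL types.
class ParentDoc
{
    public string FileName;
    public readonly List<ChildSubmission> Submissions = new List<ChildSubmission>();
}

class ChildSubmission
{
    public byte[] Content;
    private ParentDoc _doc;

    public ParentDoc Doc
    {
        get { return _doc; }
        set
        {
            _doc = value;
            // Assumed behavior: assigning the parent also registers the child on the parent.
            if (value != null) value.Submissions.Add(this);
        }
    }
}

static class RetentionDemo
{
    // Returns how many content bytes are still reachable through the long-lived list.
    public static long RetainedBytes(List<ParentDoc> importDocs)
    {
        foreach (ParentDoc doc in importDocs)
        {
            // The new object is never stored in a local, yet it stays reachable via doc.Submissions.
            new ChildSubmission { Content = new byte[1024], Doc = doc };
        }

        long total = 0;
        foreach (ParentDoc doc in importDocs)
            foreach (ChildSubmission s in doc.Submissions)
                total += s.Content.Length;
        return total;
    }
}
```

Under that assumption, copying doc into a fresh object per iteration works because the copy, and anything attached to it, becomes unreachable once the iteration ends.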
recursive