views:

90

answers:

4

While testing application performance, I came across some pretty strange GC behavior. In short, the GC runs even on an empty program without runtime allocations!

The following application demonstrates the issue:

using System;
using System.Collections.Generic;

public class Program
{
    // Preallocate strings to avoid runtime allocations.
    static readonly List<string> Integers = new List<string>();
    static int StartingCollections0, StartingCollections1, StartingCollections2;

    static Program()
    {
        for (int i = 0; i < 1000000; i++)
            Integers.Add(i.ToString());
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
    }

    static void Main(string[] args)
    {
        DateTime start = DateTime.Now;
        int i = 0;
        Console.WriteLine("Test 1");
        StartingCollections0 = GC.CollectionCount(0);
        StartingCollections1 = GC.CollectionCount(1);
        StartingCollections2 = GC.CollectionCount(2);
        while (true)
        {
            if (++i >= Integers.Count)
            {
                Console.WriteLine();
                break;
            }

            // 1st test - no collections!
            {
                if (i % 50000 == 0)
                {
                    PrintCollections();
                    Console.Write(" - ");
                    Console.WriteLine(Integers[i]);
                    //System.Threading.Thread.Sleep(100);
                    // or a busy wait (run in debug mode)
                    for (int j = 0; j < 50000000; j++)
                    { }
                }
            }
        }

        i = 0;
        Console.WriteLine("Test 2");
        StartingCollections0 = GC.CollectionCount(0);
        StartingCollections1 = GC.CollectionCount(1);
        StartingCollections2 = GC.CollectionCount(2);
        while (true)
        {
            if (++i >= Integers.Count)
            {
                Console.WriteLine("Press any key to continue...");
                Console.ReadKey(true);
                return;
            }

            DateTime now = DateTime.Now;
            TimeSpan span = now.Subtract(start);
            double seconds = span.TotalSeconds;

            // 2nd test - several collections
            if (seconds >= 0.1)
            {
                PrintCollections();
                Console.Write(" - ");
                Console.WriteLine(Integers[i]);
                start = now;
            }
        }
    }

    static void PrintCollections()
    {
        Console.Write(Integers[GC.CollectionCount(0) - StartingCollections0]);
        Console.Write("|");
        Console.Write(Integers[GC.CollectionCount(1) - StartingCollections1]);
        Console.Write("|");
        Console.Write(Integers[GC.CollectionCount(2) - StartingCollections2]);
    }
}

Can someone explain what is going on here? I was under the impression that the GC won't run unless memory pressure hits specific limits. However, it seems to run (and collect) all the time - is this normal?

Edit: I have modified the program to avoid all runtime allocations.

Edit 2: Ok, new iteration and it seems that DateTime is the culprit. One of the DateTime methods allocates memory (probably Subtract), which causes the GC to run. The first test now causes absolutely no collections - as expected - while the second causes several.

In short, the GC only runs when it needs to run - I was just generating memory pressure unwittingly (DateTime is a struct and I thought it wouldn't generate garbage).

+4  A: 

GC.CollectionCount(0) returns the following:

The number of times garbage collection has occurred for the specified generation since the process was started.

Therefore you should see an increase in the numbers and that increase doesn't mean that memory is leaking but that the GC has run.

Also in the first case you can see this increase. It simply will happen much slower because the very slow Console.WriteLine method is called much more often, slowing things down a lot.

stapeluberlauf
That was my initial surprise: why would the GC run without a memory leak? Seems DateTime was leaking memory after all, which triggered the GC collections.
The Fiddler
+1  A: 

If you just wait a few seconds, you see that the collection count also increases in the first test, but not as fast.

The differences between the codes is that the first test writes out the collection count all the time, as fast as it can, while the second test loops without writing anything out until the time limit is reached.

The first test spends most of the time waiting for text being written to the console, while the second test spends most of the time looping, waiting for the time limit. The second test will do a lot more iterations during the same time.

I counted the iterations, and printed out the number of iterations per garbage collection. On my computer the first test stabilises around 45000 iterations per GC, while the second test stabilises around 130000 iterations per GC.

So, the first test actually does more garbage collections than the second test, about three times as many.

Guffa
Thanks, this is the answer that lead me to the correct direction: why would the first case behave like this? The answer is that it calls DateTime.Subtract around three times more often than the second case - hence thrice the memory leak!
The Fiddler
@The Fiddler: It's actually not memory leaks, it's just memory throughput. Managed objects doesn't leak memory.
Guffa
Where "memory leak" read "unexpected memory allocation". It is not a memory leak in the usual sense of the word (since the memory is reclaimed) but the term makes sense for code that must not cause GCs when run (e.g. a game were a gen-0 collection may cause dropped frames). The question here was referring to such code.
The Fiddler
+3  A: 

Another thing that should be noted here is that GC.Collect() is not a synchronous function call. It triggers a garbage collection, but that garbage collection occurs on a background thread, and theoretically may not have finished running by the time you get around to checking your GC statistics.

There is a GC.WaitForPendingFinalizers call which you can make after GC.Collect to block until the garbage collection occurs.

If you really want to attempt to accurately track GC statistics in different situations, I would instead utilize the Windows Performance Monitor on your process, where you can create monitors on all sorts of things, including .NET Heap statistics.

Nick
The second edit uses GC.WaitForPendingFinalizers().
The Fiddler
A: 

Thanks everyone! Your suggestions helped reveal the culprit: DateTime is allocating heap memory.

The GC does not run all the time but only when memory is allocated. If memory usage is flat, the GC will never run and GC.CollectionCount(0) will always return 0, as expected.

The latest iteration of the test showcases this behavior. The first test run does not allocate any heap memory (GC.CollectionCount(0) remains 0), while the second allocates memory in a non-obvious fashion: through DateTime.Subtract() -> Timespan.

Now, both DateTime and Timespan are value types, which is why I found this behavior surprising. Still, there you have it: there was a memory leak after all.

The Fiddler