views:

147

answers:

2

I recently started using the ANTS profiling tools for production work. Aside from being amazed by their awesomeness, I couldn't help but wonder how they work. For example, one of the most useful features lets you visualize the global roots of a running program complete with the number of references to values of different types.

How does this tool get hold of that information?

+10  A: 

(Full disclosure: I'm on the Visual Studio Profiler team, but the below information is public)

You can do this by writing a CLR profiler that runs inside the process you're targeting. CLR profilers are C++ COM objects that get instantiated by the runtime when the COR_PROFILER and COR_PROFILING_ENABLED environment variables are set (see here). There are two main CLR profiling interfaces, specifically, ICorProfilerCallback and ICorProfilerInfo. ICorProfilerCallback is what the CLR uses to notify you about specific events that you subscribe to (module loads, function JIT compliation, thread creation, GC events), while ICorProfilerInfo can be used by your profiler to obtain additional information about threads, modules, types, methods, and metadata for the loaded assemblies. This interface is what you could use to obtain symbol information about the types allocated.

With your profiler in-process, you can force a GC through ICorProfilerInfo::ForceGC. After the GC completes, your profiler will get notified via ICorProfilerCallback2::GarbageCollectionFinished, and you will get the root references via ICorProfilerCallback2::RootReferences2. When you combine the root reference information with ICorProfilerCallback::ObjectReferences, you can get the complete object reference graph for your .NET application.

You can get more realtime information by using the ICorProfilerCallback::ObjectAllocated callback to determine when individual CLR objects get created. This can be expensive, though, since you're incurring at least an additional function call for each allocated object. You can track individual objects by mapping the CLR-assigned ObjectID to your own internal ID. An ObjectID for a given object is an ephemeral pointer since it can change as garbage collections happen, which can cause objects to move during compaction. This process is illustrated here. You can use the information from ICorProfilerCallback::MovedReferences to track moving objects.

In order to activate the callbacks mentioned above, you need to tell the CLR profiling API that you're interested in them. You can do this by specifying COR_PRF_MONITOR_GC and COR_PRF_MONITOR_OBJECT_ALLOCATED as part of your event flags when calling ICorProfilingInfo::SetEventMask.

David Broman is the developer on the CLR profiler, and his blog has tons of great information on profiling in general, including all the crazy pitfalls and issues you might run into.

Chris Schmich
+1  A: 

Profilers like ANTS use an "profiling API" presented by the CLR itself, that quite simply can tell you what goes on inside the CLR. For instance there is an API callback-method that occur when an object is allocated, aptly named ObjectAllocated(). Likewise there are events for when methods are entered, when threads are created, etc etc.

The original profiling API is called ICorProfilerCallback. Later versions are called CoreProfilerCallback2 and CoreProfilerCallback3. If you google those names you'll find exactly the answers you're looking for. On codeproject you can see a practical example: Creating a Custom .NET Profiler

A final note: The API cannot be used from managed code like C# and VB.NET. It's only available from unmanaged code like e.g. C or C++. So a C# app cannot use this API to examine its own behavior and objects, for instance.

Richard Flamsholt