ansaurus

Question

Is Serialization reliable for object size estimation?

Answer 1

The serialized form of data is not the same as its in-memory form. For example, a collection or dictionary involves multiple objects in memory (the items, the backing arrays, hash buckets/indexes, etc.), whereas the serialized output will typically contain just the raw data, so you might see less volume when serialized.

Equally, things like BinaryFormatter have to include a lot of (verbose) type metadata, whereas an in-memory object just carries a (terse) type handle in its object header, so you might see more data in the serialized form. Likewise, the serializer (unless it is manually optimized) needs to tokenize the individual fields, but in memory this is implicit in each field's offset from the object's address.

So you might get a number from serialization, but it is not the same number as the in-memory size.
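As a rough illustration of that gap, here is a minimal sketch that builds a `Dictionary<int, int>`, writes only its keys and values with `BinaryWriter` (a hand-rolled stand-in for a serializer, not BinaryFormatter), and compares that byte count with the managed-heap growth reported by `GC.GetTotalMemory`. The class and method names are invented for this example, and the heap figure is only an approximation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SerializedVsInMemory
{
    // Writes only the raw key/value pairs, with no type metadata.
    public static long RawSerializedBytes(Dictionary<int, int> map)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            foreach (KeyValuePair<int, int> pair in map)
            {
                writer.Write(pair.Key);   // 4 bytes
                writer.Write(pair.Value); // 4 bytes
            }
            writer.Flush();
            return stream.Length;
        }
    }

    // Approximates the heap cost as the growth in GC-reported memory
    // while the dictionary is being filled.
    public static long ApproxHeapBytes(int count, out Dictionary<int, int> map)
    {
        long before = GC.GetTotalMemory(true);
        map = new Dictionary<int, int>();
        for (int i = 0; i < count; i++)
            map[i] = i;
        long after = GC.GetTotalMemory(true);
        return after - before;
    }

    public static void Main()
    {
        Dictionary<int, int> map;
        long heapBytes = ApproxHeapBytes(100000, out map);
        long rawBytes = RawSerializedBytes(map);
        Console.WriteLine("Approx. heap bytes:   " + heapBytes);
        Console.WriteLine("Raw serialized bytes: " + rawBytes);
        GC.KeepAlive(map); // keep the dictionary reachable until we are done
    }
}
```

The raw output is exactly 8 bytes per entry, while the heap figure also covers the dictionary's internal arrays and per-object overhead, so the two numbers should not be expected to match.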

Getting an accurate idea of the size of an object graph is tricky. SOS (the .NET debugger extension) might help; otherwise, create a whole shed-load of them, measure how much memory usage grew, and divide by the count. Crude, but it might just work.
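The "create a lot and divide" idea could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: `SizeEstimator` and `EstimateBytesPerInstance` are hypothetical names, and `GC.GetTotalMemory(true)` only approximates the managed heap size, so the result is an average that can be skewed by other allocations in the process.

```csharp
using System;

public static class SizeEstimator
{
    // Estimates the average managed-heap cost per instance by allocating
    // 'count' instances and dividing the heap growth by 'count'.
    public static long EstimateBytesPerInstance(Func<object> factory, int count)
    {
        // Allocate the holder array first so that it is not counted.
        object[] keep = new object[count];
        factory(); // warm-up: triggers JIT and any static initialization

        long before = GC.GetTotalMemory(true);
        for (int i = 0; i < count; i++)
            keep[i] = factory();
        long after = GC.GetTotalMemory(true);

        GC.KeepAlive(keep); // instances must stay reachable until measured
        return (after - before) / count;
    }

    public static void Main()
    {
        long bytes = EstimateBytesPerInstance(() => new int[16], 100000);
        Console.WriteLine("Approx. bytes per int[16]: " + bytes);
    }
}
```

A larger `count` tends to smooth out noise, and the factory should build the whole object graph of interest so that sub-objects are included in the average.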

Marc Gravell 2009-04-17 12:45:35

Answer 2

Here's a function I've used to estimate the memory cost of a managed type. It presumes that objects are allocated sequentially in memory (and not from the large-object heap), so it will not give an accurate result for objects that allocate huge arrays, for example. It also does not absolutely guarantee that the GC won't corrupt the answer, but it makes that very unlikely.

// Requires "using System;" and "using System.Reflection;", and the code
// must be compiled with unsafe code enabled (/unsafe).

/// <summary>
/// Gets the memory cost of a reference type.
/// </summary>
/// <param name="type">The type for which to get the cost. It must have a
/// public parameterless constructor.</param>
/// <returns>The number of bytes occupied by a default-constructed
/// instance of the reference type, including any sub-objects it creates
/// during construction. Returns -1 if the type does not have a public
/// parameterless constructor.</returns>
public static int MemoryCost(Type type)
{
  // Make garbage collection very unlikely during the execution of this function
  GC.Collect();
  GC.WaitForPendingFinalizers();

  // Get the constructor and invoke it once to run JIT and any initialization code
  ConstructorInfo constr = type.GetConstructor(Type.EmptyTypes);
  if (constr == null)
    return -1;
  object inst1 = constr.Invoke(null); // warm-up instance; not measured

  int size;
  unsafe
  {
    // Create marker arrays and an instance of the type
    int[] a1 = new int[1];
    int[] a2 = new int[1];
    object inst2 = constr.Invoke(null);
    int[] a3 = new int[1];

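    // Compute the size by determining how much was allocated in between
    // the marker arrays: (p3 - p2) spans a2 plus inst2, and (p2 - p1) spans
    // a1, which is the same size as a2, so the difference is inst2's cost.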
    fixed (int* p1 = a1)
    {
      fixed (int* p2 = a2)
      {
        fixed (int* p3 = a3)
        {
          size = (int)(((long)p3 - (long)p2) - ((long)p2 - (long)p1));
        }
      }
    }
  }
  return size;
}

JayMcClellan 2009-04-17 15:16:31

Way too many assumptions here. Why not use a memory profiler?

Marek 2010-09-14 08:01:34
