I am looking for hints on how to debug a crash in an application that uses the MS XML wrappers in the Delphi VCL. I suspect memory corruption, or some obscure problem between objects and interfaces, such as a reference-counting bug or heap corruption. The question is, in effect: how do I debug such a crash?

This particular code makes heavy internal use of, and extends, the base XmlIntf interfaces (IXMLNode). ISomethingCustom is an interface that extends IXMLNode. The crash happens somewhere in a recursive function that is passed an ISomethingCustom, which is also (or, in interface terms, also supports) IXMLNode.

   function UtilityFunction(aNode: ISomethingCustom): Boolean;
   begin
      Result := False; // define the return value for the early exits
      if not Assigned(aNode) then exit; // this works. great.
      if not Assigned(aNode.ParentNode) then exit; // this DOES NOT WORK.
      // code that blows up if aNode.ParentNode is not assigned.
   end;

The situation is that the aNode is also IXMLNode, and IXMLNode.ParentNode value is assigned (not nil), and yet it points to a COM object that may have been freed, destroyed, or corrupted somehow. I am trying to figure out WHAT is going on when an interface pointer can appear to be valid, but the object behind it has been nuked somehow.

Checking Assigned(aNode.ParentNode) returns TRUE, even when, if you were to attempt a cast in the debugger (at runtime only, not in the code), like this:

  1. inspect/evaluate aNode
  2. inspect/evaluate TInterfacedObject(aNode).ClassName (works in Delphi 2010, at least!)
  3. now cast TWhateverClassNameYouGotBefore(aNode).
  4. In the debugger I now see that this is NIL, which may mean that the magic "casting an interface back to the object" feature that is new in Delphi 2010 is failing.

I believe I am trying to debug a problem where heaps are corrupted, or COM objects are corrupt on the heap, because of a reference counting problem.

I really think that nobody should ever have the situation arise where an interface appears valid, but the object underneath has been deleted. I really would like to know what to do, and what is going on.

A: 

Wild guess: Have you tried to put aNode.ParentNode in a local variable and use it in the rest of the Utilityfunction:

   function UtilityFunction(aNode: ISomethingCustom): Boolean;
   var
     lParentNode: IXMLNode;
   begin
      if not Assigned(aNode) then exit; // this works. great.
      lParentNode := aNode.ParentNode;
      if not Assigned(lParentNode) then exit;
     // code that uses lParentNode.
   end;
François
It appears to be a valid non-nil interface pointer, but when cast back to an object like TXMLNode(aNode), you get nil. And sometimes, this works at debug time but not in the code (the code doesn't see the nil, but the debugger does). Hairy.
Warren P
Most Delphi versions do not allow you to type-cast an interface pointer back to an object pointer. That is a newer feature only available in Delphi 2010 or XE, I forget which introduced it. Before then, the only way to get an object pointer from an interface pointer is to have the interface implement a method that returns the implementing class's Self pointer.
Remy Lebeau - TeamB
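The pre-2010 approach described in the comment above can be sketched as follows. This is only a minimal illustration: IHasImplementor, GetImplementor and TDemoNode are hypothetical names made up for the example, not part of XmlIntf or the VCL.

```delphi
program ImplementorViaMethod;

{$APPTYPE CONSOLE}

type
  // Hypothetical interface that exposes its implementing object explicitly.
  IHasImplementor = interface
    ['{3B1E5C2A-7D4F-4A6B-8C9D-0E1F2A3B4C5D}']
    function GetImplementor: TObject;
  end;

  // Hypothetical implementing class.
  TDemoNode = class(TInterfacedObject, IHasImplementor)
  public
    function GetImplementor: TObject;
  end;

function TDemoNode.GetImplementor: TObject;
begin
  // Inside the method, Self is the real object reference.
  Result := Self;
end;

var
  Intf: IHasImplementor;
begin
  Intf := TDemoNode.Create;
  // No cast involved: the object hands back its own Self pointer.
  Writeln(Intf.GetImplementor.ClassName);
end.
```

This only helps for interfaces you control; it cannot be added to an existing interface such as IXMLNode without changing its declaration.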
I just learned about that. We are not intentionally doing any such cast; I was only casting about in the debugger and discovered that you CAN cast back to the class in the debugger's expression evaluator, if you first find out what class it is.
Warren P
+5  A: 

Although you haven't shown it in your code, your comments seem to indicate that you're type-casting the interface variable to a class type. That's not allowed. I've described why:

Interface references and object references don't point to the same things. Therefore, calling a method on one when the compiler thinks you have the other will yield unexpected results. You were unlucky because the code continued to run instead of crashing with an access violation, which would have been a bigger indication that you were doing something wrong.

My article above concludes by suggesting you use the JclSysUtils.GetImplementorOfInterface function from the JCL if you have a Delphi-implemented interface and the interface offers no function of its own for revealing the underlying object.
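As a rough illustration of why the two kinds of reference are not interchangeable, here is a minimal sketch. It assumes Delphi 2010 or later, where the `as` operator can recover the implementing object from an interface; IDemoNode and TDemoNode are hypothetical names used only for this example.

```delphi
program InterfaceVsObject;

{$APPTYPE CONSOLE}

type
  // Hypothetical stand-in for a node interface; not the real IXMLNode.
  IDemoNode = interface
    ['{6F2A1C3E-8B4D-4E5F-9A1B-2C3D4E5F6A7B}']
    function GetName: string;
  end;

  TDemoNode = class(TInterfacedObject, IDemoNode)
  public
    function GetName: string;
  end;

function TDemoNode.GetName: string;
begin
  Result := 'demo';
end;

var
  Node: IDemoNode;
  Obj: TObject;
begin
  Node := TDemoNode.Create;

  // Checked cast (assumes Delphi 2010+): asks the interface for the
  // object that implements it, instead of reinterpreting the pointer.
  Obj := Node as TObject;
  Writeln('Implementing class: ', Obj.ClassName);

  // Compare the raw addresses held by the interface reference and the
  // object reference. They are expected to differ, which is why
  // reinterpreting one as the other is unsafe.
  Writeln('Same address: ', Pointer(Node) = Pointer(Obj));
end.
```

On older compilers the `as TObject` line will not compile, and a helper such as the JCL function mentioned above is the usual substitute.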

Rob Kennedy
Type-casting an interface pointer back to its implementing class pointer is a new feature in Delphi 2010 or XE, I forget which one introduced it.
Remy Lebeau - TeamB
I was only casting back to the object because I noticed that such a cast resulted in a NIL value, and found that curious. At runtime, I do not actually do such a cast, except in debug code which I quickly removed after discovering what you say. It still crashes, which means I have either heap corruption or something else going wrong.
Warren P
I have rewritten my question almost totally, I hope that it makes the nature of my situation more clear. Your "pointers" above made something clear to me that I didn't understand before, which is great. Given my complete misapprehension of the situation, I wonder if this question can be rescued, or if I should rewrite it again when I know what is going on more.
Warren P
A: 

My suggestion is to make sure that the ParentNode function is actually called in Assigned(aNode.ParentNode). There are some nasty corner cases in Delphi where a procedure or function without arguments doesn't get called; instead, its reference is taken when you omit the parentheses.

Try changing it to Assigned(aNode.ParentNode()) (which should have the same effect as François's suggestion).

Kaos
I don't think this is it.
Warren P