I was recently given the following piece of code as a sort-of puzzle to help understand Polymorphism and Inheritance in OOP - C#.

// No compiling!
public class A
{
    public virtual string GetName()
    {
        return "A";
    }
}

public class B : A
{
    public override string GetName()
    {
        return "B";
    }
}

public class C : B
{
    public new string GetName()
    {
        return "C";
    }
}

void Main()
{
    A instance = new C();
    Console.WriteLine(instance.GetName());
}
// No compiling!

Now, after a long, long chat with the other developer who presented the puzzle, I know what the output is, but I won't spoil it for you. The only issue I'm really having is how we get to that output: how the code steps through, what's inheriting what, etc.

I thought "C" would be returned, as C is the class that is actually instantiated. Then I wondered whether "B" would be returned because C inherits from B - but B also inherits from A (which is where I got confused!).


Question:

Could anyone explain how polymorphism and inheritance play their part in retrieving the output, eventually displayed on screen?

+17  A: 

It should return "B" because B.GetName() is held in the little virtual table box for the A.GetName() function. C.GetName() is a compile time "override"; it doesn't replace the virtual table entry, so you can't reach it through a reference of type A.

Blindy
Well said, and certainly a lot shorter than the answer I was typing.
Brian Rasmussen
I think using the phrase "compile time 'override'" for `C.GetName()` is a bit misleading, as it's not overriding anything. Almost any other word would be more useful :)
Jon Skeet
Hence the quotes :)
Blindy
Given a class D which inherits from C and overrides GetName once again... what would the outcome then be, given that the class instantiated in Main is D?
Toad
I'd just call it a 'compile time replacement'.
Joren
`D` would give a compile time error, I think. Since it derives from `C`, `C`'s `new string GetName()` would be in scope, and it isn't virtual.
Joren
I'll keep my original expression to not make Jon Skeet's comment look weird.
Blindy
Joren: OK, so a class D which inherits from C and makes it virtual again, and then a class E which inherits from D and overrides it yet again. Now we instantiate E and assign it to an A... what now? It does seem weird that one could add multiple layers of vtables.
Toad
There is only one vtable, but a `new` method is just a different method which happens to have the same name. The compiler picks the vtable slot to use depending on the static type of the receiver. `((C)foo).GetName()` will always call `C.GetName` or an override of `C.GetName` if it were virtual, and `((A)foo).GetName()` will always call `A.GetName` or one of its overrides. Here `C.GetName` and its overrides are not overrides of `A.GetName`, they're different methods.
Joren
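For anyone who wants to try the D/E scenario from the comments above, here is a minimal sketch. The bodies of `D` and `E` are assumptions made up for illustration (the thread only describes them in words), and `Program` is just a harness:

```csharp
using System;

public class A { public virtual string GetName() { return "A"; } }
public class B : A { public override string GetName() { return "B"; } }
public class C : B { public new string GetName() { return "C"; } }

// Hypothetical D: hides C.GetName and starts a fresh virtual method.
public class D : C { public new virtual string GetName() { return "D"; } }

// Hypothetical E: overrides D.GetName, which is unrelated to A.GetName.
public class E : D { public override string GetName() { return "E"; } }

public static class Program
{
    public static void Main()
    {
        E e = new E();
        A asA = e;
        C asC = e;
        D asD = e;

        Console.WriteLine(asA.GetName()); // B: last override of A.GetName
        Console.WriteLine(asC.GetName()); // C: C.GetName is not virtual
        Console.WriteLine(asD.GetName()); // E: override of D.GetName
    }
}
```

The same object answers three different ways, depending only on the static type of the variable used for the call.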
thanks for clearing that up!
Toad
Short, but sweet answer - perfect!
Daniel May
+1  A: 

Actually, I think it should display C, because the new modifier just hides all ancestor methods with the same name. So, with the methods of A and B hidden, only C's remains visible.

http://msdn.microsoft.com/en-us/library/51y09td4%28VS.71%29.aspx#vclrfnew%5Fnewmodifier

FractalizeR
Yes, but it's still a compile-time thing. If all you have is a reference to `A`, the compiler doesn't know that your method might be hidden. All it has is the virtual table to work with, and the only thing that overwrites a virtual table slot is the `override` keyword.
Blindy
Now if you cast your `A` reference back down to `C`, you will get a different result.
Blindy
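A quick sketch of the cast mentioned above, using the classes from the question. Only the `Program` wrapper and the extra casts are added here:

```csharp
using System;

public class A { public virtual string GetName() { return "A"; } }
public class B : A { public override string GetName() { return "B"; } }
public class C : B { public new string GetName() { return "C"; } }

public static class Program
{
    public static void Main()
    {
        A instance = new C();

        // Static type A: the call goes through A.GetName's virtual slot.
        Console.WriteLine(instance.GetName());      // B

        // Static type C: the compiler binds to C's own, hiding GetName.
        Console.WriteLine(((C)instance).GetName()); // C

        // Static type B: still the same virtual slot as A.GetName.
        Console.WriteLine(((B)instance).GetName()); // B
    }
}
```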
Yea, that's really interesting.
FractalizeR
+1  A: 

Easy, you only have to keep the inheritance tree in mind.

In your code, you hold a reference of type 'A' that refers to an instance of type 'C'. Now, to resolve the exact method address for the virtual 'GetName()' method, the runtime goes through the inheritance hierarchy and looks for the most derived override (note that only 'override' is an override, 'new' is something completely different...).

That's in short what happens. The 'new' keyword in type 'C' only plays a role if you call the method through a reference of type 'C'; the compiler then ignores the inherited virtual method altogether. Strictly speaking, this has nothing to do with polymorphism at all. You can see that from the fact that it makes no difference whether you hide a virtual or a non-virtual method with the 'new' keyword...

'New' in class 'C' means exactly that: if you call 'GetName()' through a reference of this type, then forget everything and use THIS method. 'Virtual', by contrast, means: start at the actual type of the instance and go up the inheritance tree until you find an override of this method, no matter what the declared type of the reference is.
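To illustrate the point that hiding with 'new' works the same way on a non-virtual method, here is a small sketch. The `Parent` and `Child` types and the `Describe` method are hypothetical names invented for this example:

```csharp
using System;

// Hypothetical types: Describe is NOT virtual in Parent.
public class Parent
{
    public string Describe() { return "Parent"; }
}

public class Child : Parent
{
    // 'new' hides Parent.Describe; no virtual dispatch is involved.
    public new string Describe() { return "Child"; }
}

public static class Program
{
    public static void Main()
    {
        Child child = new Child();
        Parent sameObject = child;

        Console.WriteLine(sameObject.Describe()); // Parent
        Console.WriteLine(child.Describe());      // Child
    }
}
```

The method is picked purely from the declared type of the variable, exactly as with `C.GetName()` in the question.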

Thomas Weller
+21  A: 

The correct way to think about this is to imagine that every class requires its objects to have a certain number of "slots"; those slots are filled with methods. The question "what method actually gets called?" requires you to figure out two things:

  1. What are the contents of each slot?
  2. Which slot is called?

Let's start by considering the slots. There are two slots. All instances of A are required to have a slot we'll call GetNameSlotA. All instances of C are required to have a slot we'll call GetNameSlotC. That's what the "new" means on the declaration in C -- it means "I want a new slot". Compare that with the "override" on the declaration in B, which means "I do not want a new slot, I want to re-use GetNameSlotA".

Of course, C inherits from A, so C must also have a slot GetNameSlotA. Therefore, instances of C have two slots -- GetNameSlotA, and GetNameSlotC. Instances of A or B which are not C have one slot, GetNameSlotA.
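One way to see these two slots from running code is to ask reflection where each method was first declared. This is only a sketch: `SlotOwner` is a hypothetical helper written for this example, and "slot" here is read as "what `MethodInfo.GetBaseDefinition()` reports", which matches the description above for this hierarchy.

```csharp
using System;
using System.Reflection;

public class A { public virtual string GetName() { return "A"; } }
public class B : A { public override string GetName() { return "B"; } }
public class C : B { public new string GetName() { return "C"; } }

public static class Program
{
    // Hypothetical helper: for the GetName declared directly on 'type',
    // return the type that first introduced the slot it occupies.
    public static Type SlotOwner(Type type)
    {
        MethodInfo method = type.GetMethod(
            "GetName",
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly);
        return method.GetBaseDefinition().DeclaringType;
    }

    public static void Main()
    {
        Console.WriteLine(SlotOwner(typeof(A)).Name); // A: introduces the slot
        Console.WriteLine(SlotOwner(typeof(B)).Name); // A: override re-uses it
        Console.WriteLine(SlotOwner(typeof(C)).Name); // C: 'new' asks for its own
    }
}
```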

Now, what goes into those two slots when you create a new C? There are three methods, which we'll call GetNameA, GetNameB, and GetNameC.

The declaration of A says "put GetNameA in GetNameSlotA". A is a superclass of C, so A's rule applies to C.

The declaration of B says "put GetNameB in GetNameSlotA". B is a superclass of C, so B's rule applies to instances of C. Now we have a conflict between A and B. B is the more derived type, so it wins -- B's rule overrides A's rule. Hence the word "override" in the declaration.

The declaration of C says "put GetNameC in GetNameSlotC".

Therefore, your new C will have two slots. GetNameSlotA will contain GetNameB and GetNameSlotC will contain GetNameC.

We've now determined what methods are in what slots, so we've answered our first question.

Now we have to answer the second question. What slot is called?

Think about it like you're the compiler. You have a variable. All you know about it is that it is of type A. You're asked to resolve a method call on that variable. You look at the slots available on an A, and the only slot you can find that matches is GetNameSlotA. You don't know about GetNameSlotC, because you only have a variable of type A; why would you look for slots that only apply to C?

Therefore this is a call to whatever is in GetNameSlotA. We've already determined that at runtime, GetNameB will be in that slot. Therefore, this is a call to GetNameB.

The key takeaway here is that in C# overload resolution chooses a slot and generates a call to whatever happens to be in that slot.
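The two questions can also be modelled by hand. The sketch below is a deliberately simplified toy, not how the CLR actually lays out method tables: the `SlotsOfA`, `SlotsOfB` and `SlotsOfC` types are made up for illustration, and each slot is just a delegate field named after the slots in the explanation above.

```csharp
using System;

// Toy model: a "slot" is a delegate field.
public class SlotsOfA
{
    // A's rule: put GetNameA in GetNameSlotA.
    public Func<string> GetNameSlotA = () => "A";
}

public class SlotsOfB : SlotsOfA
{
    // B's rule ("override"): re-use GetNameSlotA and replace its contents.
    public SlotsOfB() { GetNameSlotA = () => "B"; }
}

public class SlotsOfC : SlotsOfB
{
    // C's rule ("new"): add a separate slot and leave GetNameSlotA alone.
    public Func<string> GetNameSlotC = () => "C";
}

public static class Program
{
    public static void Main()
    {
        SlotsOfC c = new SlotsOfC();
        SlotsOfA viewedAsA = c;

        // Through an A-typed variable only GetNameSlotA can be named;
        // viewedAsA.GetNameSlotC would not compile.
        Console.WriteLine(viewedAsA.GetNameSlotA()); // B

        // Through a C-typed variable both slots are visible.
        Console.WriteLine(c.GetNameSlotC()); // C
        Console.WriteLine(c.GetNameSlotA()); // B
    }
}
```

Which slot gets named is fixed by the variable's type at compile time; what sits in the slot is decided by the object that was actually created.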

Eric Lippert
Excellent answer. There were a few others that did well but this one really explains the problem at hand. Well explained.
Daniel May