I recently discovered that a method in a derived class can only access the base class's protected instance members through an instance of the derived class (or one of its subclasses):

class Base
{
    protected virtual void Member() { }
}

class MyDerived : Base
{
    // error CS1540
    void Test(Base b) { b.Member(); }
    // error CS1540
    void Test(YourDerived yd) { yd.Member(); }

    // OK
    void Test(MyDerived md) { md.Member(); }
    // OK
    void Test(MySuperDerived msd) { msd.Member(); }
}

class MySuperDerived : MyDerived { }

class YourDerived : Base { }

I managed to work around this restriction by adding a static method to the base class, since Base's methods are allowed to access Base.Member, and MyDerived can call that static method.
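A minimal sketch of that workaround, for readers who want to see it spelled out. The helper name `CallMember` is hypothetical, and `Member` returns a string here (instead of `void` as above) only so that the result is observable:

```csharp
using System;

class Base
{
    protected virtual string Member() { return "Base.Member"; }

    // Hypothetical helper: code inside Base may call Member on any Base
    // instance, and derived classes may call this protected static method.
    protected static string CallMember(Base b) { return b.Member(); }
}

class YourDerived : Base
{
    protected override string Member() { return "YourDerived.Member"; }
}

class MyDerived : Base
{
    // Writing b.Member() here would be error CS1540;
    // going through the static helper compiles.
    public string Test(Base b) { return CallMember(b); }
}

static class Program
{
    static void Main()
    {
        // Virtual dispatch still selects the override in YourDerived.
        Console.WriteLine(new MyDerived().Test(new YourDerived())); // prints "YourDerived.Member"
    }
}
```

Note that the helper reaches the sibling class's override, which is exactly the access the compiler otherwise rejects.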

I still don't understand the reason for this limitation, though. I've seen a couple different explanations, but they fail to explain why MyDerived.Test() is still allowed to access MySuperDerived.Member.

The Principled Explanation: 'Protected' means it's only accessible to that class and its subclasses. YourDerived could override Member(), creating a new method that should only be accessible to YourDerived and its subclasses. MyDerived can't call the overridden yd.Member() because it's not a subclass of YourDerived, and it can't call b.Member() because b might actually be an instance of YourDerived.

OK, but then why can MyDerived call msd.Member()? MySuperDerived could override Member(), and that override should only be accessible to MySuperDerived and its subclasses, right?

You don't really know until runtime whether you're calling an overridden member or not. And when the member is a field, it can't be overridden anyway, but access is still forbidden.
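The field case can be sketched as follows. The field name `count` and the method `Read` are made up for illustration, and the rejected line is left commented out so that the rest compiles:

```csharp
using System;

class Base
{
    protected int count = 7;   // a field: nothing here can be overridden
}

class YourDerived : Base { }

class MyDerived : Base
{
    // OK: the receiver's static type is MyDerived.
    public int Read(MyDerived other) { return other.count; }

    // error CS1540 if uncommented: the receiver's static type is Base.
    // public int Read(Base b) { return b.count; }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(new MyDerived().Read(new MyDerived())); // prints 7
    }
}
```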

The Pragmatic Explanation: Other classes might add invariants that your class doesn't know about, and you must use their public interface so they can maintain those invariants. If MyDerived could directly access protected members of YourDerived, it could break those invariants.

My same objection applies here. MyDerived doesn't know what invariants MySuperDerived might add, either -- it might be defined in a different assembly by a different author -- so why can MyDerived access its protected members directly?

I get the impression that this compile-time limitation exists as a misguided attempt to solve a problem that can really only be solved at runtime. But maybe I'm missing something. Does anyone have an example of a problem that would be caused by letting MyDerived access Base's protected members through a variable of type YourDerived or Base, but does not exist already when accessing them through a variable of type MyDerived or MySuperDerived?

--

UPDATE: I know the compiler is just following the language specification; what I want to know is the purpose of that part of the spec. An ideal answer would be like, "If MyDerived could call YourDerived.Member(), $NIGHTMARE would happen, but that can't happen when calling MySuperDerived.Member() because $ITSALLGOOD."

A: 

"Protected" means exactly that a member is accessible only to the defining class and all subclasses.

As MySuperDerived is a subclass of MyDerived, Member is accessible to MyDerived. Think of it this way: MySuperDerived is a MyDerived and therefore its private and protected members (inherited from MyDerived) are accessible to MyDerived.

However, YourDerived is not a MyDerived and therefore its private and protected members are inaccessible to MyDerived.

And you can't access Member through a variable of type Base, because the instance might actually be a YourDerived, which is neither a MyDerived nor a subclass of MyDerived.

And don't use static methods to work around the access restriction. That defeats the purpose of encapsulation and is a strong smell that you haven't designed things properly.

Jason
@Downvoter: Really? What's your reason? This answer isn't helpful? Why? This answer is wrong? Where?
Jason
Isn't `Base` the "defining class" here? That's where `Member` is defined.
Jesse McGrew
Each class that derives from `Base` has a method named `Member` too. Here's a concrete example. If I have a `SpeakingAnimal` with method `Speak` and `Cat : SpeakingAnimal` and `Dog : SpeakingAnimal` they can still `Speak`. It's just that unless they override `Speak` they rely on their parent class for implementation. This is the point of polymorphism: it's saying that one class (a derived class) can be substituted for another (a base class).
Jason
OK, I see what you mean. But this doesn't answer my question: what *potential problem* is prevented by forbidding MyDerived to access YourDerived.Member that *doesn't still exist* when it's allowed to access MySuperDerived.Member?
Jesse McGrew
As for the workaround, I agree that it has a funny code smell, and I've thought about other ways to achieve what I'm trying to do. I may be able to move the relevant code into a protected instance method on the base class and make the other member (a dictionary.. actually a few of them) private, but then I'll need more new methods to add values to the dictionaries, and now the new design doesn't sound much better than the old one. Maybe I'll start a new question for that...
Jesse McGrew
So it sounds like you're asking a philosophical question ("Why is the definition of 'protected' what it is?") and not a semantics question? That is, you seem to understand that "protected" means that `MyDerived` can't access the protected members of `YourDerived`; you just want to know why. Is that accurate?
Jason
Yes, exactly. I want to know why "protected" is defined the way it is.
Jesse McGrew
A: 

You seem to be thinking about this completely the wrong way.

It's not about "who can call what", it's about what is substitutable where.

Subclasses of MyDerived should always be substitutable for MyDerived (including their overridden protected methods). There is no such constraint on other subclasses of Base, and so you cannot substitute them in place of a MyDerived.

Anon.
I understand that MySuperDerived can be substituted for MyDerived. What I don't understand is why the specification of `protected` allows MyDerived's methods to access Member on all classes that are substitutable for MyDerived, but no others. I don't know what goal the designers thought they were achieving with that.
Jesse McGrew
You understand why class A is *not* substitutable for class B, but not why you can't call A.Method() as though it were B.Method()?
Anon.
I don't understand why "MyDerived can only call protected methods through an expression whose type is substitutable for MyDerived" is a useful way for the `protected` modifier to work. I could understand a definition of `protected` based on visibility ("Base's protected member is only visible in subclasses of Base, and if it's visible you can call it"), or based on actual runtime type ("a method in MyDerived can only call protected members on an instance of MyDerived"), but I don't see the logic behind the way it actually does work.
Jesse McGrew
It *is* based on actual runtime type. If it's a `MyDerived`, you can cast it to one and then call the method. If it's not ... well, you'd better know what you plan on doing if it's not, so you're going to be making that check anyway, right?
Anon.
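The cast described in the comment above might look like the sketch below. The method name `TryCall` is hypothetical, and `Member` returns a string only so that the outcome can be printed:

```csharp
using System;

class Base
{
    protected virtual string Member() { return "Base.Member"; }
}

class YourDerived : Base { }

class MyDerived : Base
{
    // Returns null when b is not a MyDerived (or a subclass of it).
    public string TryCall(Base b)
    {
        MyDerived md = b as MyDerived;            // runtime type check
        return md != null ? md.Member() : null;   // allowed: receiver is MyDerived
    }
}

static class Program
{
    static void Main()
    {
        MyDerived caller = new MyDerived();
        Console.WriteLine(caller.TryCall(new MyDerived()));           // prints "Base.Member"
        Console.WriteLine(caller.TryCall(new YourDerived()) == null); // prints "True"
    }
}
```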
It is not based on actual runtime type. `MySuperDerived` is not the same as `MyDerived`: it may have new invariants that govern how its internal state should be accessed. Maybe the author of MySuperDerived wants to add a requirement that Member() should only be called after calling some new Prepare() method first. Static type checking can't account for that! But that sort of situation is why the spec forbids us to call YourDerived.Member(), right?
Jesse McGrew
@Jesse McGrew: `MySuperDerived` should not do that. That's a violation of the Liskov Substitution Principle.
Jason
Agreed, it would be bad design. But it's bad design for `YourDerived` to do that also! `Member` was inherited from `Base`, so the constraints on using it should also be inherited from `Base`. Since `Base` can call `Member` on an instance of `YourDerived`, I don't see the justification for not letting `MyDerived` make the same call.
Jesse McGrew
Because `YourDerived` cannot be substituted for `MyDerived`.
Anon.
Once again, that is not an answer to my question. Explaining why it's useful to define "protected" in terms of the callee's class being substituted for the calling class... *that* would be an answer.
Jesse McGrew
A: 

http://msdn.microsoft.com/en-us/library/bcd5672a.aspx

A protected member of a base class is accessible in a derived class only if the access occurs through the derived class type.

There's documentation of the "what?" question. Now I wish I knew "Why?" :)

Clearly virtual has nothing to do with this access restriction.

Hmm, I think you're on to something with the sibling thing... MyDerived shouldn't be able to call YourDerived.Member

If MyDerived can call Base.Member, it might actually be working on an instance of YourDerived and might actually be calling YourDerived.Member

Ah, here is the same question: http://stackoverflow.com/questions/1836175/c-protected-members-accessed-via-base-class-variable/1836932#1836932

David B
+3  A: 

Eric Lippert has explained it well in one of his blog posts.

Hans Passant
I've read that post and it doesn't answer my question. Thanks for reminding me to leave a comment there, though.
Jesse McGrew
Dude, Mr. Lippert is my *father*.
Eric Lippert
Eric, Dude is my *son*. Sorry.
Hans Passant
OK, you got me there.
Eric Lippert
+1  A: 

You're operating on the assumption that this behaviour is explicitly disallowed on account of some vague notion of design purity. I don't work for Microsoft, but I believe that the truth is much simpler: it's not forbidden, it's just not supported, because that would be time-consuming to implement for relatively low impact.

The little-used protected internal will probably cover off the majority of cases where protected alone doesn't quite cut it.
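As a rough sketch of what that looks like, assuming all three classes live in the same assembly (the class names are reused from the question, and `Member` returns a string only to make the call observable):

```csharp
using System;

public class Base
{
    // Accessible from derived classes OR from any code in the same assembly.
    protected internal virtual string Member() { return "Base.Member"; }
}

public class YourDerived : Base
{
    protected internal override string Member() { return "YourDerived.Member"; }
}

public class MyDerived : Base
{
    // Compiles only because MyDerived is in the same assembly as Base;
    // from another assembly this would again be error CS1540.
    public string Test(Base b) { return b.Member(); }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(new MyDerived().Test(new YourDerived())); // prints "YourDerived.Member"
    }
}
```

The trade-off is that every type in the assembly can now call `Member`, not just the derived classes.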

Aaronaught
+1 for actually addressing the "why" instead of "what does the spec say"... but the existence of a whole separate error message/number just for this type of access does suggest that it's specifically forbidden.
Jesse McGrew
No, there are good reasons for this restriction -- which, I note, many other languages share, including C++.
Eric Lippert
Fair enough, I can't argue with one of the designers. As I always say, though: You can pick your friends, and you can pick your nose, but you can't pick your friend's nose. Think about it.
Aaronaught
Also, you can tune a piano, or you can tuna fish, but you can't tune a friend's fish. Right? Hmm. Something seems wrong there.
Eric Lippert
And believe me, *plenty* of people argue with the designers.
Eric Lippert
+8  A: 

Does anyone have an example of a problem that would be caused by letting MyDerived access Base's protected members through a variable of type YourDerived or Base, but does not exist already when accessing them through a variable of type MyDerived or MySuperDerived?

I am rather confused by your question but I am willing to give it a shot.

If I understand it correctly, your question is in two parts. First, what attack mitigation justifies the restriction on calling protected methods through a less-derived type? Second, why does the same justification not motivate preventing calls to protected methods on equally-derived or more-derived types?

The first part is straightforward:

// Good.dll:

public abstract class BankAccount
{
  abstract protected void DoTransfer(BankAccount destinationAccount, User authorizedUser, decimal amount);
}

public abstract class SecureBankAccount : BankAccount
{
  protected readonly int accountNumber;
  public SecureBankAccount(int accountNumber)
  {
    this.accountNumber = accountNumber;
  }
  public void Transfer(BankAccount destinationAccount, User user, decimal amount)
  {
    if (!Authorized(user, accountNumber)) throw new System.UnauthorizedAccessException();
    this.DoTransfer(destinationAccount, user, amount);
  }
}

public sealed class SwissBankAccount : SecureBankAccount
{
  public SwissBankAccount(int accountNumber) : base(accountNumber) {}
  override protected void DoTransfer(BankAccount destinationAccount, User authorizedUser, decimal amount) 
  {
    // Code to transfer money from a Swiss bank account here.
    // This code can assume that authorizedUser is authorized.

    // We are guaranteed this because SwissBankAccount is sealed, and
    // all callers must go through public version of Transfer from base
    // class SecureBankAccount.
  }
}

// Evil.exe:

class HostileBankAccount : BankAccount
{
  override protected void DoTransfer(BankAccount destinationAccount, User authorizedUser, decimal amount)  {  }

  public static void Main()
  {
    User drEvil = new User("Dr. Evil");
    BankAccount yours = new SwissBankAccount(1234567);
    BankAccount mine = new SwissBankAccount(66666666);
    yours.DoTransfer(mine, drEvil, 1000000.00m); // compilation error
    // You don't have the right to access the protected member of
    // SwissBankAccount just because you are in a 
    // type derived from BankAccount. 
  }
}

Dr. Evil's attempt to steal ONE... MILLION... DOLLARS... from your Swiss bank account has been foiled by the C# compiler.

Obviously this is a silly example, and obviously, fully-trusted code could do anything it wants to your types -- fully-trusted code can start up a debugger and change the code while it's running. Full trust means full trust. Don't actually design a real security system this way!

But my point is simply that the "attack" that is foiled here is someone attempting to do an end-run around the invariants set up by SecureBankAccount, to access the code in SwissBankAccount directly.

That answers your first question, I hope. If that's not clear, let me know.

Your second question is "Why doesn't SecureBankAccount also have this restriction?" In my example, SecureBankAccount says:

    this.DoTransfer(destinationAccount, user, amount);

Clearly "this" is of type SecureBankAccount or something more derived. It could be any value of a more derived type, including a new SwissBankAccount. Couldn't SecureBankAccount be doing an end-run around SwissBankAccount's invariants?

Yes, absolutely! And because of that, the authors of SwissBankAccount are required to understand everything that their base class does! You can't just go deriving from some class willy-nilly and hope for the best! The implementation of your base class is allowed to call the set of protected methods exposed by the base class. If you want to derive from it then you are required to read the documentation for that class, or the code, and understand under what circumstances your protected methods will be called, and write your code accordingly. Derivation is a way of sharing implementation details; if you don't understand the implementation details of the thing you are deriving from then don't derive from it.

And besides, the base class is always written before the derived class. The base class isn't up and changing on you, and presumably you trust the author of the class to not attempt to break you sneakily with a future version. (Of course, a change to a base class can always cause problems; this is yet another version of the brittle base class problem.)

The difference between the two cases is that when you derive from a base class, you have the behaviour of one class of your choice to understand and trust. That is a tractable amount of work. The authors of SwissBankAccount are required to precisely understand what SecureBankAccount guarantees to be invariant before the protected method is called. But they should not have to understand and trust every possible behaviour of every possible cousin class that just happens to be derived from the same base class. Those guys could be implemented by anyone and do anything. You would have no ability whatsoever to understand any of their pre-call invariants, and therefore you would have no ability to successfully write a working protected method. Therefore, we save you that bother and disallow that scenario.

And besides, we have to allow you to call protected methods on receivers of potentially more-derived classes. Suppose we didn't allow that and deduce something absurd. Under what circumstances could a protected method ever be called, if we disallowed calling protected methods on receivers of potentially-more-derived classes? The only time you could ever call a protected method in that world is if you were calling your own protected method from a sealed class! Effectively, protected methods could almost never be called, and the implementation that was called would always be the most derived one. What's the point of "protected" in that case? Your "protected" means the same thing as "private, and can only be called from a sealed class". That would make them rather less useful.

So, the short answer to both your questions is "because if we didn't do that, it would be impossible to use protected methods at all." We restrict calls through less-derived types because if we don't, it's impossible to safely implement any protected method that depends on an invariant. We allow calls through potential subtypes because if we did not, we would allow hardly any calls at all.

Does that answer your questions?

Eric Lippert
It does, thanks. You can trust your base classes but you can't trust sibling classes, since you choose the base whereas anyone can write a sibling. It's not really about invariants, it's about security. (One thing I see is that the "good" scenario involves calling a protected method on `this`, and the "evil" scenario involves calling a protected method on another instance -- HostileBankAccount is never even instantiated. Are there *"good"* scenarios that involve calling protected methods on instances other than `this`?)
Jesse McGrew
First off, like I said, be careful when conflating accessibility with security. Inaccessible methods are still callable by fully trusted code; the fully-trusted code just rewrites the metadata until the method is accessible. Or, another way of looking at it is that "security" is just a particular kind of invariant. Really this feature is about *being able to write reliable code*.
Eric Lippert
Second, sure, there are such scenarios. Imagine for example you had a protected method which assisted with rewriting a binary tree. You'd probably need to be able to recursively call it on your left and right sub trees. A red-black tree rewriter had better only be able to call the rewriter on other red-black tree sub-trees.
Eric Lippert
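A toy sketch of the tree scenario from the comment above, not taken from any real library. The names `TreeNode`, `RedBlackNode`, `Rewrite`, `RewriteAll` and `Visits` are all invented here, and the "rewrite" only counts visits:

```csharp
using System;

abstract class TreeNode
{
    protected abstract void Rewrite();
}

class RedBlackNode : TreeNode
{
    // Children are typed as RedBlackNode, not TreeNode, so the
    // recursive protected calls below are permitted.
    private readonly RedBlackNode left;
    private readonly RedBlackNode right;

    public int Visits { get; private set; }

    public RedBlackNode(RedBlackNode left, RedBlackNode right)
    {
        this.left = left;
        this.right = right;
    }

    protected override void Rewrite()
    {
        Visits++;
        // Protected calls on instances other than 'this'.
        if (left != null) left.Rewrite();
        if (right != null) right.Rewrite();
    }

    public void RewriteAll() { Rewrite(); }
}

static class Program
{
    static void Main()
    {
        RedBlackNode l = new RedBlackNode(null, null);
        RedBlackNode r = new RedBlackNode(null, null);
        RedBlackNode root = new RedBlackNode(l, r);
        root.RewriteAll();
        Console.WriteLine(root.Visits + l.Visits + r.Visits); // prints 3
    }
}
```

Had the children been declared as `TreeNode`, the calls `left.Rewrite()` and `right.Rewrite()` would be rejected with CS1540, since some other kind of tree could be hiding behind that type.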
Good question, by the way. I'll use this as a blog post next year.
Eric Lippert
The Blog Post: http://blogs.msdn.com/ericlippert/archive/2010/01/14/why-cant-i-access-a-protected-member-from-a-derived-class-part-six.aspx
Brian