First off, let me just say that Jon's answer is correct. This is one of the hairiest parts of the spec, so good on Jon for diving into it head first.
Second, let me say that this line:
An implicit conversion exists from a method group to a compatible delegate type
(emphasis added) is deeply misleading and unfortunate. I'll have a talk with Mads about getting the word "compatible" removed here.
The reason this is misleading and unfortunate is that it looks like it is calling out to section 15.2, "Delegate compatibility". Section 15.2 describes the compatibility relationship between methods and delegate types, but this is a question of convertibility between method groups and delegate types, which is a different thing.
Now that we've got that out of the way, we can walk through section 6.6 of the spec and see what we get.
To do overload resolution we need to first determine which overloads are applicable candidates. A candidate is applicable if all the arguments are implicitly convertible to the formal parameter types. Consider this simplified version of your program:
class Program
{
    delegate void D1();
    delegate string D2();
    static string X() { return null; }
    static void Y(D1 d1) {}
    static void Y(D2 d2) {}
    static void Main()
    {
        Y(X);
    }
}
So let's go through it line by line.
An implicit conversion exists from a method group to a compatible delegate type.
I've already discussed how the word "compatible" is unfortunate here. Moving on. The question we face when doing overload resolution on Y(X) is: does method group X convert to D1? Does it convert to D2?
Given a delegate type D and an expression E that is classified as a method group, an implicit conversion exists from E to D if E contains at least one method that is applicable [...] to an argument list constructed by use of the parameter types and modifiers of D, as described in the following.
So far so good. X might contain a method that is applicable with the argument lists of D1 or D2.
The compile-time application of a conversion from a method group E to a delegate type D is described in the following.
This line really doesn't say anything interesting.
Note that the existence of an implicit conversion from E to D does not guarantee that the compile-time application of the conversion will succeed without error.
This line is fascinating. It means that there are implicit conversions which exist, but which are subject to being turned into errors! This is a bizarre rule of C#. To digress a moment, here's an example:
void Q(Expression<Func<string>> f){}
string M(int x) { ... }
...
int y = 123;
Q(()=>M(y++));
An increment operation is illegal in an expression tree. However, the lambda is still convertible to the expression tree type, even though if the conversion is ever used, it is an error! The principle here is that we might want to change the rules of what can go in an expression tree later; changing those rules should not change the type system rules. We want to force you to make your programs unambiguous now, so that when we change the rules for expression trees in the future to make them better, we don't introduce breaking changes in overload resolution.
Anyway, this is another example of this sort of bizarre rule. A conversion can exist for the purposes of overload resolution, but be an error to actually use. Though in fact, that is not exactly the situation we are in here.
Moving on:
A single method M is selected corresponding to a method invocation of the form E(A) [...] The argument list A is a list of expressions, each classified as a variable [...] of the corresponding parameter in the formal-parameter-list of D.
OK. So we do overload resolution on X with respect to D1. The formal parameter list of D1 is empty, so we do overload resolution on X() and joy, we find a method "string X()" that works. Similarly, the formal parameter list of D2 is empty. Again, we find that "string X()" is a method that works here too.
The principle here is that determining method group convertibility requires selecting a method from a method group using overload resolution, and overload resolution does not consider return types.
If the algorithm [...] produces an error, then a compile-time error occurs. Otherwise the algorithm produces a single best method M having the same number of parameters as D and the conversion is considered to exist.
There is only one method in the method group X, so it must be the best. We've successfully proven that a conversion exists from X to D1 and from X to D2.
Now, is this line relevant?
The selected method M must be compatible with the delegate type D, or otherwise, a compile-time error occurs.
Actually, no, not in this program. We never get as far as activating this line. Because, remember, what we're doing here is trying to do overload resolution on Y(X). We have two candidates Y(D1) and Y(D2). Both are applicable. Which is better? Nowhere in the specification do we describe betterness between these two possible conversions.
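For contrast, here is a minimal sketch of a situation where that line does come into play: a direct assignment of the method group to a delegate variable, with no overload resolution on Y involved. The delegate and method names are the ones from the simplified program; the string returned by X and the Console.WriteLine call are added only for illustration.

```csharp
using System;

class Program
{
    delegate void D1();
    delegate string D2();

    static string X() { return "x"; }

    static void Main()
    {
        // The conversion exists, and the selected method is compatible with D2.
        D2 d2 = X;
        Console.WriteLine(d2());

        // The conversion from X to D1 also "exists", but applying it is a
        // compile-time error because X has the wrong return type for D1.
        // Uncommenting the next line should make the program fail to compile.
        // D1 d1 = X;
    }
}
```

Here the compatibility check is the only thing standing between the method group and the delegate, so the error is reported at the conversion itself rather than as an ambiguity.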
Now, one could certainly argue that a valid conversion is better than one that produces an error. That would then effectively be saying, in this case, that overload resolution DOES consider return types, which is something we want to avoid. The question then is which principle is better: (1) maintain the invariant that overload resolution does not consider return types, or (2) try to pick a conversion we know will work over one we know will not?
This is a judgment call. With lambdas, we do consider the return type in these sorts of conversions, in section 7.4.3.3:
E is an anonymous function, T1 and T2 are delegate types or expression tree types with identical parameter lists, an inferred return type X exists for E in the context of that parameter list, and one of the following holds:
T1 has a return type Y1, and T2 has a return type Y2, and the conversion from X to Y1 is better than the conversion from X to Y2
T1 has a return type Y, and T2 is void returning
It is unfortunate that method group conversions and lambda conversions are inconsistent in this respect. However, I can live with it.
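To make the inconsistency concrete, here is a small sketch of the lambda case, reusing D1, D2, X and Y from the simplified program. The bodies of Y and the string returned by X are added purely so the chosen overload is visible. Both overloads are applicable, because the lambda body X() is also valid as a statement, but the "T2 is void returning" rule quoted above makes the conversion to D2 the better one.

```csharp
using System;

class Program
{
    delegate void D1();
    delegate string D2();

    static string X() { return "x"; }

    static void Y(D1 d1) { Console.WriteLine("Y(D1) chosen"); }
    static void Y(D2 d2) { Console.WriteLine("Y(D2) chosen"); }

    static void Main()
    {
        // The lambda converts to both D1 and D2. Its inferred return type is
        // string, D2 returns string and D1 returns void, so the conversion
        // to D2 is better and overload resolution succeeds.
        Y(() => X());   // prints "Y(D2) chosen"
    }
}
```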
Anyway, we have no "betterness" rule to determine which conversion is better, X to D1 or X to D2. Therefore we give an ambiguity error on the resolution of Y(X).
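If you hit this in practice, one way around it is to tell the compiler which delegate type you mean, so that overload resolution on Y no longer has to choose between two method group conversions. The following is a sketch under the rules walked through above; the bodies of Y and the string returned by X are illustrative additions.

```csharp
using System;

class Program
{
    delegate void D1();
    delegate string D2();

    static string X() { return "x"; }

    static void Y(D1 d1) { Console.WriteLine("Y(D1) chosen"); }
    static void Y(D2 d2) { Console.WriteLine("Y(D2) chosen"); }

    static void Main()
    {
        // Y(X);        // the ambiguous call discussed above

        // An explicit cast gives the argument the type D2, so only the
        // identity conversion to Y(D2)'s parameter is in play.
        Y((D2)X);       // prints "Y(D2) chosen"

        // A delegate creation expression has the same effect.
        Y(new D2(X));   // prints "Y(D2) chosen"
    }
}
```

Once the argument already has type D2, there is no implicit conversion from D2 to D1, so Y(D1) is not even an applicable candidate.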