views:

155

answers:

4

I have an overload method - the first implementation always returns a single object, the second implementation always returns an enumeration.

I'd like to make the methods generic and overloaded, and restrict the compiler from attempting to bind to the non-enumeration method when the generic type is enumerable...

class Cache
{
    T GetOrAdd<T> (string cachekey, Func<T> fnGetItem)
        where T : {is not IEnumerable}
    {
    }

    T[] GetOrAdd<T> (string cachekey, Func<IEnumerable<T>> fnGetItem)
    {
    }
}

To be used with...

{
    // The compile should choose the 1st overload
    var customer = Cache.GetOrAdd("FirstCustomer", () => context.Customers.First());

    // The compile should choose the 2nd overload
    var customers = Cache.GetOrAdd("AllCustomers", () => context.Customers.ToArray());
}

Is this just plain bad code-smell that I'm infringing on here, or is it possible to disambiguate the above methods so that the compiler will always get the calling code right?

Up votes for anyone who can produce any answer other than "rename one of the methods".

+1  A: 

constraints don't support exclusion, which may seem frustrating at first, but is consistent and makes sense (consider, for example, that interfaces don't dictate what implementations can't do).

That being said, you could play around with the constraints of your IEnumerable overload...maybe change your method to have two generic typings <X, T> with a constraint like "where X : IEnumerable<T>" ?

ETA the following code sample:

  void T[] GetOrAdd<X,T> (string cachekey, Func<X> fnGetItem) 
            where X : IEnumerable<T>
    { 
    }
Steven
+8  A: 

Rename one of the methods. You'll notice that List<T> has an Add and and AddRange method; follow that pattern. Doing something to an item and doing something to a sequence of items are logically different tasks, so make the methods have different names.

Eric Lippert
It appears the OP is designing a caching interface. Having written one already myself, I believe the cache should not care whether the object attempting to be stored is a collection or not; that's just an implementation detail of the serialization process. Only one method should suffice all needs here, in my opinion.
James Dunne
Ok, I admit it. I LOL'd.
MojoFilter
@MojoFilter: You LOL'd at what?
James Dunne
Mark: Up votes for anyone who can produce any answer other than "rename one of the methods".Eric: Rename one of the methods.
MojoFilter
@Mojofilter: Clearly I'm not in it for the upvotes.
Eric Lippert
+2  A: 

Use only one method and have it detect the IEnumerable<T> case dynamically rather than attempting the impossible via generic constraints. It would be "code smell" to have to deal with two different cache methods depending on if the object to store/retrieve is something enumerable or not. Also, just because it implements IEnumerable<T> does not mean it is necessarily a collection.

James Dunne
@James - I think you've pretty much nailed it. After reflecting on the answers posted, I believe what I was attempting is bad code small... if the calling code should be oblovious to the differentiual workings of the method/s, then there should be no differentiation in the method signature... that is... there should be one method which, as you've pointed out, does it's checking internally. Thank you!
Mark
@Mark - I hope that helped. Just trying to prevent you from solving the wrong problem :). Accepted answer?
James Dunne
+5  A: 

This is a difficult use case to support because of how the C# compiler performs overload resolution and how it decides which method to bind to.

The first issue is that constraints are not part of the signature of a method and won't be considered for overload resolution.

The second problem you've got to overcome is that the compiler chooses the best match from the available signatures - which, when dealing with generics, generally means that SomeMethod<T>(T) will be considered a better match than SomeMethod<T>( IEnumerable<T> ) ... particularly when you've got parameters like T[] or List<T>.

But more fundamentally, you have to consider whether operating on a single value vs. a collection of values is really the same operation. If they are logically different, then you probably want to use different names just for clarity. Perhaps there are some use cases where you could argue that the semantic differences between single objects and collections of objects are not meaningful ... but in that case, why implement two different methods at all? It's unclear that method overloading is the best way to express the differences. Let's look at an example that lends to the confusion:

Cache.GetOrAdd("abc", () => context.Customers.Frobble() );

First, note that in the example above we are choosing to ignore the return parameter. Second, notice that we call some method Frobble() on the Customers collection. Now can you tell me which overload of GetOrAdd() will be called? Clearly without knowing the type that Frobble() returns it's not possible. Personally I believe that code whose semantics can't be readily inferred from the syntax should be avoided when possible. If we choose better names, this issue is alleviated:

Cache.Add( "abc", () => context.Customers.Frobble() );
Cache.AddRange( "xyz", () => context.Customers.Frobble() );

Ultimately, there are only three options to disambiguate the methods in your example:

  1. Change the name of one of the methods.
  2. Cast to IEnumerable<T> wherever you call the second overload.
  3. Change the signature of one of the methods in a way that the compiler can differentiate.

Option 1 is self-evident, so I'll say no more about it.

Options 2 is also easy to understand:

var customers = Cache.GetOrAdd("All", 
     () => (IEnumerable<Customer>)context.Customers.ToArray());

Option 3 is more complicated. Let's look at ways we can be achieve it.

On approach is by changing the signature of the Func<> delegate, for instance:

 T GetOrAdd<T> (string cachekey, Func<object,T> fnGetItem)
T[] GetOrAdd<T> (string cachekey, Func<IEnumerable<T>> fnGetItem)

// now we can do:
var customer = Cache.GetOrAdd("First", _ => context.Customers.First());
var customers = Cache.GetOrAdd("All", () => context.Customers.ToArray());

Personally, I find this option terribly ugly, unintuitive, and confusing. Introducing an unused parameter is terrible ... but, sadly it will work.

An alternative way of changing the signature (which is somewhat less terrible) is to make the return value an out parameter:

void GetOrAdd<T> (string cachekey, Func<object,T> fnGetItem, out T);
void GetOrAdd<T> (string cachekey, Func<IEnumerable<T>> fnGetItem, out T[])

// now we can write:
Customer customer;
Cache.GetOrAdd("First", _ => context.Customers.First(), out customer);

Customer[] customers;
var customers = Cache.GetOrAdd("All", 
                               () => context.Customers.ToArray(), out customers);

But is this really better? It prevents us from using these methods as parameters of other method calls. It also makes the code less clear and less understandable, IMO.

A final alternative I'll present is to add another generic parameter to the methods which identifies the type of the return value:

T GetOrAdd<T> (string cachekey, Func<T> fnGetItem);
R[] GetOrAdd<T,R> (string cachekey, Func<IEnumerable<T>> fnGetItem);

// now we can do:
var customer = Cache.GetOrAdd("First", _ => context.Customers.First());
var customers = Cache.GetOrAdd<Customer,Customer>("All", () => context.Customers.ToArray());

So can use hints to help the compiler to choose an overload for us ... sure. But look at all of the extra work we have to do as the developer to get there (not to mention the introduced ugliness and opportunity for mistakes). Is it really worth the effort? Particularly when an easy and reliable technique (naming the methods differently) already exists to help us?

LBushkin
@LBushkin - A brilliant, well explained answer... thank you!
Mark