views:

194

answers:

6

EDIT:

From the answers given, it's been made rather clear to me how the design I'm asking about below should actually be implemented. With those suggestions in mind (and in response to a comment politely pointing out that my example code does not even compile), I've edited the following code to reflect what the general consensus seems to be. The question that remains may no longer make sense in light of the code, but I'm leaving it as it is for posterity.


Suppose I have three overloads of a function, one taking IEnumerable<T>, one taking ICollection<T>, and one taking IList<T>, something like the following:

public static T GetMiddle<T>(IEnumerable<T> values) {
    IList<T> list = values as IList<T>;
    if (list != null) return GetMiddle(list);

    int count = GetCount<T>(values);

    T middle = default(T);
    int index = 0;

    foreach (T value in values) {
        if (index++ >= count / 2) {
            middle = value;
            break;
        }
    }

    return middle;
}

private static T GetMiddle<T>(IList<T> values) {
    int middleIndex = values.Count / 2;
    return values[middleIndex];
}

private static int GetCount<T>(IEnumerable<T> values) {
    // if values is actually an ICollection<T> (e.g., List<T>),
    // we can get the count quite cheaply
    ICollection<T> genericCollection = values as ICollection<T>;
    if (genericCollection != null) return genericCollection.Count;

    // same for ICollection (e.g., Queue<T>, Stack<T>)
    ICollection collection = values as ICollection;
    if (collection != null) return collection.Count;

    // otherwise, we've got to count values ourselves
    int count = 0;
    foreach (T value in values) count++;

    return count;
}

The idea here is that, if I've got an IList<T>, that makes my job easiest; on the other hand, I can still do the job with an ICollection<T> or even an IEnumerable<T>; the implementation for those interfaces just isn't as efficient.

I wasn't sure if this would even work (if the runtime would be able to choose an overload based on the parameter passed), but I've tested it and it seems to.

My question is: is there a problem with this approach that I haven't thought of? Alternately, is this in fact a good approach, but there's a better way of accomplishing it (maybe by attempting to cast the values argument up to an IList<T> first and running the more efficient overload if the cast works)? I'm just interested to know others' thoughts.

A: 

Usually when designing interfaces you want to accept a 'lowest common denominator' type for the arguments. For return types it is a matter of some debate. I generally think creating the above overloads is overkill. It's biggest problem is the introduction of unneeded code-paths that now must be tested. Better to have one method that performs the operation one way and works 100% of the time. With the given overloads above you might have an inconsistency in behavior and not even realize it, or worse yet you may accidentally introduce a change in one and not in the other copies.

If you can do it with IEnumerable<T> then use that, if not then use the least interface needed.

csharptest.net
What you're saying makes sense, though I do think there is justification for utilizing certain optimizations if the supplied argument(s) may be cast to certain types, as suggested by the other answers, for better performance. It seems to me the best compromise is to make the "lowest common denominator" method the only public one, and to potentially utilize more performant overloads internally where appropriate.
Dan Tao
+1  A: 

I'd say it's uncommon, and potentially confusing, so would be unlikely to be a good choice for a public API.

You could accept an IEnumerable<T> parameter, and internally check if it is in fact an ICollection<T> or IList<T>, and optimize accordingly.

This might be analagous to some of the optimizations in some of the IEnumerable<T> extension methods in the .NET 3.5 Framework.

Joe
"... and internally check if it is..." the only problem with this approach is that it too creates more code paths. Keep it simple and optimize only when the profiler tells you to should be your default answer.
csharptest.net
@csharptest.net: That is very bold advice to give without any qualifications, like "assuming you're not writing high-performance code" (not worrying about performance until it's absolutely necessary may be fine in many situations, but if performance is always critical then you need to always be thinking about it).
Dan Tao
If you're so concerned about performance, you can get out of a virtual machine, then! That's another discussion topic.
Simón
@csharptest.net: On the other hand having too many overloads in the public API has its disadvantages as well.
Brian Gideon
@Simón: I think there's a difference between being "so concerned about performance" and simply wanting to do things efficiently, especially where it's perfectly obvious how to. Sometimes I get the feeling that certain developers actually look *down* on those who try to keep performance in mind, which is just absurd. But that is *also* another discussion topic.
Dan Tao
+5  A: 

If you have a look at how LINQ extension methods are implemented using Reflector, you can see that a few extension methods on IEnumerable<T>, such as Count(), attempt to cast the sequence to an ICollection<T> or an IList<T> to optimize the operation (for example, using the ICollection<T>.Count property instead of iterating through an IEnumerable<T> and counting the elements). So your best bet is most likely to accept an IEnumerable<T> and then do this kind of optimizations if ICollection<T> or IList<T> are available.

Trillian
Please forgive a probably "naive" follow-up on your obviously excellent answer : when you say "then do this kind of optimizations if ICollection<T> or IList<T> are available," that suggests to me there could be cases where ICollection<T> and IList<T> are not available, and I am trying to imagine what those cases could be. If you care to clarify, that would be most appreciated. thanks,
BillW
@BillW: Just because an object implements `IEnumerable<T>` it doesn't follow that it implements either of those other interfaces. I could write my own class that lets you enumerate over it but doesn't provide `Add`, `Contains`, `Remove`, etc. This is the scenario Trillian is talking about.
Dan Tao
"I am trying to imagine what those cases could be" - the most common case is "lazy iteration" using the yield keyword in an iterator block.
Joe
@BillW: A good example is if you're doing .Where(...).Last(), then Last gets the enumerator which is the result of the Where, and that is nothing more than an IEnumerable<T>, you can't get an ICollection<T> from it. That's only an example, don't use .Where(...).Last() as there's an overload for Last which accepts a filter delegate.
Trillian
+2  A: 

I think one version accepting IEnumerable<T> would be the way to go, and check inside the method if the parameter is one of the more derived collection types. With three versions as you propose, you lose the efficiency benefit if someone passes you a (runtime) IList<T> that the compiler statically considers an IEnumerable<T>:

        IList<string> stringList = new List<string> { "A", "B", "C" };
        IEnumerable<string> seq = stringList;
        Extensions.GetMiddle(stringList); // calls IList version
        Extensions.GetMiddle(seq);        // calls IEnumerable version
Wayne
Good point! I think you're right; a single public method and (potentially) multiple private method overloads is the way to go.
Dan Tao
A: 

No. It's certainly uncommon.

Anyway. Since IList<T> inherits from ICollection<T> and IEnumerable<T>, and ICollection<T> inherits from IEnumerable<T>, your only concern would be performance in IEnumerable<T> types.

I just see no reason to overload the function in that way, providing different signatures to achieve exactly the same result and accepting exactly the same types as parameter (no matter if you have an IEnumerable<T> or IList<T>, you would be able to pass it to any of the three overloads); that would just cause confusion.

When you overload a function, is just to provide a way to pass a different type of parameter that you cannot pass to the function if it would not have that signature.

Don't optimize unless it's really necessary. If you want to optimize, do it undercover. You won't pretend someone using your class to be aware of that "optimization" in order to decide which method signature to use, right?

Simón
A: 

I am really indifferent. If I saw it your way I would not think anything of it. But Joe's idea has merit. It might look like the following.

public static T GetMiddle<T>(IEnumerable<T> values)
{
  if (values is IList<T>) return GetMiddle((IList<T>)values);
  if (values is ICollection<T>) return GetMiddle((ICollection<T>)values);

  // Use the default implementation here.
}

private static T GetMiddle<T>(ICollection<T> values)
{
}

private static T GetMiddle<T>(IList<T> values)
{
}
Brian Gideon
Yep, that's pretty much what I had in mind after reading some of the other responses. Thanks!
Dan Tao