I'm in the middle of reading Code Complete, and towards the end of the book, in the chapter about refactoring, the author lists a bunch of things you should do to improve the quality of your code while refactoring.

One of his points was to always return as specific a type of data as possible, especially when returning collections, iterators etc. So, as I've understood it, instead of returning, say, Collection<String>, you should return HashSet<String>, if that is the data type you use inside the method.

This confuses me, because it sounds like he's encouraging people to break the rule of information hiding. Now, I understand this when talking about accessors; that's a clear-cut case. But when calculating and mangling data, where the level of abstraction of the method implies no direct data structure, I find it best to return as abstract a datatype as possible, as long as the data doesn't fall apart (I wouldn't return Object instead of Iterable<String>, for example).

So, my question is: is there a deeper philosophy behind Code Complete's advice of always returning as specific a data type as possible, and allowing downcasting, instead of maintaining a need-to-know basis, that I've just not understood?

A: 

I could see how, in some cases, having a more specific data type returned could be useful. For example knowing that the return value is a LinkedList rather than just List would allow you to do a delete from the list knowing that it will be efficient.

Marc Novakowski
Yes, in some cases there is a good reason in returning a LinkedList. But I would change the return type once the specific need arises, after a good consideration, and not by default.
Henrik Paul
That's the wrong approach; you should always return List. If you know that clients will be deleting a lot from your list, then it's sensible to return a LinkedList. Returning a concrete type rather than an interface is almost never the right thing to do. Methods are about contracts, and the contract says I'm returning a list of X. How the list does its thing is immaterial.
mP
Yes, but what if (for example), the client has to iterate through the list? By making the signature (and thus docs) use LinkedList as the return type, the client knows they should not treat the list as a random access list. Thus, even if all lists have List.get(int index), they will know to avoid using that method with the list you're returning. Basically, the specified return type should provide information when the caller has a legitimate need to know.
Matthew Flaschen
+1  A: 

Can't find any evidence to substantiate my claim but the idea/guideline seems to be:

Be as lenient as possible when accepting input. Choose a generalized type over a specialized type. This means clients can use your method with different specialized types. So an IEnumerable or an IList as an input parameter would mean that the method can run off an ArrayList or a ListItemCollection. It maximizes the chance that your method is useful.

Be as strict as possible when returning values. Prefer a specialized type if possible. This means clients do not have to second-guess or jump through hoops to process the return value. Also, specialized types have greater functionality. If you choose to return an IList or an IEnumerable, the number of things the caller can do with your return value drastically reduces. E.g. if you return an IEnumerable rather than an ArrayList, then to get the number of elements returned (the Count property) the client must downcast. But such downcasting defeats the purpose: it works today but won't tomorrow (if you change the type of the returned object). So for all practical purposes, the client can't get a count of the elements easily, leading them to write mundane boilerplate code (in multiple places or as a helper method).

The summary here is that it depends on the context (there are exceptions to most rules). E.g. if the most probable use of your return value is that clients would search the returned list for some element, it makes sense to return a List implementation (type) that supports some kind of search method. Make it as easy as possible for the client to consume the return value.
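
A rough Java sketch of the two guidelines above (the answer's own examples are .NET types; the class and method names here are made up purely for illustration): the parameter is the general Collection, while the return type is the concrete ArrayList the method actually builds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Locale;

// Hypothetical example class; the names are illustrative only.
class LenientInStrictOut {

    // Lenient input: any Collection of strings is accepted.
    // Strict output: the concrete ArrayList that is actually built.
    static ArrayList<String> upperCased(Collection<String> words) {
        ArrayList<String> result = new ArrayList<>(words.size());
        for (String word : words) {
            result.add(word.toUpperCase(Locale.ROOT));
        }
        return result;
    }

    public static void main(String[] args) {
        // The same method runs off a List...
        ArrayList<String> fromList = upperCased(Arrays.asList("a", "b"));
        // ...and off a Set, because the parameter is only a Collection.
        ArrayList<String> fromSet = upperCased(new HashSet<>(Arrays.asList("c")));

        // An ArrayList-specific method is available without a downcast.
        fromList.ensureCapacity(100);
        System.out.println(fromList + " " + fromSet);
    }
}
```

The trade-off is the one discussed elsewhere in this thread: callers get the extra methods for free, but the signature now promises an ArrayList.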

Gishu
A: 

I think that, when designing interfaces, you should design a method to return as abstract a data type as possible. Returning a specific type would make it clearer what the method returns.

Also, I would understand it in this way:

Return as abstract a data type as possible = return as specific a data type as possible

i.e. when your method is supposed to return some collection data type, return Collection rather than Object.

Tell me if I'm wrong.

24x7Programmer
A: 

A specific return type is much more valuable because it:

  1. reduces possible performance issues with discovering functionality with casting or reflection
  2. increases code readability
  3. does NOT, in fact, expose more than is necessary.

The return type of a function is specifically chosen to cater to ALL of its callers. It is the calling function that should USE the return variable as abstractly as possible, since the calling function knows how the data will be used.

Is it only necessary to traverse the structure? Is it necessary to sort the structure? Transform it? Clone it? These are questions only the caller can answer, and so it is the caller who can use an abstracted type. The called function MUST provide for all of these cases.

If, in fact, the most specific use case you have right now is Iterable<String>, then that's fine. But more often than not, your callers will eventually need more details, so start with a specific return type. It doesn't cost anything.
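
As a minimal sketch of that split (all names here are hypothetical, not from any real API): the callee declares the concrete type it builds, and each caller holds the result as abstractly as its own needs allow.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class; the names are illustrative only.
class SpecificReturnAbstractUse {

    // Callee: declares the concrete type it actually builds.
    static ArrayList<String> loadNames() {
        ArrayList<String> names = new ArrayList<>();
        names.add("carol");
        names.add("alice");
        names.add("bob");
        return names;
    }

    // This caller only traverses, so it holds the result as an Iterable.
    static int totalLength() {
        Iterable<String> names = loadNames();
        int total = 0;
        for (String name : names) {
            total += name.length();
        }
        return total;
    }

    // This caller needs to sort, so it holds the result as a List.
    static List<String> sortedNames() {
        List<String> names = loadNames();
        Collections.sort(names);
        return names;
    }
}
```

No cast is needed in either caller, since widening from ArrayList to List or Iterable is implicit.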

Jeff Meatball Yang
+1  A: 

Most of the time one should return an interface, or perhaps an abstract type, that represents the value being returned. If you are returning a list of X, then use List. This ultimately provides maximum flexibility if the need arises to change the list type.

Maybe later you realise that you want to return a linked list or a read-only list etc. If you put a concrete type in the signature, you're stuck, and it's a pain to change. Using the interface solves this problem.
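
A small sketch of that situation, with hypothetical method names: two versions of the same method keep the signature List<Integer> while the implementation behind it changes, and a caller written against List works with both.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

// Hypothetical example class; the names are illustrative only.
class InterfaceReturn {

    // "Version 1": backed by an ArrayList.
    static List<Integer> squaresV1(int n) {
        List<Integer> squares = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            squares.add(i * i);
        }
        return squares;
    }

    // "Version 2": now a read-only view over a LinkedList.
    // The declared return type is unchanged.
    static List<Integer> squaresV2(int n) {
        List<Integer> squares = new LinkedList<>();
        for (int i = 1; i <= n; i++) {
            squares.add(i * i);
        }
        return Collections.unmodifiableList(squares);
    }

    // A caller written against List compiles and runs with either version.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }
}
```

Had the first version declared ArrayList<Integer> as its return type, the second version would have been a breaking change for any caller that stored the result in an ArrayList variable.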

@Gishu

If your API requires that clients cast straight away most of the time, your design is suckered. Why bother returning X if clients need to cast to Y?

mP
What language are you in where you have to cast a more specific type to a super-type?
Pete Kirkham
mP
+3  A: 

I think it is simply wrong in most cases. It has to be: be as lenient as possible, be as specific as needed.

In my opinion, you should always return List rather than LinkedList or ArrayList, because the difference is more an implementation detail than a semantic one. The guys from the Google collections API for Java take this one step further: they return (and expect) iterators where that's enough. But they also recommend returning ImmutableList, -Set, -Map etc. where possible, to show the caller that he doesn't have to make a defensive copy.

Besides that, I think the performance of the different list implementations isn't the bottleneck for most applications.

Tim Büthe
I think I agree with this. Basically, tell the callers everything they need to know and no more. However, I may disagree with your specific example. In /some/ cases, the caller does need to know whether they're getting an ArrayList or a LinkedList. They have very different performance characteristics.
Matthew Flaschen
@Matthew If these performance differences are an issue, then it is not enough to know the concrete type; the caller must be able to select it (e.g. by passing in an empty collection).
starblue
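
One way the "passing in an empty collection" idea might look in Java is sketched below. The method and class names are invented for illustration; the generic parameter lets the caller pick the implementation and get it back with its own static type.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedList;

// Hypothetical example class; the names are illustrative only.
class CallerChoosesCollection {

    // Fills whatever collection the caller supplies and returns that
    // same object, typed as the caller's own collection type.
    static <C extends Collection<String>> C splitWords(String text, C target) {
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                target.add(word);
            }
        }
        return target;
    }

    public static void main(String[] args) {
        // One caller wants cheap removal at the ends...
        LinkedList<String> linked = splitWords("a b c", new LinkedList<String>());
        linked.removeFirst();

        // ...another wants fast random access.
        ArrayList<String> array = splitWords("a b c", new ArrayList<String>());
        System.out.println(linked + " " + array.get(2));
    }
}
```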
I totally agree with starblue.
Tim Büthe