views:

403

answers:

6

I wrote this handy, generic function for converting a collection of collections into a single set:

public static <T> Set<T> makeSet(Collection<Collection<T>> a_collection) {
 Iterator<Collection<T>> it = a_collection.iterator();
 Set<T> result = new HashSet<T>();
 while (it.hasNext()) {
  result.addAll(it.next());
 }
 return result;
}

Then I tried to call it:

    List<List<String>> resultLists = ... ;
    Set<String> labelsSet = CollectionsHelper.makeSet(resultLists);

and I received the following error:

<T>makeSet(java.util.Collection<java.util.Collection<T>>) in CollectionsHelper 
cannot be applied to (java.util.List<java.util.List<java.lang.String>>)

Now a List is a Collection, and a String is a T. So why doesn't this work and how do I fix it?

+4  A: 

Nope, it's not.

I would change the declaration to be

public static <T> Set<T> makeSet(Collection<? extends Collection<T>> a_collection) {
    ....
}

Two generic types can be subtypes only if the type arguments are identical (or with a wildcards, so Collection<String> is not a subtype of Collection<Object>. Check the Subtyping section of Generics tutorial.

notnoop
+7  A: 

Your signature should be:

public static <T> Set<T> makeSet(Collection<? extends Collection<T>> coll);

Basically List<S> is not a subtype of List<T> just because S is a subtype of T. That property is called covariance and, in Java, generic types are not covariant (other languages such as scala contain covariant generic types).

What you did didn't work because it should be possible to add any Collection<T> into a Collection<Collection<T>>, so for example, with your signature, this would be a valid implementation:

public static <T> Set<T> makeSet(Collection<Collection<T>> coll) {
    coll.add(new HashSet<T>());
    return null;
}

But then calling this method as follows:

List<List<String>> outside = new LinkedList<List<String>>();
makeSet(outside); //actually this line will not compile!
List<String> oops = outside.get(0); //oh dear - it's a HashSet

So does this lead to the same problem? NO! The reason being that the compiler will not let you add anything into a collection parameterized on an unknown type:

public static <T> Set<T> makeSet(Collection<? extends Collection<T>> coll) {
    coll.add(new HashSet<T>()); //this line will not compile
    return null;
}

Having wildcards was necessary in the first place so that you could do things like what you wanted to do, probably best demonstrated by how the Collection.addAll method was generified so that List<Number>.addAll(List<Integer>) would be allowed:

boolean addAll(Collection<? extends T> coll)
oxbow_lakes
If Collection were a concept rather than a base class, all this would be avoided. Or if I could tell the compiler that a_collection was const. I would prefer the former.
Mark Ruzon
+3  A: 

This is a specialized version of the more generalized question, "Is a Collection<Circle> a kind of Collection<Shape>?"

The answer is a (perhaps surprising) no.

The reasoning is well-stated in a C++ context at in the C++ FAQ. This is a general OO question, so the same general reasoning applies.

For example, consider an alternate universe where a Collection<Circle> is a kind-of Collection<Shape>. In this universe, you could do something like this:

Collection<Circle> circles = new Collection<Circle>();
Collection<Shape> shapes = circles; // OK, we're in an alternate universe
shapes.Add(new Circle()); // OK, we're adding a circle to a collection of circles
shapes.Add(new Square()); // Asplode!  We just added a square to a collection of circles.

What happens when the Square, a Shape, is added to the collection of shapes, which is really a collection of circles? There's no good answer.

The same reasoning applies to a Collection<List<T>> and a Collection<Collection<T>>. A Collection<List<T>> is not a kind-of Collection<Collection<T>> because it isn't substitutable for a Collection<Collection<T>>. A Queue<T> can be added to a collection of collections, but it cannot be added to a collection of List<T>.

Greg D
But my code isn't adding anything to the original collection; it is merely transforming the data into a separate data structure. This is a good explanation for why I usually avoid inheritance.
Mark Ruzon
Your code isn't changing it, but that doesn't change the fact that the concept is invalid for this reason. A Collection<Derived> is not substitutable for a Collection<Base>, therefore it is not a subclass of a Collection<Base>. This is fundamental OO theory that runs through many languages. It's more fundamental than your single example.
Greg D
+4  A: 
public static <T> Set<T> makeSet(Collection<? extends Collection<T>> a_collection) {
    Iterator<? extends Collection<T>> it = a_collection.iterator();
    Set<T> result = new HashSet<T>();
    while (it.hasNext()) {
            result.addAll(it.next());
    }
    return result;
}
Bozho
Three answers led to progress, but this one was the most complete. I will be investigating the "? extends T" concept very soon.
Mark Ruzon
Turns out I already did: http://blogs.sun.com/CoreJavaTechTips/entry/using_generics_with_wildcards_and. How embarrassing.
Mark Ruzon
A: 

This is he kind of stuff that led Java 1.0 to be developed to simplify all the dumb C++ templating that was happening. You add 5 layers of complication to avoid one stupid cast from a collection of objects to a set of your specific instance. Ok fine if you find that you are casting all over the place and making your code ugly, but really I bet this happens about once in 500k lines of code. Ya, ya, it is good that we can find these kinds of technical details, but is your code really becoming more maintainable when you start going down this path?

Zak
Don't hold back, tell us what you _really_ think.
Greg D
@Greg not being self-righteous, this *could* make some section of code a lot cleaner. I just really hope the OP is truly thinking about first making his code human readable, then optimizing it to avoid casts if necessary. The whole point of adding Generics (after effectively removing in from C++ style syntax everywhere else) was to clean up tons of *repetitive* casting. The point of removing it in the first place and making everything objects was to remove all the unneccessary and repetative *templating* . We need to try to remain clear on why we are doing what we are doing in code.
Zak
+2  A: 

I almost hate to post the correct answer, because it's so ugly, but since the three top answers have missed this, I feel compelled.

public static <T> Set<T> makeSet(
    Collection<? extends Collection<? extends T>> coll)

You read that right. Two "? extends"'s. Otherwise, you cannot put a List<Integer> and a List<Double> together to get a Set<Number>, which should logically be possible.

Once you get to generics nested inside generics, things always get nasty.

You can totally be forgiven for opting for the simpler answer instead. :) Just know that it won't always work where it logically should.

Incidentally, with Google Collections, you can use Iterables.concat(...) to do this, or ImmutableSet.copyOf(Iterables.concat(...)) if you need the de-duping.

Kevin Bourrillion