views:

1194

answers:

5

I'd like to have a SortedSet of Collections (Sets themselves, in this case, but not necessarily in general), that sorts by Collection size. This seems to violate the proscription to have the Comparator be consistent with equals() - i.e., two collections may be unequal (by having different elements), but compare to the same value (because they have the same number of elements).

Notionally, I could also put into the Comparator ways to sort sets of equal sizes, but the use of the sort wouldn't take advantage of that, and there's not really a useful + intuitive way to compare the Collections of equal size (at least, in my particular case), so that seems like a waste.

Does this case of inconsistency seem like a problem?

+1  A: 

There is no reason why a Comparator should return the same results as equals(). In fact, the Comparator API was introduced because equals() just isn't enough: If you want to sort a collection, you must know whether two elements are lesser or greater.

Aaron Digulla
This answer seems inconsistent with the javadoc for SortedSet: http://java.sun.com/j2se/1.5.0/docs/api/java/util/SortedSet.htmlCan you elaborate?
Carl
@Carl: Comparator != Comparable ... see my answer
Stephen C
+1  A: 

SortedSet interface extends core Set and thus should conform to the contract outlined in Set specification.

The only possible way to achieve that is to have your element's equal() method behavior be consistent with your comparator - the reason for that is that core Set operates based on equality whereas SortedSet operates based on comparison.

For example, add() method defined in core Set interface specifies that you can't add an element to the set if there already is an element whose equal() method would return true with this new element as argument. Well, SortedSet doesn't use equal(), it uses compareTo(). So if your compareTo() returns false your element WILL be added even if equal() were to return true, thus breaking Set contract.

None of this is a practical problem per se, however. SortedSet behavior is always consistent, even if compare() vs equal() are not.

ChssPly76
I'm more concerned about the opposite direction in this case - I want to add elements to a set when equals() is false, even if compareTo returns 0. Are there any implementations out there that do that?
Carl
Set implementations? No. Don't return zero from compareTo() if equals() returns false and you'll solve all your problems. You'll need to return consistent result, obviously, but that shouldn't be too hard (worst comes to worst, you can compare Set's hashcodes).Performance may suffer a little but (how common is this case, really?) but you'll have consistent semantics for both Set and SortedSet.
ChssPly76
A: 

It's a little bit odd that SortedSet as a part of the standard API breaks the contract defined in the Set interface and uses the Comparator to define equality instead of the equals method, but that's how it is.

If your actual problem is to sort a collection of collections according to the containted collections' size, you are better of with a List, which you can sort using Collections.sort(List, Comparator>);

jarnbjo
+1  A: 

As ChssPly76 wrote in a comment, you can use hashCode to decide the compareTo call in the case where two Collections have the same size but are not equal. This works fine, except in the rare case where you have two Collections with the same size, are not equal, but have the same hashCode. Admittedly, the chances of that happening are pretty small, but it is conceivable. If you want to be really careful, instead of hashCode, use System.identityHashCode instead. This should give you a unique number for each Collection, and you shouldn't get collisions.

At the end of the day, this gives you the functionality of having the Collections in the Set sorted by size, with arbitrary ordering in the case of two Collections with matching size. If this is all you need, it's not much slower than the usual comparison would be. If you need the ordering to be consistent between different JVM instances, this won't work and you'll have to do it some other way.

pseudocode:

if (a.equals(b)) {
    return 0;
} else if (a.size() > b.size()) {
    return 1;
} else if (b.size() > a.size()) {
    return -1;
} else {
    return System.identityHashCode(a) > System.identityHashCode(b) ? 1 : -1;
}
joe p
nope on the JVM issue, since the order of the same-sized sets is irrelevant. I was hoping to avoid this sort of solution, but looks like there's no way out of it.
Carl
+1  A: 

This seems to violate the proscription to have the Comparator be consistent with equals() - i.e., two collections may be unequal (by having different elements), but compare to the same value (because they have the same number of elements).

There is no requirement, either stated (in the Javadoc) or implied, that a Comparator be consistent with an object's implementation of boolean equals(Object).

Note that Comparable and Comparator are distinct interfaces with different purposes. Comparable is used to define a 'natural' order for a class. In that context, it would be a bad idea for equals and compateTo to be inconsistent. By contrast, a Comparator is used when you want to use a different order to the natural order of a class.

EDIT: Here's the complete paragraph from the Javadoc for SortedSet.

Note that the ordering maintained by a sorted set (whether or not an explicit comparator is provided) must be consistent with equals if the sorted set is to correctly implement the Set interface. (See the Comparable interface or Comparator interface for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a sorted set performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the sorted set, equal. The behavior of a sorted set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.

I have highlighted the final sentence. The point is that such a SortedSet will work as you would most likely expect, but the behavior of some operations won't exactly match the Set specification ... because the specification defines their behavior in terms of the equals method.

So in fact, there is a stated requirement for consistency (my mistake), but the consequences of ignoring it are not as bad as you might think. Of course, it is up to decide if you should do that. In my estimation, it should be OK, provided that you comment the code thoroughly and make sure that the SortedSet does not 'leak'.

However, it is not clear to me that a Comparator for collections that only looks at an collections "size" is going to work ... from a semantic perspective. I mean, do you really want to say that all collections with (say) 2 elements are equal? This will mean that your set can only ever contain one collection of any given size ...

Stephen C
The javadoc states: "the ordering maintained by a sorted set (whether or not an explicit comparator is provided) must be consistent with equals" and consistent with equals is defined as (compare(e1,e2)==0)==(equals(e1,e2)). I'm not sure how to resolve the difference between your opening statement and the one in the javadoc. Can you clarify?
Carl