In Java, the documentation of containsAll and retainAll in the AbstractCollection class explicitly states that cardinality is not respected; in other words, it does not matter how many instances of a value are on each side. Since all Java collections in the standard library extend AbstractCollection, it is assumed that all of them work the same way.
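To make the claim concrete, here is a minimal sketch using ArrayList from the standard library. The class and method names (CardinalityDemo, containsAllIgnoringCounts, retainAllIgnoringCounts) are made up for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; only the java.util calls are real API.
final class CardinalityDemo {

    // containsAll only asks whether each element of b occurs in a,
    // not how many times it occurs.
    static boolean containsAllIgnoringCounts() {
        List<String> a = new ArrayList<>(Arrays.asList("x"));
        List<String> b = Arrays.asList("x", "x", "x");
        return a.containsAll(b);
    }

    // retainAll keeps every element of a that occurs in b at least once,
    // so both copies of "x" survive even though b holds only one.
    static List<String> retainAllIgnoringCounts() {
        List<String> a = new ArrayList<>(Arrays.asList("x", "x", "y"));
        List<String> b = Arrays.asList("x");
        a.retainAll(b);
        return a;
    }

    public static void main(String[] args) {
        System.out.println(containsAllIgnoringCounts()); // true
        System.out.println(retainAllIgnoringCounts());   // [x, x]
    }
}
```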

However, the documentation of these methods in the Collection interface does not say anything. Is one supposed to infer from AbstractCollection, or was this left unspecified on purpose to allow one to define collections that work differently?

For example, Bag in apache-collections explicitly states that it does respect cardinality, and claims that it violates the contract of the version from Collection (even though it doesn't really).
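For contrast, a cardinality-respecting containsAll could look roughly like the sketch below. This is not the Apache implementation; it is a hypothetical helper (CountingContainsAll and its methods are invented names) that only illustrates the "at least n copies" idea using a plain HashMap of counts.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not taken from any library.
final class CountingContainsAll {

    // Count how many times each element occurs in c.
    static Map<Object, Integer> counts(Collection<?> c) {
        Map<Object, Integer> m = new HashMap<>();
        for (Object e : c) {
            m.merge(e, 1, Integer::sum);
        }
        return m;
    }

    // True only if a holds at least as many copies of every element as b does.
    static boolean containsAllRespectingCardinality(Collection<?> a, Collection<?> b) {
        Map<Object, Integer> have = counts(a);
        for (Map.Entry<Object, Integer> need : counts(b).entrySet()) {
            if (have.getOrDefault(need.getKey(), 0) < need.getValue()) {
                return false;
            }
        }
        return true;
    }
}
```

With a = [x] and b = [x, x], this helper returns false, whereas a.containsAll(b) on a standard ArrayList returns true.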

So, what are the semantics of these operations in Collection rather than in AbstractCollection?

Edit: To those who are wondering why I would care: as part of my Ph.D. work I demonstrated that developers don't expect the conformance violation in Apache, and I'm trying to understand why the Collection interface was left so ambiguous.

A: 

I don't think Collection defines it one way or the other; it simply became something of a convention to follow the AbstractCollection behavior. For example, google-collections does: see their Multiset documentation (Multiset is what they call a Bag).

Yardena
The problem is that the convention might cause misunderstandings of the interface. For example, the documentation of Bag in Apache says that this is stated in Collection, even though it is stated in AbstractCollection. It's generally not a good idea to infer interface contract from the typical behavior of an implementation...
Uri
I agree with you completely. This is a de facto, rather than de jure, contract. So I think it's OK to implement a collection differently, as long as the difference is emphasized in the Javadoc, just as Apache Collections did.
Yardena
A: 

The javadocs for containsAll (in Collection) say:

Returns: true if this collection contains all of the elements in the specified collection

and for retainAll (in Collection):

Retains only the elements in this collection that are contained in the specified collection (optional operation). In other words, removes from this collection all of its elements that are not contained in the specified collection.

I read containsAll's contract to mean that calling a.containsAll(b) will return true if and only if calling a.contains(bElem) for each element bElem in b would return true. I would also take it to imply that a.containsAll(someEmptyCollection) returns true. As you state, the javadocs for AbstractCollection say this more explicitly:

This implementation iterates over the specified collection, checking each element returned by the iterator in turn to see if it's contained in this collection. If all elements are so contained true is returned, otherwise false.
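That reading can be written out as a short loop. This is only a sketch of my interpretation, not the JDK source; ContainsAllReading and containsEach are hypothetical names.

```java
import java.util.Collection;

// Hypothetical class spelling out the per-element reading of containsAll.
final class ContainsAllReading {

    // Returns true iff a.contains(e) holds for every element e of b.
    static boolean containsEach(Collection<?> a, Collection<?> b) {
        for (Object e : b) {
            if (!a.contains(e)) {
                return false;
            }
        }
        // Reached when b is empty as well, so the result is vacuously true.
        return true;
    }
}
```

Under this reading, duplicates in b never matter, and an empty b always yields true.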

I agree that the contract for containsAll in Collection should be more explicit, to avoid any possibility of confusion. (And reading the javadocs for AbstractCollection should NOT have been necessary to confirm one's understanding of Collection.)

I would not have made an assumption with regard to the number of duplicate elements after a call to retainAll. The stated contract in Collection (by my reading) doesn't imply either way how duplicates in either collection should be handled. Based on my reading of retainAll in Collection, several possible results of a.retainAll(b) are all reasonable:

  1. result contains 1 of each element that has at least one copy in both a and b
  2. result contains each element (including duplicates) that was in a, except those that are not in b
  3. or even, result contains somewhere between 1 and the number of copies found in a of each element in a, except those not in b.

I would have expected either #1 or #2, but would assume any of the three to be legal based on the contract.

The javadocs for AbstractCollection confirm that it uses #2:

This implementation iterates over this collection, checking each element returned by the iterator in turn to see if it's contained in the specified collection. If it's not so contained, it's removed from this collection with the iterator's remove method
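To show how readings #1 and #2 differ on the same input, here is a small sketch. RetainAllReadings, oneCopyEach and allCopies are hypothetical names; both helpers lean on the standard retainAll of LinkedHashSet and ArrayList.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helpers contrasting two readings of a.retainAll(b).
final class RetainAllReadings {

    // Reading #1: keep a single copy of each element of a that also occurs in b.
    static <T> List<T> oneCopyEach(Collection<T> a, Collection<?> b) {
        Set<T> kept = new LinkedHashSet<>(a); // collapses duplicates, keeps order
        kept.retainAll(b);
        return new ArrayList<>(kept);
    }

    // Reading #2: keep every copy in a of any element that occurs in b.
    static <T> List<T> allCopies(Collection<T> a, Collection<?> b) {
        List<T> kept = new ArrayList<>(a);
        kept.retainAll(b);
        return kept;
    }
}
```

For a = [x, x, y, z] and b = [x, y, y], oneCopyEach gives [x, y] while allCopies gives [x, x, y]; the extra y in b changes neither result.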

However, since this isn't part of the Collection interface's contract as I read it, I wouldn't assume that Collection implementations in general behave this way.

Perhaps you should consider submitting suggested updates to the JavaDoc once you're done.

As to 'why the Collection interface was left so ambiguous': I seriously doubt it was done intentionally. It was probably just something that wasn't given its due priority when that part of the API was being written.

kenj0418