ansaurus

Question

Answer 1

A:

I'm not sure I agree with your statement...

A simple loop adds complexity, and is less "readable and obvious" than a call with a sensible name.

Refactoring evangelists would claim that your goal should generally be to create flat, short functions that call other operations. While addAll, find, and similar methods are easy to implement yourself, avoiding them forces the reader to grasp something more complex than a single word, and may lead to code duplication.

IMHO, CollectionUtils actually presents cleaner operations than the standard Java collection library.

Uri 2009-01-25 01:17:41

Although, in principle, I agree, some would argue that a call to CollectionUtils is less readable than a simple loop. Don't misunderstand me, there are some calls such as isEmpty that are very useful IMO.

javamonkey79 2009-01-25 02:51:09

Don't get me wrong, it's not like I always use the calls either... :)

Uri 2009-01-25 04:06:25

Answer 2

+1 A:

Although I agree with Uri in principle, without closures or function literals or whatever, Java imposes a pretty high syntactic cost to actually use methods like collect, transform, etc. In a lot of cases, the number of lines of code is the same as or greater than if you had written the simple loop. addAll, removeAll, and all their friends that don't take function objects as arguments are indispensable though.
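To make the comparison concrete, here is a minimal sketch of the same operation written both ways. The Transformer interface and the collect method below are hypothetical stand-ins defined locally so the example is self-contained; they only mimic the shape of a function-object API and are not the actual Commons Collections classes.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for a library's function-object interface.
interface Transformer<I, O> {
    O transform(I input);
}

class SyntacticCost {
    // Hypothetical stand-in for a library collect(): applies t to every element.
    static <I, O> List<O> collect(Collection<I> input, Transformer<I, O> t) {
        List<O> out = new ArrayList<O>();
        for (I item : input) {
            out.add(t.transform(item));
        }
        return out;
    }

    // Function-object version: the anonymous class alone takes several lines.
    static List<Integer> lengthsWithCollect(List<String> words) {
        return collect(words, new Transformer<String, Integer>() {
            @Override
            public Integer transform(String s) {
                return s.length();
            }
        });
    }

    // Plain loop version: roughly the same line count, and no extra types.
    static List<Integer> lengthsWithLoop(List<String> words) {
        List<Integer> lengths = new ArrayList<Integer>();
        for (String s : words) {
            lengths.add(s.length());
        }
        return lengths;
    }
}
```

Both methods return the same list; the difference is only in how much ceremony surrounds the one interesting expression, s.length().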

All of this also applies to the equally good Google Collections API.

It's sad that Sun has the power to fix these issues in Java 7, but it appears they won't. Cowards.

Dave Ray 2009-01-25 01:27:20

Yeah, you read my mind. I agree with Uri in principle as well, but have not found a good OO case where anything is saved.

javamonkey79 2009-01-25 02:49:27

Offensive!? Seriously? Did I hurt Sun's feelings? This place is really humorless today.

Dave Ray 2009-01-25 13:08:08

Answer 3

A:

While we don't use CollectionUtils, we have implemented a few similar utilities ourselves, and of those we frequently use:

empty(Collection c)

to test collections, strings and such for emptiness

has(...)

that only returns !empty(...)

mapToProperty(Collection c, String property, Class newType)

this maps a collection of T1 to a collection of T2 using reflection to call "property"

implode(Collection c, String sep)

a sep-separated string with the elements of c
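The post gives only the signatures, so the following is a rough sketch of how such helpers might look, not the poster's actual code. The class name CollectionHelpers, the null handling, and the assumption that "property" means a JavaBean-style getter (property "name" maps to getName()) are all guesses made for illustration.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

class CollectionHelpers {
    // Treating null as empty is an assumption, not something the post states.
    static boolean empty(Collection<?> c) {
        return c == null || c.isEmpty();
    }

    static boolean empty(String s) {
        return s == null || s.length() == 0;
    }

    // Simply the negation of empty(...), as described in the post.
    static boolean has(Collection<?> c) {
        return !empty(c);
    }

    // Assumed convention: property "name" is read through a public getName().
    static <T> List<T> mapToProperty(Collection<?> c, String property, Class<T> newType) {
        String getter = "get" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        List<T> result = new ArrayList<T>();
        for (Object element : c) {
            try {
                Method m = element.getClass().getMethod(getter);
                result.add(newType.cast(m.invoke(element)));
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("No readable property '" + property + "'", e);
            }
        }
        return result;
    }

    // Joins the string form of each element, placing sep between neighbours.
    static String implode(Collection<?> c, String sep) {
        StringBuilder sb = new StringBuilder();
        Iterator<?> it = c.iterator();
        while (it.hasNext()) {
            sb.append(it.next());
            if (it.hasNext()) {
                sb.append(sep);
            }
        }
        return sb.toString();
    }
}
```

With these definitions, implode(Arrays.asList("a", "b", "c"), ", ") yields "a, b, c". The reflective lookup in mapToProperty only works when the element's runtime class and getter are publicly accessible.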

John Nilsson 2009-01-25 01:53:13

Answer 4

+2 A:

I think the key issue is designing/coding for flexibility.

If you have a single use case (e.g. selecting the members of a collection that satisfy some specific condition), then coding a relatively simple loop by hand works. On the other hand...

Suppose that the set of possible conditions grew large, or could even be composed on-the-fly at run-time (even dynamically, based on user input/interaction). Or suppose that there were a few very complex conditions which could be composed with operators (e.g. A and B, C and not D, etc.) into even more cases.

Suppose that, having made the selection, there was some other processing that was to be done on the resulting collection.

Now consider the structure of the code that might result from a brute-force, in-line approach to writing the above: an outer loop containing a complex decision process to determine which test(s) to perform, mixed together with code that does one or more things with the "surviving" members of the collection. Such code tends to be (and especially to become over time with maintenance) difficult to understand and difficult to modify without the risk of introducing defects.

So the point is to pursue a strategy in which each aspect:

basic "select something" process,
predicates that express elementary criteria,
combining operators that compose predicates, and
transformers that operate on values,

can be coded and tested independently, then snapped together as needed.
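As a rough illustration of that strategy, here is a minimal sketch. It assumes Java 8 or later and uses java.util.function.Predicate and Function in place of the Commons Collections Predicate and Transformer types; the class name and the sample predicates are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

class SelectAndTransform {
    // The basic "select something" process, written once and reused.
    static <T> List<T> select(Collection<T> items, Predicate<? super T> criterion) {
        List<T> out = new ArrayList<>();
        for (T item : items) {
            if (criterion.test(item)) {
                out.add(item);
            }
        }
        return out;
    }

    // The basic "do something with each survivor" process.
    static <T, R> List<R> transform(Collection<T> items, Function<? super T, ? extends R> f) {
        List<R> out = new ArrayList<>();
        for (T item : items) {
            out.add(f.apply(item));
        }
        return out;
    }

    // Predicates that express elementary criteria (hypothetical examples).
    static final Predicate<String> IS_SHORT = s -> s.length() <= 3;
    static final Predicate<String> STARTS_WITH_A = s -> s.startsWith("a");

    // Combining operators compose predicates: "short and not starting with a".
    static final Predicate<String> SHORT_AND_NOT_A = IS_SHORT.and(STARTS_WITH_A.negate());

    // A transformer that operates on values.
    static final Function<String, String> TO_UPPER = s -> s.toUpperCase();
}
```

Each piece can be unit-tested on its own, and a new condition becomes a new composition such as IS_SHORT.or(STARTS_WITH_A) rather than another branch inside a loop.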

joel.neely 2009-01-25 04:24:53

Answer 5

+1 A:

collect() is useful when you have possible alternative representations of your objects.

For example, recently I was dealing with a piece of code that needed to match lists of objects from two different sources. These objects were of different classes as they were used at different points in the code, but for my purposes had the same relevant concepts (i.e. they both had an ID, both had a property path, both had a "cascade" flag etc.).

I found that it was much easier to define a simple intermediate representation of these properties (as an inner class), define transformers for both concrete object classes (again very simple as it's just using relevant accessor methods to get the properties out), and then use collect() to convert my incoming objects into the intermediate representation. Once they're there, I can use standard Collections methods to compare and manipulate the two as sets.

So as a (semi-)concrete example, let's say I need a method to check that the set of objects in the presentation layer is a subset of the objects cached in the data layer. With the approach outlined above this would be done something like this:

public boolean isColumnSubset(PresSpec pres, CachedDataSpec dataSpec)
{
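   // collect() returns a Collection rather than a List, so declare the results accordingly.
   // IntermediateRepresentation must implement equals() (and hashCode()) for containsAll() to work.
   final Collection<IntermediateRepresentation> presObjects = CollectionUtils.collect(pres.getObjects(), PRES_TRANSFORMER);
   final Collection<IntermediateRepresentation> dataObjects = CollectionUtils.collect(dataSpec.getCached(), DATA_TRANSFORMER);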

   return dataObjects.containsAll(presObjects);
}

To me this is much more readable, with the last line conveying a real sense of what the method is doing, than the equivalent with loops:

public boolean isColumnSubset(PresSpec pres, CachedDataSpec dataSpec)
{
   for (PresSpecificObject presObj : pres.getObjects())
   {
      boolean matched = false;
      for (CachedDataObject dataObj : dataSpec.getCached())
      {
         if (areObjectsEquivalent(presObj, dataObj)) // or do the tests inline but a method is cleaner
         {
            matched = true;
            break;
         }
      }

      if (!matched)
      {
         return false;
      }
   }

   // Every column must have matched
   return true;
}

The two are probably about as efficient, but in terms of readability I'd say that the first one is much easier to immediately understand. Even though it comes out as more lines of code overall (due to defining an inner class and two transformers), the separation of the traversal implementation from the actual "true or false" logic makes the latter much clearer. Plus if you have any KLOC metrics it can't be bad either. ;-)
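The intermediate representation and the two transformers are not shown above, so here is one way they might look. This is a sketch under assumptions: the fields and accessor names of the two source classes are invented, java.util.function.Function (Java 8+) stands in for the Commons Collections Transformer, and a local collect helper that builds a HashSet replaces CollectionUtils.collect so the example is self-contained.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.function.Function;

// Hypothetical source classes; only the differing accessors matter here.
class PresSpecificObject {
    private final String id, propertyPath;
    PresSpecificObject(String id, String propertyPath) { this.id = id; this.propertyPath = propertyPath; }
    String getId() { return id; }
    String getPropertyPath() { return propertyPath; }
}

class CachedDataObject {
    private final String key, path;
    CachedDataObject(String key, String path) { this.key = key; this.path = path; }
    String getKey() { return key; }
    String getPath() { return path; }
}

// The shared representation; equals/hashCode are what make set comparison work.
final class IntermediateRepresentation {
    private final String id, path;
    IntermediateRepresentation(String id, String path) { this.id = id; this.path = path; }
    @Override public boolean equals(Object o) {
        if (!(o instanceof IntermediateRepresentation)) return false;
        IntermediateRepresentation other = (IntermediateRepresentation) o;
        return id.equals(other.id) && path.equals(other.path);
    }
    @Override public int hashCode() { return Objects.hash(id, path); }
}

class ColumnSubsetSketch {
    static final Function<PresSpecificObject, IntermediateRepresentation> PRES_TRANSFORMER =
            p -> new IntermediateRepresentation(p.getId(), p.getPropertyPath());
    static final Function<CachedDataObject, IntermediateRepresentation> DATA_TRANSFORMER =
            d -> new IntermediateRepresentation(d.getKey(), d.getPath());

    // Local stand-in for CollectionUtils.collect, collecting into a set.
    static <I> Set<IntermediateRepresentation> collect(
            Collection<I> input, Function<? super I, IntermediateRepresentation> t) {
        Set<IntermediateRepresentation> out = new HashSet<>();
        for (I item : input) {
            out.add(t.apply(item));
        }
        return out;
    }

    static boolean isColumnSubset(Collection<PresSpecificObject> pres, Collection<CachedDataObject> cached) {
        return collect(cached, DATA_TRANSFORMER).containsAll(collect(pres, PRES_TRANSFORMER));
    }
}
```

Collecting into a HashSet is a design choice of this sketch: it makes the containsAll lookup hash-based instead of a linear scan per element, at the cost of requiring a consistent hashCode.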

Andrzej Doyle 2009-02-04 11:45:37

Yeah, I think you are hitting on the point that I was trying to get at... in an OO sense, these methods seem like they can be really useful. Nice answer +1 :D

javamonkey79 2009-02-05 04:03:31

Good uses for Apache CollectionUtils
