views:

13021

answers:

12

I want to filter a java.util.Collection based on a predicate.

+6  A: 

org.apache.commons.collections.CollectionUtils#filter(Collection,Predicate)

Kevin Wong
this is okay, but it's not generic, and it modifies the collection in place (not nice)
Kevin Wong
There are other filter methods in CollectionUtils that do not modify the original collection.
skaffman
In particular, the method that does *not* modify the collection in place is org.apache.commons.collections.CollectionUtils#select(Collection,Predicate)
Eero
+11  A: 

Consider Google Collections for an updated Collections framework that supports generics.

Heath Borders
http://code.google.com/p/google-collections/
John Topley
I actually had that in the comment, but I left out "http://"
Heath Borders
ya, I knew about the Google collections lib. The version I was using didn't have Collections2 in it. I added a new answer to this question that lists the specific method.
Kevin Wong
Kevin, Iterables.filter() and Iterators.filter() have been there from the beginning, and are usually all you need.
Kevin Bourrillion
+2  A: 

Are you sure you want to filter the Collection itself, rather than an iterator?

see org.apache.commons.collections.iterators.FilterIterator
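
The idea of filtering the iterator rather than the collection can be sketched in plain Java. This is only an illustration of the approach, not the Commons Collections FilterIterator itself; the Keep interface and the FilteringIterator and FilteringIteratorDemo classes are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical predicate interface (not the Commons Collections one).
interface Keep<T> {
    boolean accept(T item);
}

// Wraps an iterator and skips elements the predicate rejects.
// The underlying collection is neither copied nor modified.
final class FilteringIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Keep<T> keep;
    private T next;
    private boolean hasNext;

    FilteringIterator(Iterator<T> source, Keep<T> keep) {
        this.source = source;
        this.keep = keep;
        advance(); // look ahead so hasNext() can answer immediately
    }

    // Moves to the next accepted element, if there is one.
    private void advance() {
        hasNext = false;
        next = null;
        while (source.hasNext()) {
            T candidate = source.next();
            if (keep.accept(candidate)) {
                next = candidate;
                hasNext = true;
                return;
            }
        }
    }

    public boolean hasNext() {
        return hasNext;
    }

    public T next() {
        if (!hasNext) {
            throw new NoSuchElementException();
        }
        T result = next;
        advance();
        return result;
    }

    // The look-ahead has already moved past the returned element,
    // so removal is not supported in this sketch.
    public void remove() {
        throw new UnsupportedOperationException();
    }
}

class FilteringIteratorDemo {
    // Collects the words longer than three characters into a new list.
    static List<String> longWords(List<String> words) {
        Iterator<String> it = new FilteringIterator<String>(words.iterator(),
            new Keep<String>() {
                public boolean accept(String s) {
                    return s.length() > 3;
                }
            });
        List<String> out = new ArrayList<String>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

The caller decides whether to consume the filtered iterator directly or copy the results into a new collection, as longWords() does here.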

ykaganovich
+2  A: 

The setup:

public interface Predicate<T> {
  public boolean filter(T t);
}

// Removes every element for which predicate.filter() returns true.
<T> void filterCollection(Collection<T> col, Predicate<T> predicate) {
  for (Iterator<T> i = col.iterator(); i.hasNext();) {
    T obj = i.next();
    if (predicate.filter(obj)) {
      i.remove();
    }
  }
}

The usage:

List<MyObject> myList = ...;
filterCollection(myList, new Predicate<MyObject>() {
  public boolean filter(MyObject obj) {
    return obj.shouldFilter();
  }
});
jon
Fine, but I prefer Alan's implementation because you get a copy of the collection instead of altering it. Moreover, Alan's code is thread safe while yours is not.
marcospereira
+3  A: 

"Best" way is too wide a request. Is it "shortest"? "Fastest"? "Readable"? Filter in place or into another collection?

The simplest (but not the most readable) way is to iterate over it and use the Iterator.remove() method:

Iterator<Foo> it = col.iterator();
while( it.hasNext() ) {
  Foo foo = it.next();
  if( !condition(foo) ) it.remove();
}

Now, to make it more readable, you can wrap it in a utility method. Then invent an IPredicate interface, create an anonymous implementation of that interface and do something like:

CollectionUtils.filterInPlace(col,
  new IPredicate<Foo>(){
    public boolean keepIt(Foo foo) {
      return foo.isBar();
    }
  });

where filterInPlace() iterates over the collection and calls IPredicate.keepIt() to learn whether each instance is to be kept in the collection.
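
A minimal sketch of what such a utility might look like follows. The IPredicate interface and the keepIt() and filterInPlace() names are taken from the snippet above; the holder class InPlaceFilters is a hypothetical name, chosen so it is not mistaken for the Commons Collections CollectionUtils class.

```java
import java.util.Collection;
import java.util.Iterator;

// Interface as described in the answer: true means "keep this element".
interface IPredicate<T> {
    boolean keepIt(T item);
}

// Hypothetical utility class holding the in-place filter.
final class InPlaceFilters {
    private InPlaceFilters() {
    }

    // Removes every element the predicate does not want to keep.
    // The collection's iterator must support remove(); a fixed-size list
    // such as the one returned by Arrays.asList() will throw
    // UnsupportedOperationException.
    static <T> void filterInPlace(Collection<T> col, IPredicate<? super T> predicate) {
        for (Iterator<T> it = col.iterator(); it.hasNext();) {
            if (!predicate.keepIt(it.next())) {
                it.remove();
            }
        }
    }
}
```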

I don't really see a justification for bringing in a third-party library just for this task.

Vladimir Dyuzhev
+24  A: 

Assuming that you are using Java 1.5, and that you cannot add Google Collections, I would do something very similar to what the Google guys did. This is a slight variation on Jon's comments.

First add this interface to your codebase.

public interface Predicate<T> { boolean apply(T type); }

Its implementors can answer whether a certain predicate is true of a certain object. E.g. if T were User and AuthorizedUserPredicate implements Predicate<User>, then AuthorizedUserPredicate#apply returns whether the passed-in User is authorized.

Then in some utility class, you could say

public static <T> Collection<T> filter(Collection<T> target, Predicate<T> predicate) {
    Collection<T> result = new ArrayList<T>();
    for (T element: target) {
        if (predicate.apply(element)) {
            result.add(element);
        }
    }
    return result;
}

So, a use of the above might be

Predicate<User> isAuthorized = new Predicate<User>() {
    public boolean apply(User user) {
        // binds a boolean method in User to a reference
        return user.isAuthorized();
    }
};
// allUsers is a Collection<User>
Collection<User> authorizedUsers = filter(allUsers, isAuthorized);

If performance on the linear check is of concern, then I might want to have a domain object that has the target collection. The domain object that has the target collection would have filtering logic for the methods that initialize, add and set the target collection.
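
One possible reading of that suggestion is sketched below: the domain object owns the target collection and applies the predicate as elements enter, so reading the filtered result needs no further scan. The FilteredGroup class and its method names are hypothetical; only the Predicate interface comes from the answer above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Same shape as the Predicate interface defined in the answer.
interface Predicate<T> {
    boolean apply(T type);
}

// Hypothetical domain object that filters on the way in.
final class FilteredGroup<T> {
    private final Predicate<T> predicate;
    private final List<T> accepted = new ArrayList<T>();

    // Initializes the group, keeping only elements that pass the predicate.
    FilteredGroup(Predicate<T> predicate, Collection<T> initial) {
        this.predicate = predicate;
        addAll(initial);
    }

    // Returns true if the element passed the predicate and was stored.
    boolean add(T element) {
        if (predicate.apply(element)) {
            return accepted.add(element);
        }
        return false;
    }

    void addAll(Collection<T> elements) {
        for (T element : elements) {
            add(element);
        }
    }

    // Replaces the contents, filtering the new elements as they enter.
    void set(Collection<T> elements) {
        accepted.clear();
        addAll(elements);
    }

    // Read-only view of the elements that have already been filtered.
    List<T> accepted() {
        return Collections.unmodifiableList(accepted);
    }
}
```

The trade-off is that the predicate is evaluated once, at insertion time. If an element's state can change later (a user losing authorization, say), the stored result goes stale and would need re-checking.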

Alan
Yeah, but I hate to reinvent the wheel, again, repeatedly. I'd rather find some utility library that does when I want.
Kevin Wong
^ ...that does what I want ^
Kevin Wong
This isn't the best way if you don't want the new collection. Use the filter iterator metaphor, which may feed into a new collection, or it may be all that you need.
Josh
Thanks Alan! This was key!
Shiftbit
+1  A: 
com.google.common.collect.Collections2#filter(Collection,Predicate)

in Google Collections

Kevin Wong
A: 

Use the jbfilter framework: http://code.google.com/p/jbfilter/

+4  A: 

With the ForEach DSL you may write

import static ch.akuhn.util.query.Query.select;
import static ch.akuhn.util.query.Query.$result;
import ch.akuhn.util.query.Select;

Collection<String> collection = ...

for (Select<String> each : select(collection)) {
    each.yield = each.value.length() > 3;
}

Collection<String> result = $result();

Given a collection of [The, quick, brown, fox, jumps, over, the, lazy, dog] this results in [quick, brown, jumps, over, lazy], i.e. all strings longer than three characters.

All iteration styles supported by the ForEach DSL are

  • AllSatisfy
  • AnySatisfy
  • Collect
  • Count
  • CutPieces
  • Detect
  • GroupedBy
  • IndexOf
  • InjectInto
  • Reject
  • Select

For more details, please refer to https://www.iam.unibe.ch/scg/svn_repos/Sources/ForEach

Adrian
That's pretty clever! A lot of work to implement a nice Ruby-ish syntax though! The negative is that your filter is not a first-class function and hence cannot be re-used. Roll on closures...
oxbow_lakes
Good point. One way to reuse the loop body is by refactoring the loop into a method that takes the selection query as parameter. That is however by far not as handy and powerful as real closures, for sure.
Adrian
+6  A: 

lambdaj allows you to filter collections without writing loops or inner classes, as in the following example:

List<Person> beerDrinkers = select(persons, having(on(Person.class).getAge(),
    greaterThan(16)));

Can you imagine something more readable? You can find it here:

http://code.google.com/p/lambdaj/

Mario Fusco
Wow! Thanks for this answer, this is definitely going to be helpful!
Sandman
+1  A: 

This, combined with the lack of real closures, is my biggest gripe with Java. Honestly, most of the methods mentioned above are pretty easy to read and REALLY efficient; however, after spending time with .Net, Erlang, etc., list comprehension integrated at the language level makes everything so much cleaner. Without additions at the language level, Java just can't be as clean as many other languages in this area.

If performance is a huge concern, Google collections is the way to go (or write your own simple predicate utility). Lambdaj syntax is more readable for some people, but it is not quite as efficient.

And then there is a library I wrote. I will ignore any questions in regard to its efficiency (yeah, it's that bad). Yes, I know it's clearly reflection-based, and no, I don't actually use it, but it does work:

LinkedList<Person> list = ......
Iterable<Person> filtered = 
                        Query.from(list).where("x=> x.Age >= 21 && x.Age <= 50");
jdc0589
A: 

I wrote an extended Iterable class that supports applying functional algorithms without copying the collection content.

Usage:

// Java has no collection-initializer syntax, so build the list via Arrays.asList
List<Integer> myList = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4, 5));

Iterable<Integer> filtered = Iterable.wrap(myList).select(new Predicate1<Integer>()
{
    public Boolean call(Integer n) throws FunctionalException
    {
        return n % 2 == 0;
    }
});

for( int n : filtered )
{
    System.out.println(n);
}

The code above will actually execute

for( int n : myList )
{
    if( n % 2 == 0 ) 
    {
        System.out.println(n);
    }
}
Vincent Robert