views:

127

answers:

5

Hello,

I have a List which contains a list of objects and I want to remove from this list all the elements which have the same values in two of their attributes. I had though about doing something like this:

List<Class1> myList;
....
Set<Class1> mySet = new HashSet<Class1>();
mySet.addAll(myList);

and overriding hash method in Class1 so it returns a number which depends only in the attributes I want to consider.

The problem is that I need to do a different filtering in another part of the application so I can't override hash method in this way (I would need two different hash methods).

What's the most efficient way of doing this filtering without overriding hash method?

Thanks

+3  A: 

Let your Class1 implements Comparable. Then use TreeSet as in your example (i.e. use addAll method).

Roman
TreeSet was also my first thought. Though implements Comparable is not required. The possibility of a custom comparator in the TreeSet constructor makes it the perfect tool for the job.
extraneon
+1  A: 

I would suggest introducing a class for the concept of the parts of Class1 that you want to consider significant in this context. Then use a HashSet or HashMap.

Tom Hawtin - tackline
As I understand the problem the hash is already used somewhere else, and the filtering now needs to be done on different, custom, parameters.
extraneon
@extraneon: I am suggesting introducing another class.
Tom Hawtin - tackline
+2  A: 

As an alternative to what Roman said you can have a look at this SO question about filtering using Predicates. If you use Google Collections anyway this might be a good fit.

extraneon
+1 - the GC library is also especially effective if you only need a filtered view.
Carl
+4  A: 

Overriding hashCode and equals in Class1 (just to do this) is problematic. You end up with your class having an unnatural definition of equality, which may turn out to be other for other current and future uses of the class.

Review the Comparator interface and write a Comparator<Class1> implementation to compare instances of your Class1 based on your criteria; e.g. based on those two attributes. Then instantiate a TreeSet<Class>` for duplicate detection using the TreeSet(Comparator) constructor.

EDIT

Comparing this approach with @Tom Hawtin's approach:

  • The two approaches use roughly comparable space overall. The treeset's internal nodes roughly balance the hashset's array and the wrappers that support the custom equals / hash methods.

  • The wrapper + hashset approach is O(N) in time (assuming good hashing) versus O(NlogN) for the treeset approach. So that is the way to go if the input list is likely to be large.

  • The treeset approach wins in terms of the lines of code that need to be written.

Stephen C
A: 

Sometimes programmers make things too complicated trying to use all the nice features of a language, and the answers to this question are an example. Overriding anything on the class is overkill. What you need is this:

class MyClass {
  Object attr1;
  Object attr2;
}

List<Class1> list;
Set<Class1> set=....
Set<MyClass> tempset = new HashSet<MyClass>;

for (Class1 c:list) {
  MyClass myc = new MyClass();
  myc.attr1 = c.attr1;
  myc.attr2 = c.attr2;

  if (!tempset.contains(myc)) {
    tempset.add(myc);
    set.add(c);
  }
}

Feel free to fix up minor irregulairites. There will be some issues depending on what you mean by equality for the attributes (and obvious changes if the attributes are primitive). Sometimes we need to write code, not just use the builtin libraries.

DJClayworth