views:

369

answers:

7

I have a number of one dimension arrays of Object[] (these objects are primitive types if it helps)

I want to store these arrays in a List, but only the arrays whose contents are unique from the rest.

My first aproximation was to iterate througth the arrays storing in a Set the value of Arrays.hashCode(array) and only storing the array in the desired list if the value was not cointained in the set.

But later i realize that two arrays with different contents can produce the same hashcode (not frequently i hope)

can anyone help?

can i expect very frecuently hashcode collisions (same hascode from different contents)?

+1  A: 

If the hash code is the same, then you simply further check its detail.

NawaMan
interesting, you mean the use of a hashmap of hashcode/list of arrays. When i found an existing hascode i iterate throught the list of arrays calling Arrays.equals(array1,array2) and if i dont find duplicates, store it...it can work, tomorrow i try it and the accept the answer
Telcontar
+2  A: 

Is the problem that you'll have arrayX and arrayY, both with contents [a,b,c] but the Set doesn't treat them as equal? Would [a,b,c] and [c,a,b] be considered equal, or not?

I'd say define a comparator that defines "equality" for arrays exactly how you need it defined, and then insert each array into a Set that uses the custom comparator that you created.

Brian Schroth
+1  A: 

Following assumes that you consider arrays {1,2,3} and {3,2,1} not to be duplicates.

Do not store hashcode of arrays to the Set, store whole lists to to Set.

Convert your arrays to Lists. Lists have consistent equals and hashCode methods. Two lists are defined to be equal if they contain the same elements in the same order, and the List's hashCode will be consistent with equals method.

  List<Object> list = Arrays.asList(array);

Here is the whole algorithm. (Untested code but should work).

Set<List<Object>> findUniqueLists(List<List<Object>> allLists) {
   Set<List<Object>> uniqueSet = new LinkedHashSet<List<Object>>();
   uniqueSet.addAll(allLists);

   Set<List<Object>> processedSet = new LinkedHashSet<List<Object>>();

   for(List<Object> list : allLists) {
       if(processedSet.contains(list)) {
           // duplicate found!
           uniqueSet.remove(list);
       } else {
           // no duplicate
           processedSet.add(list)
       }
    }
    return uniqueSet;
}
Juha Syrjälä
he defines "same elements in different orders" as still being equal, so "Two lists are defined to be equal if they contain the same elements in the same order" is not sufficient.
Brian Schroth
+2  A: 

It sounds like you need a LinkedHashSet (preserving insertion order while maintaining uniqueness) and then wrap your arrays in an object that implements hashcode and equal in a way that makes sense for your arrays. A first approximation may just be the Arrays.asList() method, but you state in your question that you are using primitives in an Object[] array. Either you are relying on autoboxing, or you are in fact not using an Object[] array but rather an int[], long[], float[] as needed. The Arrays.asList() won't work correctly with those types.

Edit: Per the request of the comment, here is code for a wrapper class:

  public class ArrayWrapper { 
       private Object[]array; 
       public ArrayWrapper(Object[] array) { this.array = array; } 
       public Object[] getArray() { 
                 Object[] newArray=new Object[array.length]; 
                 System.arraycopy(array,0,newArray,0,array.length); 
                  return newArray; 
       } 
       public int hashCode() { return Arrays.hashCode(array); } 
       public boolean equals(Object obj) { 
              boolean b=false;
              if(obj instanceof ArrayWrapper){ 
                     b=Arrays.equals(this.array,((ArrayWrapper)obj).getArray()); 
              } 
              return b; 
       } 
 }
Yishai
they are primitive wrapm types, Integer, Float, String...
Telcontar
I finally use a wrapper class like you suggest, i post the code in a comment below, can you edit your answer to include the code?
Telcontar
public class ArrayWrapper { private Object[]array; public ArrayWrapper(Object[] array) { this.array = array; } public Object[] getArray() { Object[] newArray=new Object[array.length]; System.arraycopy(array,0,newArray,0,array.length); return newArray; } public int hashCode() { return Arrays.hashCode(array); } public boolean equals(Object obj) { boolean b=false; if(obj instanceof ArrayWrapper){ b=Arrays.equals(this.array,((ArrayWrapper)obj).getArray()); } return b; }}
Telcontar
Note on method getArray: i return a copy of the array to make the contents and hashcode of ArrayWrapper invariant. Contents in the arrays are not supposed to change in the application
Telcontar
A: 

For comparing efficiently, one uses sometimes a two steps approach:

  1. hashCode discards many potential matches
  2. if two hashCode are equals, the objects themselves are tested for equality (depends on their method equals)


About your Object[] being of primitive types, please remember the following:
To add a primitive type into the Object[], it will always be boxed/unboxed.
So you don't really have primitive types as the contents of your arrays.

To keep the primitive types, the arrays themselves have to be of primitive types, such as int[].

KLE
+1  A: 

Try something like this:

EDIT

Running and working code below:

bash-3.2$ cat ArraysTest.java 
import java.util.*;
public class ArraysTest {
    public static void main( String [] args ) {
        Set<Integer[]> set = new TreeSet<Integer[]>( new Comparator<Integer[]>() {
            public int compare( Integer[] one, Integer[] two ) {
                if( Arrays.equals( one, two ) )  {
                    return 0;
                }
                return Arrays.hashCode( one ) - Arrays.hashCode( two );
            }
            public boolean equals( Object o ){ return false; }
        });

        set.add( new Integer[]{1,2,3});
        set.add( new Integer[]{1,2,3});
        set.add( new Integer[]{3,2,1});

        for( Integer[] i : set ) {
            System.out.println( Arrays.asList( i ) );
        }

    }
}

bash-3.2$ javac ArraysTest.java  
bash-3.2$ java ArraysTest
[1, 2, 3]
[3, 2, 1]
bash-3.2$

You'll have to work a bit for make it work, this is just a sample, not actual running code.

As you know the Set only accept one element, and creating the TreeSet with a custom comparator allow you do tell the set what is equals for you.

Arrays.equals() methods describes:

..two arrays are equal if they contain the same elements in the same order...

OscarRyz