tags:

views:

869

answers:

4

java.util.Set implementations removes the duplicate elements.

How are duplicates elements deleted internally in a java.util.Set??

+1  A: 

Read your question more detailed:

You can't add duplicates, from java doc for Set.add() or do you mean addAll?:

Adds the specified element to this set if it is not already present (optional operation). More formally, adds the specified element e to this set if the set contains no element e2 such that (e==null ? e2==null : e.equals(e2)). If this set already contains the element, the call leaves the set unchanged and returns false. In combination with the restriction on constructors, this ensures that sets never contain duplicate elements.

Tomas
+3  A: 

Actually AFAIK from the sources most Set implementations in java don't even check if the element is already contained.

They just always execute the add() on their internal structure which holds the set elements and let that object handle the duplication case.

e.g. HashSet calls put(K,V) on the internal HashMap which just inserts the new object overwriting the old entry if duplicate.

jitter
How about explaining why you down vote me?
jitter
+2  A: 

Reading a little into your question I'm guessing that you're seeing strange behaviour with a java.util.HashSet (typically what everyone uses by default).

Contary to the contract of java.util.Set it is possible to get the same object in a java.util.HashSet twice like this:

import java.util.HashSet;
import java.util.Set;

public class SetTest 
{
  public static void main(String[] args) 
  {
    MyClass myObject = new MyClass(1, "testing 1 2 3");

    Set<MyClass> set = new HashSet<MyClass>();
    set.add(myObject);

    myObject.setHashCode(2);
    set.add(myObject);

    System.out.println(set.size());  // this will print 2.
  }

  private static class MyClass 
  {
    private int hashCode;
    private String otherField;

    public MyClass(int hashCode, String otherField) 
    {    
      this.hashCode = hashCode;
      this.otherField = otherField;
    }

    public void setHashCode(int hashCode) 
    {
      this.hashCode = hashCode;
    }

    public boolean equals(Object obj) 
    {    
      return obj != null && obj.getClass().equals(getClass()) && ((MyClass)obj).otherField.equals(otherField);
    }

    public int hashCode() 
    {
      return hashCode;
    }
  }
}

After the pointer from @jitter and a look at the source you can see why this would happen.

Like @jitter says, the java.util.HashSet uses a java.util.HashMap internally. When the hash changes between the first and second add a different bucket is used in the java.util.HashMap and the object is in the set twice.

The code sample may look a little contrieved but I've seen this happen in the wild with domain classes where the hash is created from mutable fields and the equals method hasn't been kept in sync with those fields.

Nick Holt
Modifying objects in a HashSet in a way that modifies their hashCode()/equals() results produces undefined behavior.
Joachim Sauer
@Joachim - exactly, but it doesn't mean it doesn't happen. Infact, the equals/hashCode methods generated by the popular IDEs often cause the hashCode to change as an object is mutated.
Nick Holt
Arguably the hashcode *should* change if the object is mutated - after all, it needs to be consistent to `equals()`, so it needs to change if an object would no longer be considered equal to its pre-mutation state. The real problem here is using mutable objects as HashMap keys; it's *highly* recommended to only use immutable objects, otherwise you open yourself up to this kind of randomness, as generally speaking `hashCode()` *must* change as a mutable object is mutated.
Andrzej Doyle
@dtsazza - the hashcode doesn't necessarily need to change when an object is mutated as equal hashcodes don't imply equality (and as the example above shows, changing the hash after instantiation can be dangerous).
Nick Holt
+2  A: 

An easy way to find this out is to look in the source for the code you are interested in.

Each JDK has a src.zip included which contains the source code for the public classes so you can just locate the source for HashSet and have a look :) I often use Eclipse for this. Start it, create a new Java project, set the JVM to be an installed JDK (if not you are using the system default JRE which doesn't have src.zip), and Ctrl-Shift-T to go to HashSet.

Thorbjørn Ravn Andersen