ansaurus

Question

Why can't I retrieve an item from a HashSet without enumeration?

Answer 1

+2 A:

If you change an object after it has been inserted, it's hash may have changed (this is especially likely if hashCode() has been overridden). If the hash changes, a lookup of it in the set will fail, as you will be attempting to lookup an object that is hashed at a different location than it is stored in.

Also, you need to make sure you have overridden hashCode and equals in your object if you want to lookup equal objects that are different instances.

Note that this is all for Java - I am assuming C# has something similar, but as it has been several years since I used C#, I will let others speak to it's capabilities.

aperkins 2009-09-29 20:46:24

This is true and all, but it is perfectly possible at the moment to break HashSet (and all components with this invariant) by hanging onto a reference after you add it to the set. You can use that reference to mutate and invalidate the invariant as you wish. The API already relies on the user to fulfill this contract, returning the object reference would not change this.

Soonil 2009-09-29 20:56:35

Answer 2

A:

Set objects in those languages were mostly designed as set of value, not for mutable objects. They check that object put in them are unique by using equals. That is why contains and remove returns boolean, not the object: they check for or remove the value you pass to them.

And actually, if you do a contains(X) on a set, and expect to get a different object Y, that would means X and Y are equals (ie X.equals(Y) => true), but somewhat different, which seems wrong.

penpen 2009-09-29 20:55:25

Sets are unique according to the comparison method you specify. Simply because I wish to use comparison method A in my set does not mean comparison method B has no value do me when I consider the same objects. Addressing your comments on mutability, see my answer to aperkins.

Soonil 2009-09-29 21:01:04

Answer 3

+3 A:

How were you proposing to retrieve the item from the hash set? A set is by definition not ordered in any way and therefore, there is no index with which to use to retrieve the object in question.

Sets, as a concept, are used to test inclusion, i.e. whether or not the element in question is in the hash data set. If you're looking to retrieve a value from a data source using a key value or index, I would suggest looking into either a Map or a List.

EDIT: Additional answer based on the Edit to the original question

Soonil, based on your new information, it looks like you might be interested in implementing your data as a Java Enum, something similar to this:

 public enum SoonilsDataType {
      A, B, C;

      // Just an example of what's possible
      public static SoonilsDataType getCompositeValue(SoonilsDataType item1,
           SoonilsDataType item2) {
           if (item1.equals(A) && 
                     item2.equals(B)) {
                return C;
           }
      }
 }

Enum's automatically inherit values() which returns the list of all values in the enum's "set", which you can use to test inclusion against in the same way as the Set. Also, because its a full class, you can define new static methods to do the composite logic (like I was trying to allude to in the example code). The only thing about the Enum is that you can't add new instances at runtime, which may not be what you want (though if the set's data size isn't going to grow at runtime, the Enum is what you want).

Peter Nix 2009-09-29 20:57:25

See my comment to penpen for your answer.

Soonil 2009-09-29 21:02:08

@Soonil If I understand your comment to penpen, you're using not only object equality (for existence in the Set) but also a identifier (or some other numerical index) of your own devising as a secondary equals and you want this identifier to also be used as a index into the hash set?If this is correct, you seem to be working orthogonal to the nature of hash set and I once again suggest you look into a Map (probably HashMap) as your data source.

Peter Nix 2009-09-29 21:08:31

Sure Peter, I've added some details to the end of my question, let me know if that helps.

Soonil 2009-09-29 21:19:18

Peter, TBH, I don't entirely understand what you are getting at with the enums, but I think our understanding has diverged somewhere :) +1 for the effort though!

Soonil 2009-09-29 21:41:35

Answer 4

+3 A:

I imagine the designers of the Set interface and HashSet class wanted to ensure that the remove(Object) method defined on the Collection interface was also applicable to Set; this method returns a boolean denoting whether the object was successfully removed. If the designers wanted to provide functionality whereby remove(Object) returned the "equal" object already in the Set this would mean a different method signature.

Also, given that the object being removed is logically equal to the object passed to remove(Object) it is arguable about the value added in returning the contained object. However, I have had this problem myself before and have used a Map to solve the problem.

Note that in Java, a HashSet uses a HashMap internally and so there isn't additional storage overhead in using a HashMap instead.

Adamski 2009-09-29 21:01:21

You are correct about Java :) alas, C# does not use a HashMap to implement a HashSet, and there are some space/time benefits I would like to keep if possible.

Soonil 2009-09-29 21:03:28

Answer 5

+1 A:

Looks to me like you're actually looking for a Map<X,Y>, where Y is the type of extra1.

(rant below)

The equals and hashCode methods define meaningful object equality. The HashSet class assumes that if two objects are equal as defined by Object.equals(Object) there is no difference between these two objects.

I'd go as far as to say that if the object extra is meaningful, your design is not ideal.

Robert Munteanu 2009-09-30 17:54:38

Agreed with your rant :) The map would work, but I've decided against the extra overhead and will probably roll my own set (since this isn't production code, heh) +1

Soonil 2009-09-30 20:24:06

Answer 6

A:

I was given an interesting suggestion as to a way to use a Map, by having my own objects define themselves as KeyValuePairs. While a good concept, unfortunately KeyValuePair is not an interface (why not?) and is a struct, which shoots that plan out of the air. In the end I will roll my own Set, as my constraints allow me this option.

Soonil 2009-10-01 18:54:07

ansaurus

tags:

views:

answers:

Why can't I retrieve an item from a HashSet without enumeration?

related questions