Suppose I have a simple DTO object straight out of the database, and the Id is a record ID that is definitely unique. Is it then a good idea to do the following?

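public class DTO
{
    public int Id { get; set; }

    public override bool Equals(object obj)
    {
        // 'as' yields null for a null or non-DTO argument, so no exception is thrown.
        var other = obj as DTO;
        return other != null && Id == other.Id;
    }

    public override int GetHashCode()
    {
        // Consistent with Equals: equal Ids give equal hash codes.
        return Id;
    }
}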

The reason I doubt it a bit is that I don't see it in the code around me, as opposed to code like

    int hash = 7;
    hash = 89 * hash + pageId.hashCode();
    hash = 89 * hash + recordId;
    return hash;
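
For comparison, here is a rough C# sketch of that multiply-and-add pattern, assuming a type whose equality depends on two fields. `PageRecordKey`, `PageId` and `RecordId` are hypothetical names made up for this illustration; they are not from the original code.

```csharp
using System;

// Hypothetical key type whose equality depends on two fields.
public class PageRecordKey
{
    public string PageId { get; set; }
    public int RecordId { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as PageRecordKey;
        return other != null && PageId == other.PageId && RecordId == other.RecordId;
    }

    public override int GetHashCode()
    {
        // unchecked lets the arithmetic wrap around instead of throwing
        // when the code is compiled in a checked context.
        unchecked
        {
            int hash = 7;
            hash = 89 * hash + (PageId == null ? 0 : PageId.GetHashCode());
            hash = 89 * hash + RecordId;
            return hash;
        }
    }
}

public static class CombinedHashDemo
{
    // True when two separately constructed but equal keys share a hash code.
    public static bool EqualKeysShareHash()
    {
        var a = new PageRecordKey { PageId = "home", RecordId = 5 };
        var b = new PageRecordKey { PageId = "home", RecordId = 5 };
        return a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }

    public static void Main()
    {
        Console.WriteLine(EqualKeysShareHash());
    }
}
```

The combining step only matters when more than one field takes part in equality; with a single int Id there is nothing to combine.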
+4  A: 

The contract for a hash code is "two equal objects must have the same hash code". That implies that any fields used in determining equality must be represented in the bits that make up the hash code. Since your equality contract refers only to the ID, then that's the only thing required in the hash code.
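
As a minimal sketch of what the contract buys you, assume a DTO with an extra `Name` field that takes no part in equality. `PersonDto` and `ContractDemo` are hypothetical names for this illustration only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical DTO: equality and hash code both depend on Id only.
public class PersonDto
{
    public int Id { get; set; }
    public string Name { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as PersonDto;
        return other != null && Id == other.Id;
    }

    public override int GetHashCode()
    {
        return Id;
    }
}

public static class ContractDemo
{
    // Adds three instances, two of which share an Id, and returns the set size.
    public static int CountDistinct()
    {
        var set = new HashSet<PersonDto>
        {
            new PersonDto { Id = 1, Name = "first load" },
            new PersonDto { Id = 1, Name = "second load" },
            new PersonDto { Id = 2, Name = "other record" },
        };
        return set.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct()); // prints 2
    }
}
```

The two instances with Id 1 are treated as the same element because they are equal and hash identically, even though their names differ.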

Jonathan Feinberg
"That implies that any fields used in determining equality must be represented in the bits that make up the hash code." --> IMO that is not true, it is probably best, but not necessary to honour the contract. If your hashcode is return 23; for example, two equal objects will have the same hashcode too.
Peter
Your comment doesn't make any sense to me. You say "it's not necessary to honor the contract", then you give an example of a hashcode function returning a constant as an "example" of something. But what is that an example of? It trivially honors the contract, though is a crappy hash function for other reasons. And yes, it is true that two equal objects *must* have the same hash code; that's not subject to opinion. (Otherwise two equal objects can wind up in the same set or as keys in the same map.)
Jonathan Feinberg
You misquote me; please read more carefully: 1. It is necessary to honour the contract, of course. 2. Two equal objects must have the same hash code, of course; I am not stating otherwise. 3. I only say that the fields that make up equality do not need to be used in order to honour the contract. The claim that they must be is simply wrong, I am sorry. To answer your question "What is that an example of": it's an example of a crappy hash function, but proof of the fact that a hash function does not need to use the fields used in equality in order to honour the contract.
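
A small sketch of that point, assuming a hypothetical `ConstantHashDto` whose hash code ignores `Id` entirely: lookups stay correct, they just get slower as every key lands in the same bucket.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical DTO whose hash code is a constant.
public class ConstantHashDto
{
    public int Id { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as ConstantHashDto;
        return other != null && Id == other.Id;
    }

    // Ignores Id, yet equal objects still share a hash code, so the contract holds.
    public override int GetHashCode()
    {
        return 23;
    }
}

public static class ConstantHashDemo
{
    // True when a fresh, equal key finds the stored value and unequal keys stay distinct.
    public static bool LookupStillWorks()
    {
        var map = new Dictionary<ConstantHashDto, string>();
        map[new ConstantHashDto { Id = 7 }] = "seven";
        map[new ConstantHashDto { Id = 8 }] = "eight";

        string value;
        bool found = map.TryGetValue(new ConstantHashDto { Id = 7 }, out value);
        return found && value == "seven" && map.Count == 2;
    }

    public static void Main()
    {
        Console.WriteLine(LookupStillWorks()); // prints True
    }
}
```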
Peter
By the way, thanks for Wordle, a really nice app!
Peter
I interpreted your statement "IMO that is not true, it is probably best, but not necessary to honour the contract." as meaning "it is not necessary to honor the contract," which is, I think, a reasonable interpretation of that sentence (although confusing!). I see now that you meant that "using all fields involved in equality checks in the hash function is not necessary *in order to* honor the contract." So now I also understand your "example"!
Jonathan Feinberg
+2  A: 

A good hash function is supposed to (more or less) randomly distribute the hash values, so that when you put the hash values into a binary tree, you get a good, evenly distributed tree, and not one that is just a linked list down one side.

See here: http://blogs.sun.com/kah/entry/the_importance_of_good_hash

But if you never have this need (i.e. you will always be returning records from the database, rather than looking them up from your own binary tree), then using the id as the hash seems perfectly reasonable to me.

Robert Harvey
IMO using just the int gives a perfect (100%) distribution, so why hash an int? I believe hashing an int just returns the int again, for the same reason of perfect distribution.
Peter
It may be perfect distribution if you always load all entities, but if you load only subsets, it's likely that the subset does not have evenly distributed ids.
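
A rough sketch of that effect, under the simplifying assumption that a hash table picks a bucket as `hash % bucketCount`. Real collections may size and index their buckets differently, and `BucketSkewDemo` is a hypothetical name for this illustration.

```csharp
using System;
using System.Linq;

public static class BucketSkewDemo
{
    // Simplifying assumption: bucket index = hash % bucketCount (ids here are positive).
    public static int DistinctBuckets(int[] ids, int bucketCount)
    {
        return ids.Select(id => id.GetHashCode() % bucketCount).Distinct().Count();
    }

    public static void Main()
    {
        // Every id from 1 to 16: one id per bucket.
        int[] allIds = Enumerable.Range(1, 16).ToArray();

        // A loaded subset whose ids happen to be multiples of 16: 16, 32, ..., 256.
        int[] subset = Enumerable.Range(1, 16).Select(i => i * 16).ToArray();

        Console.WriteLine(DistinctBuckets(allIds, 16)); // prints 16
        Console.WriteLine(DistinctBuckets(subset, 16)); // prints 1
    }
}
```

Each id is still unique, but under this assumption the subset collapses into a single bucket.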
iammichael
See the article. The only issue is that putting the records of a table into a dictionary (or other binary-search structure) would skew the index, if they were put into the dictionary *in numerical order.*
Robert Harvey
@iammichael: I don't think I get that; each bucket has only one element by definition, no matter the subset.
Peter
The solution is to use a data structure that is self-balancing (such as a red-black tree), or put the records into the structure in random order. But it's all academic if you don't have a need for this. I only point it out because you asked, and it's the only reason I can think of why an int as a hash might not be a good idea.
Robert Harvey
@Robert: I have indeed read the article; the copy-pasted example in the question is taken from it.
Peter
@Robert, I like that you pointed that out, thanks. I did ask, indeed; I think the discussion around a question/answer is often as interesting as the straight answer.
Peter
+1  A: 

Since the int already has a method for getting the hash code, I would just use that one.

public override int GetHashCode()
{
    return Id.GetHashCode();
}
David Basarab
Why would you do that? Anyway, isn't that just returning the int again?
Peter
Hmm, so it does.
Robert Harvey
+1  A: 

If your class only contains an integer, you can use that as hash code. That's the same as the implementation of the Int32.GetHashCode method that just returns the integer itself.
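
A quick way to check this for yourself, as a sketch; `IntHashDemo` is a hypothetical helper, and the behaviour shown is what current .NET implementations do for `Int32`.

```csharp
using System;

public static class IntHashDemo
{
    // True when the int's hash code is the int itself.
    public static bool HashEqualsValue(int value)
    {
        return value.GetHashCode() == value;
    }

    public static void Main()
    {
        Console.WriteLine(HashEqualsValue(42));           // prints True
        Console.WriteLine(HashEqualsValue(-1));           // prints True
        Console.WriteLine(HashEqualsValue(int.MaxValue)); // prints True
    }
}
```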

Guffa
Indeed, I thought so; see the previous comments. Thanks.
Peter
Since it is a DTO, there would be other fields present as well, but the database would guarantee that the int is unique.
Robert Harvey
@Robert: indeed.
Peter