views:

253

answers:

5

Hi,

I'll be needing to use Hash Tables with different keys. One as string for the key and another as integer.

As for the integer one, how stupid it is to run a hash function on the number to generate a key?

I mean, the numbers I'll be using as key for the Hash Table will always be different, there will be no duplicates at all. Isn't enough for me to use the mod operator to "truncate" the value below the hash table size?

Or there's something more to it?

A: 

It's not stupid, it is perfectly sensible. An integer is a natural seed in a unique naming scheme. Allthough the relational zealot in me dies a little when I say things like this =D

Hassan Syed
+1  A: 

In my opinion it's not stupid. It may not be the best option if you tend to have relatively few values (in which case using a plain array could be better).

I'd use the modulo operator to hash integers to the hash size.

Romain
An array isn't a general purpose answer to this question, what happens if the guys has a few numbers arround 100 and then one value such as 5000 -- thats quite a big hole.
Hassan Syed
@Hassan, you can use sparce arrays in that case.
Romain
@Hassan, he never said or implied that an array would be a general-purpose answer to this question. One answer may be right most of the time, but it's important to recognize that the "right" answer depends on the specific conditions.
StriplingWarrior
Thank you for clarifying stripling -- You have made it much more clear to me than Romain did -- I will go and medidate on your advice now.
Hassan Syed
A: 

what about in case of integer to use sorted array and do binary search? actually same for string, but for string hashing may be cheaper

Andrey
Binary search is O(log n), hash table access is amortized O(1). Yes I know it's very unlikely that log n > 32...
KennyTM
key word is "amortized". you need to consider size of hashtable you plan to use then number of items you plan to store and hash function. i still prefer binary search, because it is easy and O(log n) is good speed.
Andrey
I would hesitate to suggest using a sorted array until I know know whether and how often he expects to add/remove values, which would be an O(n) function.
StriplingWarrior
+3  A: 

It is fine unless there's a high chance that your integer keys are 62, 93, 124, ... and your hash table size happens to be 31.

See http://stackoverflow.com/questions/664014/what-integer-hash-function-are-good-that-accepts-an-integer-hash-key if you worry about this.

KennyTM
+2  A: 

As with so many design questions in our field, the answer is: "It depends." There are certain, specific cases where it would be stupid to bother running a typical hash algorithm on the integers. If you know based on your specific situation that a modulus would distribute the expected data evenly, and if performance is very important to you, and if you'll need to access this hashtable quite a lot, then it would be stupid. Barring these conditions, there are a number of very good reasons to just use a generic hashing algorithm that will work well across a variety of circumstances. In the vast majority of cases, it would be stupid to do otherwise. In some cases, using a hashtable would be a stupid choice in the first place.

If you gave us more information about the type of data you're storing, why you're storing it, and how important performance is to you, we might be able to point you to a solution that works better than using a hashtable. Frameworks like Java and .NET have hash functions that are fast and avoid hashing numbers to the same bucket. In most cases, I'd trust the default hash method.

StriplingWarrior
That's not my question...
Nazgulled
I'm sorry that I didn't answer the question to your satisfaction. When people ask a very generic question without providing enough background information, the obvious, short answer is often not actually right in some cases. I was trying to point this out, while at the same time implying that for most cases it isn't stupid to use a standard hash algorithm. I just wasn't direct enough to make it obvious that I actually was answering your question. Hopefully my new opening paragraph helps to clarify my answer somewhat.
StriplingWarrior