ansaurus

Question

Moving from Linear Probing to Quadratic Probing (hash collisions)

Answer 1

+1 A:

You don't have to modify the hash function for quadratic probing. The simplest form of quadratic probing just adds consecutive squares (1, 4, 9, ...) to the calculated position instead of the linear offsets 1, 2, 3, ...

There's a good resource here. The following is taken from there. This is the simplest form of quadratic probing when the simple polynomial c(i) = i^2 is used:

h(k, i) = (h(k) + i^2) (mod m)

In the more general case the formula is:

h(k, i) = (h(k) + c1*i + c2*i^2) (mod m)

And you can pick your constants.
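As a rough sketch of what that means in code, assuming the general form h(k, i) = (h(k) + c1*i + c2*i^2) (mod m): the function name `quad_probe` and its parameters are made up for illustration, `home` stands for the bucket the ordinary hash function returned, and `i` is the number of collisions seen so far.

```c
#include <stddef.h>

/* Hypothetical helper: bucket to try on the i-th attempt (i = 0 is the
   home bucket). c1 = 0, c2 = 1 gives the simplest i^2 form.
   Assumes m > 0 and that i is small enough for i * i not to overflow. */
size_t quad_probe(size_t home, size_t i, size_t c1, size_t c2, size_t m)
{
    return (home + c1 * i + c2 * i * i) % m;
}
```

With c1 = 0, c2 = 1, m = 11 and a home bucket of 3, attempts i = 0, 1, 2, 3 land on buckets 3, 4, 7, 1. The hash function itself is untouched; only the offset added after a collision changes.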

Keep in mind, however, that quadratic probing is useful only in certain cases. As the Wikipedia entry states:

Quadratic probing provides good memory caching because it preserves some locality of reference; however, linear probing has greater locality and, thus, better cache performance. Quadratic probing better avoids the clustering problem that can occur with linear probing, although it is not immune.

EDIT: Like many things in computer science, the exact constants and polynomials of quadratic probing are heuristic. Yes, the simplest form is i^2, but you may choose any other polynomial. Wikipedia gives the example with h(k,i) = (h(k) + i + i^2)(mod m).

Therefore, it is difficult to answer your "why" question. The only "why" here is why do you need quadratic probing at all? Having problems with other forms of probing and getting a clustered table? Or is it just a homework assignment, or self-learning?

Keep in mind that by far the most common collision resolution technique for hash tables is either chaining or linear probing. Quadratic probing is a heuristic option available for special cases, and unless you know what you're doing very well, I wouldn't recommend using it.

Eli Bendersky 2010-02-27 17:10:09

Sorry, but math formulas don't help me. :( And you didn't give me more than what I've already read about it.

Nazgulled 2010-02-27 17:45:03

@Nazgulled: I really don't see what you're having trouble with - and as you got no other answers, maybe I'm not the only one. I think you should try to elaborate on your question and rephrase it to explain exactly what you need.

Eli Bendersky 2010-02-27 18:41:38

I look at the math formulas and I don't understand them and I also don't know what to do in code. I need to know what to do in words, not math formulas.

Nazgulled 2010-02-27 19:57:52

I just edited the main question, please take a look...

Nazgulled 2010-02-28 00:23:46

@Nazgulled: I've added to my answer. I hope you realize that in order to program successfully, you *must* have an understanding of formulas, or at least the will to look at them

Eli Bendersky 2010-02-28 03:39:24

This is for self-learning purposes. I'll have a project where I'll (probably) have to use hash tables, but for that I'll use chaining. I just want to learn the different ways to do it so I can write about them in the final report and argue why I picked one over the other. For linear probing I stopped looking once I had visited every bucket in the hash table, but for quadratic probing, when should I stop looking?

Nazgulled 2010-02-28 14:53:34

@Nazgulled: the page I linked to has a discussion - as long as the size of the table is prime, the first M/2 (M is the size) samples are unique. Read up about it there, or any other online resource on algorithms

Eli Bendersky 2010-02-28 15:24:18
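A minimal sketch of that stopping rule, assuming the simple i^2 offsets and a prime table size m. Everything named here is hypothetical: the `EMPTY` sentinel, the function `qp_insert`, and the toy hash `key % m`. The loop gives up once it has tried the probes that are known to be distinct (about half the table), because later probes may only revisit buckets already checked.

```c
#include <stddef.h>

#define EMPTY (-1)  /* hypothetical marker for an unused bucket */

/* Inserts key using offsets 0, 1, 4, 9, ... from its home bucket.
   Returns the bucket used, or -1 if no free bucket was found within
   the first m/2 + 1 probes. Assumes m is prime and key >= 0. */
long qp_insert(int *table, size_t m, int key)
{
    size_t home = (size_t)key % m;          /* toy hash function */
    for (size_t i = 0; i <= m / 2; i++) {
        size_t idx = (home + i * i) % m;
        if (table[idx] == EMPTY || table[idx] == key) {
            table[idx] = key;
            return (long)idx;
        }
    }
    return -1;                              /* caller should resize */
}
```

Note that the insert can fail while the table still has free buckets: with m = 7, the keys 0, 7, 14, 21 fill buckets 0, 1, 4, 2, and a fifth key hashing to bucket 0 is rejected even though buckets 3, 5 and 6 are empty. Keeping the table at most half full avoids this.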

Answer 2

+1 A:

There is a particularly simple and elegant way to implement quadratic probing if your table size is a power of 2:

step = 1;

do {
    if (/* bucket at index holds the element we want */) {
        // found the element
        return;
    } else {
        // step grows by 1 each time, giving offsets 0, 1, 3, 6, 10, ...
        index = (index + step) % table_size;
        step++;
    }
} while (/* there are still buckets left to probe */);

Instead of looking at offsets 0, 1, 2, 3, 4... from the original index, this will look at offsets 0, 1, 3, 6, 10... (the i^th probe is at offset (i*(i+1))/2, i.e. it's quadratic).

This is guaranteed to hit every position in the hash table (so you are guaranteed to find an empty bucket if there is one) provided the table size is a power of 2.
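For illustration, here is one way the loop might be filled in as a lookup. The `EMPTY` sentinel, the function name `tri_find`, and the toy hash (the low bits of the key) are assumptions made for this sketch, not part of the scheme itself. The loop stops after `table_size` probes, which by the claim above is enough to have seen every bucket.

```c
#include <assert.h>
#include <stddef.h>

#define EMPTY (-1)  /* hypothetical marker for an unused bucket */

/* Returns the bucket holding key, or -1 if key is not in the table.
   Assumes table_size is a power of 2 and key >= 0. */
long tri_find(const int *table, size_t table_size, int key)
{
    assert(table_size != 0 && (table_size & (table_size - 1)) == 0);

    size_t index = (size_t)key & (table_size - 1);  /* toy hash */
    size_t step = 1;

    do {
        if (table[index] == key)
            return (long)index;      /* found the element */
        if (table[index] == EMPTY)
            return -1;               /* an empty bucket ends the chain */
        index = (index + step) % table_size;
        step++;
    } while (step <= table_size);    /* at most table_size probes */

    return -1;                       /* every bucket was checked */
}
```

Stopping at an empty bucket is only valid if elements are never deleted by simply clearing their bucket; a table that supports deletion would need a separate "deleted" marker.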

Here is a sketch of a proof:

1. Given a table size of n, we want to show that we will get n distinct values of (i*(i+1))/2 (mod n) with i = 0 ... n-1.
2. We can prove this by contradiction. Assume that there are fewer than n distinct values: if so, there must be at least two distinct integer values for i in the range [0, n-1] such that (i*(i+1))/2 (mod n) is the same. Call these p and q, where p < q.
3. i.e. (p * (p+1)) / 2 = (q * (q+1)) / 2 (mod n)
4. => (p² + p) / 2 = (q² + q) / 2 (mod n)
5. => p² + p = q² + q (mod 2n)
6. => q² - p² + q - p = 0 (mod 2n)
7. Factorise => (q - p) (p + q + 1) = 0 (mod 2n)
8. (q - p) = 0 is the trivial case p = q.
9. (p + q + 1) = 0 (mod 2n) is impossible: our values of p and q are in the range [0, n-1], and q > p, so (p + q + 1) must be in the range [2, 2n-2].
10. As we are working modulo 2n, we must also deal with the tricky case where both factors are non-zero, but multiply to give 0 (mod 2n):
    - Observe that the difference between the two factors (q - p) and (p + q + 1) is (2p + 1), which is an odd number - so one of the factors must be even, and the other must be odd.
    - (q - p) (p + q + 1) = 0 (mod 2n) => (q - p) (p + q + 1) is divisible by 2n. If n (and hence 2n) is a power of 2, this requires the even factor to be a multiple of 2n (because all of the prime factors of 2n are 2, whereas none of the prime factors of our odd factor are).
    - But (q - p) has a maximum value of n-1, and (p + q + 1) has a maximum value of 2n-2 (as seen in step 9), so neither can be a multiple of 2n.
    - So this case is impossible as well.
11. Therefore the assumption that there are fewer than n distinct values (in step 2) must be false.

(If the table size is not a power of 2, this falls apart at step 10.)
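The result is also easy to check by brute force for any particular table size. The helper below is a hypothetical sketch (the name `triangular_probe_covers_table` is made up): it walks the first n probes exactly as the loop above does and counts how many distinct buckets they touch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Returns true if the offsets 0, 1, 3, 6, 10, ... (mod n) for probes
   i = 0 .. n-1 visit all n buckets. Assumes n > 0; returns false if
   the scratch array cannot be allocated. */
bool triangular_probe_covers_table(size_t n)
{
    bool *seen = calloc(n, sizeof *seen);
    if (seen == NULL)
        return false;

    size_t index = 0;     /* offset of the current probe, mod n */
    size_t distinct = 0;

    for (size_t step = 1; step <= n; step++) {
        if (!seen[index]) {
            seen[index] = true;
            distinct++;
        }
        index = (index + step) % n;   /* same update as the probe loop */
    }

    free(seen);
    return distinct == n;
}
```

For n = 8 or n = 16 this returns true. For a prime size such as n = 7 it returns false: the offsets are 0, 1, 3, 6, 3, 1, 0, so only four of the seven buckets are ever reached.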

Matthew Slattery 2010-02-28 02:05:00

I'm using a prime number instead...

Nazgulled 2010-02-28 14:49:12
