views:

92

answers:

1

I'm looking for a hash function over sets H(.) and a relation R(.,.) such that if A is included in B then R(H(A), H(B)). Of course, R(.,.) must be easy to verify (constant time), and H(A) should be computed in linear time.

One example of H and R is:

  • H(A) = OR over 1 << (h(x) % k), for x in A, k a fixed integer and h(x) a hash function over integers.
  • R(H(A), H(B)) = ((H(A) & H(B)) == H(A))

Are there any other good examples? (good is hard to define but intuitively if R(H(A), H(B)) then whp A is included in B).

+3  A: 
  • After thinking about this, I ended up with the example you gave. I.e. each element in B sets a bit in the hash, and A is only contained in B if each bit which is set in H(A) is also set in H(B).
  • Maybe a Bloom filter is applicable in your case. It seems to use the same bit trick, but with multiple hash functions.
Sjoerd
I kept thinking today how to make use of Bloom filters. A small k is useful when you do extremal set searching because you can partition sets in 2^k sets and reduce the number of total tests. The idea I had was to choose k bits from the bloom filter.
Alexandru