views:

53

answers:

3
+1  Q: 

Hashing a range

I have an array of range elements. Every element is has a start and an end. Within the array, the ranges are not overlapping, and they are sorted.

i.e. (code just to illustrate, do not expect it to compile):

var arr = { [0,3], [5,10], [15,59] };

Given a value (say 9), is there a hash function of the ranges that will allow me to quickly get the element that has the range that contains the value?

Of course there is the naive solution, just cycle though each element until you find the right one; the more elaborate one, like binary search by the start of the range; and the patented one of creating a binary tree with the ranges.

But does anybody knows of a way to use a hash?

+2  A: 

You could precompute the nearest neigbour in advance and store it somewhere. In your example the table has 0..59 entries and you store the index of the nearest range at every index.

That way it will be really fast.

schoetbi
I like this solution. Your lookup value then just becomes the index into your table and the table returns the index of arr[]. 2 lookups and you are done. It'll get memory intensive quickly, but it solves the problem nicely.
Michael Dorgan
Notice that in the example there are holes between ranges, there are no 0 to 59 entries. True that the holes can be replaced with nulls, but it defeats the ability to do it in ranges, memory consumption sky rocket. the ranges that I am seeing are between 0, and 2+ millions, with a lot of holes in between, and I need to keep 60k or more of these sets in memory (hence why ranges)This is why I am looking for a hashing solution. Hash the range in a way that I can answer the question: Is this position in a range (and which one), or it falls in a hole?
JorgeLeo
This does not use a hash (even if I doubt a hash is a good solution here).
Frank
+1  A: 

What about using a Dictionary<int, Element> to hold all numbers in all ranges, i. e. adding the four entries

0, [0,3]
1, [0,3]
2, [0,3]
3, [0,3]

and so on for the other ranges. It uses a hash by using the Dictionary, but I doubt it is the most efficient solution.

Frank
This defeat the purpose of dealing with ranges of values, instead of individual values. It will solve the problem, but it will create a bigger one in terms of memory
JorgeLeo
As I stated above, it uses hashing as required, but it is not very efficient.
Frank
A hash is a one way function that takes a piece if information and returns a value. What you have there is a lookup table, or a hash table if you must; because of memory constrains, I need a function. Thank you anyways.
JorgeLeo
A: 

You could create an index for the start values to index in arr.

var arr = { [0,3], [5,10], [15,59] };

var dictStart = new Dictionary<int, int>();
dictStart.Add(0, 0);
dictStart.Add(5, 1);
dictStart.Add(15, 2);

Now to get the range you can do a binary search on the values of the dictStart dict for the first element that is higher that the value to search for. Then take the previous entry.

Difficult to explain. As an example look up 9. Do the search what got element e=[5, 1].

var range = arr[e[1]];   // = [5, 10]
bool isWithin = val <= range[1] && val >= range[0];

So that way it will be less memory invasive. The key is to have a fast search fro the start values of the ranges.

I guess it will solve the problem but it is not a hash.

schoetbi
Not exactly what I was looking for, but it gave me an idea.Normalize the end of the ranges, then use the last 4 bits of each normalized start range to calculate the bucket. The hashing can get into the form of int[][], where the first array is the bucket, and the jagged is the list of ranges that belong to that bucket.Thank you
JorgeLeo