ansaurus

Question

Answer 1

A:

I think all you need is a key class whose hashCode() and equals() do what you want. I suspect you'll run into a problem, though, because overlap is not transitive: A can overlap B (i.e. A.equals(B) == true) and B can overlap C, while C doesn't overlap A. An equals() method like that breaks the equals() contract, so you'll probably get strange behaviour.
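To make the transitivity problem concrete, here is a minimal sketch. The class OverlapKey and its fields are made up for illustration and are not from the question; closed integer intervals are assumed.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical key class (illustration only): equality is defined as "overlaps".
final class OverlapKey {
    final int start;
    final int end;

    OverlapKey(int start, int end) {
        this.start = start;
        this.end = end;
    }

    // Closed intervals overlap unless one ends before the other starts.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof OverlapKey)) return false;
        OverlapKey other = (OverlapKey) o;
        return !(end < other.start || other.end < start);
    }

    // Any two intervals both overlap some wider interval, so the rule
    // "equal objects have equal hashes" forces a constant hash here.
    @Override
    public int hashCode() {
        return 0;
    }

    // Number of keys a HashSet keeps when they are added in the given order.
    static int distinctCount(OverlapKey... keys) {
        Set<OverlapKey> set = new HashSet<>();
        for (OverlapKey k : keys) set.add(k);
        return set.size();
    }
}

final class OverlapKeyDemo {
    public static void main(String[] args) {
        OverlapKey a = new OverlapKey(1, 5);
        OverlapKey b = new OverlapKey(4, 8);
        OverlapKey c = new OverlapKey(7, 10);

        System.out.println(a.equals(b)); // true
        System.out.println(b.equals(c)); // true
        System.out.println(a.equals(c)); // false: equals() is not transitive

        // The same three keys collapse differently depending on insertion order.
        System.out.println(OverlapKey.distinctCount(b, a, c)); // 1
        System.out.println(OverlapKey.distinctCount(a, c, b)); // 2
    }
}
```

The order-dependent set size is the kind of strange behaviour to expect from any hash-based grouping built on such a key.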

Basically, you want something like stabbing queries on a segment tree: to find all intervals E that overlap an interval (p1.start, p1.end), perform stabbing queries for p1.start and p1.end.
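A rough sketch of the query logic, under some loud assumptions: intervals are closed with integer endpoints, the names StabbingSketch, Interval, stab and overlapping are invented here, and a plain linear scan stands in for the segment tree (a real tree would answer each stab in O(log n + k)). The stab at p1.end is swapped for a range check on start points, because two endpoint stabs alone would miss an interval lying strictly inside the query.

```java
import java.util.ArrayList;
import java.util.List;

final class StabbingSketch {
    // Hypothetical closed interval [start, end].
    static final class Interval {
        final int start;
        final int end;

        Interval(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + "]";
        }
    }

    // Stabbing query: every interval that contains point p.
    // Linear scan for brevity; this is the part a segment tree would speed up.
    static List<Interval> stab(List<Interval> intervals, int p) {
        List<Interval> hits = new ArrayList<>();
        for (Interval i : intervals) {
            if (i.start <= p && p <= i.end) hits.add(i);
        }
        return hits;
    }

    // Overlap query for [qs, qe]: an interval overlaps the query if it
    // contains qs, or if it starts after qs but no later than qe.
    // The two cases are disjoint, so no interval is reported twice.
    static List<Interval> overlapping(List<Interval> intervals, int qs, int qe) {
        List<Interval> hits = new ArrayList<>(stab(intervals, qs));
        for (Interval i : intervals) {
            if (i.start > qs && i.start <= qe) hits.add(i);
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Interval> data = new ArrayList<>();
        data.add(new Interval(1, 5));
        data.add(new Interval(4, 8));
        data.add(new Interval(7, 10));
        data.add(new Interval(12, 15));

        System.out.println(overlapping(data, 6, 9)); // [[4,8], [7,10]]
    }
}
```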

That said, I don't know of a complete answer to your question. Searching for "segment tree" hadoop might get you started.

sfussenegger 2009-12-02 10:56:01

Answer 2

A:

What do you expect the output to look like? Knowing that would be helpful :-)

Peter Wippermann 2009-12-08 15:29:01

I expect the same output as the following SQL query: "select A.start, A.end, B.start, B.end from A, B where NOT (A.end < B.start or B.end < A.start);"
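For reference, the predicate in that query can be spelled out as a brute-force nested loop. This is only a sketch of the expected result set, not a Hadoop job; the class name IntervalJoinSketch and the int[]{start, end} row layout are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IntervalJoinSketch {
    // Each input row is {start, end}. Returns one row
    // {A.start, A.end, B.start, B.end} per overlapping pair,
    // mirroring the WHERE clause of the SQL query.
    static List<int[]> join(int[][] a, int[][] b) {
        List<int[]> rows = new ArrayList<>();
        for (int[] x : a) {
            for (int[] y : b) {
                boolean disjoint = x[1] < y[0] || y[1] < x[0];
                if (!disjoint) {
                    rows.add(new int[] {x[0], x[1], y[0], y[1]});
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 5}, {10, 12}};
        int[][] b = {{4, 8}, {20, 30}};
        for (int[] row : join(a, b)) {
            System.out.println(Arrays.toString(row)); // [1, 5, 4, 8]
        }
    }
}
```

The loop compares every pair, so it is quadratic in the input size.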

Pierre 2009-12-08 17:37:45

Hadoop: intervals and JOIN
