+7  Q:

An algorithm to solve a simple(?) array problem.

Hello! For this problem speed is pretty crucial. I've drawn a nice image to explain the problem better. The algorithm needs to determine the following: if the edges of a rectangle are extended within the confines of the canvas, will any extended edge intersect another rectangle?

We know:

1. The size of the canvas
2. The size of each rectangle
3. The position of each rectangle

The faster the solution is the better! I'm pretty stuck on this one and don't really know where to start.

Cheers

+2  A:

Lines that are not parallel to each other are going to intersect at some point. Calculate the slopes of each line and then determine what lines they won't intersect with.

Start with that, and then let's see how to optimize it. I'm not sure how your data is represented and I can't see your image.

Using slopes is a simple equality check which probably means you can take advantage of sorting the data. In fact, you can probably just create a set of distinct slopes. You'll have to figure out how to represent the data such that the two slopes of the same rectangle are not counted as intersecting.

EDIT: Wait.. how can two rectangles whose edges go to infinity not intersect? Rectangles are basically two lines that are perpendicular to each other. Shouldn't that mean it always intersects with another if those lines are extended to infinity?

Sorry, infinity is the wrong word, I mean 'carried on within confines of canvas'. That image sometimes doesn't show for me as well, refreshing seems to bring it up most of the time.
+6  A:

Just create the set of intervals for each of the X and the Y axis. Then for each new rectangle, see if there are intersecting intervals in the X or the Y axis. See here for one way of implementing the interval sets.

In your first example, the interval set on the horizontal axis would be `{ [0-8], [0-8], [9-10] }`, and on the vertical: `{ [0-3], [4-6], [0-4] }`

This is only a sketch, I abstracted many details here (e.g. usually one would ask an interval set/tree "which intervals overlap this one", instead of "intersect this one", but nothing not doable).
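A minimal brute-force sketch of the per-axis check, in Python. It assumes axis-aligned rectangles, and it assumes that an extended edge which only touches another rectangle's boundary does not count as an intersection (the question does not say either way). The name `cutting_edges` and the tuple layout are illustrative, not part of the answer. An interval tree or a sort would replace the inner loop.

```python
def cutting_edges(intervals):
    """Return (i, coord, j) triples where an endpoint of interval i
    lies strictly inside interval j on the same axis."""
    hits = []
    for i, (lo_i, hi_i) in enumerate(intervals):
        for coord in (lo_i, hi_i):
            for j, (lo_j, hi_j) in enumerate(intervals):
                # Strict inequalities: touching a boundary is assumed to be fine.
                if i != j and lo_j < coord < hi_j:
                    hits.append((i, coord, j))
    return hits

# Interval sets from the first example in the answer.
x_intervals = [(0, 8), (0, 8), (9, 10)]
y_intervals = [(0, 3), (4, 6), (0, 4)]

x_hits = cutting_edges(x_intervals)
y_hits = cutting_edges(y_intervals)
```

Under these assumptions `x_hits` is empty and `y_hits` contains one entry: the edge at 3 of the first interval falls strictly inside `[0-4]`.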

Edit

Please watch this related MIT lecture (it's a bit long, but absolutely worth it). Even if you find simpler solutions (than implementing an augmented red-black tree), it's good to know the ideas behind these things.

I'm not sure an interval tree is worth the trouble if there's only a single query. I think you can do the same thing in `O(n log n)` with a sort, which is probably going to be faster in practice than interval trees.
+1, got the same algorithm. Took me ages to format it though..
+1. I think I just posted the same concept, just without efficiency =/
I like this solution but there is a lot there I'm going to have to read up on!
Thanks for the comments guys. Please do note that I only posted this minutes after the question, it's merely a sketch and nowhere near a complete solution, and application specific fine-tuning will be needed (e.g. IVlad's comment). I'm also updating the answer with a link I most warmly recommend.
+1  A:

since you didn't mention the language you chose to solve the problem, i will use some kind of pseudo code

the idea is that if everything is ok, then a sorted collection of rectangle edges along one axis should be a sequence of non-overlapping intervals.

1. number all your rectangles, assigning them individual ids
2. create an empty binary tree collection (btc). this collection should have a method to insert an integer node with info btc::insert(key, value)
3. for all rectangles, do:
``````
foreach rect in rects do
    btc.insert(rect.top, rect.id)
    btc.insert(rect.bottom, rect.id)
``````
4. now iterate through the btc (this will give you a sorted order)
``````
btc_item = btc.first()
do
    id = btc_item.id
    btc_item = btc.next()
    // the next edge in sorted order must belong to the same rectangle
    if(id != btc_item.id)
        then report_invalid_placement(id, btc_item.id)
    btc_item = btc.next()
while btc_item is valid
``````

5, 6, 7 - repeat steps 2, 3, 4 for rect.left and rect.right coordinates
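The same pairing idea can be sketched in Python with a plain sort instead of a tree. This is only a sketch under assumptions: the name `find_interleaved` and the `(rect_id, lo, hi)` layout are made up for illustration, and ties are broken so that an end sorts before a start at the same coordinate, which means intervals that merely touch are not reported. Rectangles with an identical extent on the axis are still reported, so that case would need separate handling if it should be allowed.

```python
def find_interleaved(intervals):
    """intervals: list of (rect_id, lo, hi) along one axis.
    Returns id pairs whose endpoints interleave in sorted order."""
    events = []
    for rect_id, lo, hi in intervals:
        events.append((lo, True, rect_id))    # start of the interval
        events.append((hi, False, rect_id))   # end of the interval
    # False sorts before True, so at equal coordinates an end precedes a start.
    events.sort()
    bad = []
    # Walk the sorted endpoints two at a time: each pair should be the
    # start and end of one and the same rectangle.
    for k in range(0, len(events), 2):
        first_id, second_id = events[k][2], events[k + 1][2]
        if first_id != second_id:
            bad.append((first_id, second_id))
    return bad
```

A non-empty result means the placement is invalid on that axis. The pairs listed after the first mismatch are only indicative, because the pairing can be out of step from that point on.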

Why use a bst when you can just sort them with your favorite sorting algorithm? I think a tree should only be used if there are multiple queries such as "insert a new rectangle" and "do any edges potentially intersect?". And even then it should be an interval tree, not a bst, otherwise you have `O(n)` query time.
the idea of using a tree is that it is sorted upon inserting new values. with a separate sort you will need to 1) create an additional container and 2)sort it. insertion into a bst is an o(log n) operation
1) the bst also needs an additional container, sorting might actually not need it. 2) sorting is `O(n log n)`, inserting `n` items in a bst is also `O(n log n)`. 3) a balanced tree is much slower in practice than a sorting algorithm, it's only useful for multiple queries and 4) each bst query is `o(n)` in your implementation, so a bst is no better than sorting even for multiple queries. An interval tree might be, but again, only for multiple queries.
ok, sure both ways need an additional container, the question is how many times you will need to iterate over the sequence of rectangles. in case of sort, you first fill the container o(n), then sort the container o(n log n). in case of a tree you just fill the tree o(n log n).
not at all, iterating over a tree with an iterator is an o(1) operation
@ULysses - so are you saying the `do .. while` loop in your code is `O(1)`?
no, it's `o(n)`. quote from your comment: `4) each bst query is o(n) in your implementation,`. i understood this in a way that you consider bst.next() as o(n), which is wrong. in case of using sort you will still need to iterate over the sorted vector, which will give you the same o(n)
@ULysses - by query I mean finding if there's an intersection. Unless there are multiple queries, a balanced tree will be much slower than sorting.
IVlad, i think we have just discussed this, and again you say that it is slower. No, it isn't, look at my previous comments to know why. Using a tree in this scenario may look strange to you, but that doesn't mean it is inefficient.
@ULysses - just because you **might** need to use another container and first copy data to it doesn't mean sorting is slower, and you might be able to just sort the original container in place. You can also copy while you sort, which means only extra memory, which is also true for the bst. Balanced search trees are much slower than sorting, go ahead and test it if you think trees are some kind of silver bullet. They're not, they only pay off when you have a lot of queries, sorting is otherwise preferred.
+1  A:

I like this question. Here is my try to get on it:

If possible: Create a polygon from each rectangle. Treat each edge as a line of maximum length that must be clipped. Use a clipping algorithm to check whether or not a line intersects with another. For example this one: Line Clipping

But keep in mind: If you find an intersection which is at the vertex position, it's a valid one.

+1  A:

Here's an idea. Instead of creating each rectangle with `(x, y, width, height)`, instantiate them with `(x1, y1, x2, y2)`, or at least have it interpret these values given the width and height.

That way, you can check which rectangles have a similar `x` or `y` value and make sure the corresponding rectangle has the same secondary value.
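The corner representation could be sketched like this in Python. The class name `Rect` and the helper `from_size` are illustrative assumptions, not something from the question:

```python
from typing import NamedTuple

class Rect(NamedTuple):
    x1: int
    y1: int
    x2: int
    y2: int

    @classmethod
    def from_size(cls, x, y, width, height):
        # (x1, y1) is the origin corner, (x2, y2) the opposite corner.
        return cls(x, y, x + width, y + height)

# A rectangle at (9, 0) that is 1 wide and 4 high, stored as corners.
square4 = Rect.from_size(9, 0, 1, 4)
```

With this, the x values of a rectangle are `(r.x1, r.x2)` and the y values are `(r.y1, r.y2)`, which is the form used in the comparisons that follow.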

Example:

The rectangles you have given have the following values:

• Square 1: [0, 0, 8, 3]
• Square 3: [0, 4, 8, 6]
• Square 4: [9, 0, 10, 4]

First, we compare `Square 1` to `Square 3` (no collision):

• Compare the x values
• [0, 8] to [0, 8] These are exactly the same, so there's no crossover.
• Compare the y values
• [0, 3] to [4, 6] None of these numbers are similar, so they're not a factor

Next, we compare `Square 3` to `Square 4` (collision):

• Compare the x values
• [0, 8] to [9, 10] None of these numbers are similar, so they're not a factor
• Compare the y values
• [4, 6] to [0, 4] The rectangles have the number 4 in common, but 0 != 6, therefore, there is a collision

By now we know that a collision will occur, so the method will end, but let's evaluate `Square 1` and `Square 4` for some extra clarity.

• Compare the x values
• [0, 8] to [9, 10] None of these numbers are similar, so they're not a factor
• Compare the y values
• [0, 3] to [0, 4] The rectangles have the number 0 in common, but 3 != 4, therefore, there is a collision

Let me know if you need any extra details :)

That looks like a great solution, I'll test it out!
this approach compares every rectangle with every other rectangle. This is not efficient since it yields n(n-1)/2 rectangle comparisons, which means O(n^2) comparisons of all four coordinates
@ULysses: I mentioned in an earlier comment in this thread that my concept is not built for efficiency, true. My goal was to present this concept in the easiest way possible, which sometimes involves a slow step-by-step methodology. You could very well perform something like Dimitris mentioned in his answer, which would be a lot less costly.
A:

you should take a look at the "SAT", the "Separating Axis Theorem"
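For axis-aligned rectangles the separating axis test reduces to comparing the projections on the x and y axes. A minimal Python sketch, assuming `(x1, y1, x2, y2)` tuples, assuming that touching does not count as overlap, and assuming a canvas 10 units wide (the canvas size is not given in the question). The name `overlaps` is illustrative. An extended horizontal edge is modelled here as a zero-height rectangle spanning the canvas.

```python
def overlaps(a, b):
    """a, b: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2."""
    # A gap (or mere contact) on either axis is a separating axis.
    separated_on_x = a[2] <= b[0] or b[2] <= a[0]
    separated_on_y = a[3] <= b[1] or b[3] <= a[1]
    return not (separated_on_x or separated_on_y)

CANVAS_WIDTH = 10  # assumed for the example

# Edge y = 3 of the first rectangle, extended across the canvas.
edge_y3 = (0, 3, CANVAS_WIDTH, 3)
# Edge y = 4 of the third rectangle, extended across the canvas.
edge_y4 = (0, 4, CANVAS_WIDTH, 4)

cuts_square4 = overlaps(edge_y3, (9, 0, 10, 4))
cuts_square3 = overlaps(edge_y4, (0, 4, 8, 6))
```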

A:

Heh, taking the overlapping intervals answer to the extreme, you simply determine all distinct intervals along the x and y axis. For each cutting line, do an upper bound search along the axis it will cut based on the interval's starting value. If you don't find an interval or the interval does not intersect the line, then it's a valid line.

The slightly tricky part is to realize that valid cutting lines will not intersect a rectangle's bounds along an axis, so you can combine overlapping intervals into a single interval. You end up with a simple sorted array (which you fill in O(n) time) and an O(log n) search for each cutting line.
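A rough Python sketch of this merge-then-search idea, using the standard `bisect` module. The function names are illustrative. It assumes that a line touching a rectangle's boundary is still valid, so intervals that only touch are kept separate when merging. This sketch sorts the intervals first, so building the array costs O(n log n) here.

```python
from bisect import bisect_right

def merge_intervals(intervals):
    """Merge strictly overlapping (lo, hi) intervals into a sorted list."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo < merged[-1][1]:
            # Overlaps the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    return merged

def is_valid_cut(merged, starts, coord):
    """True if coord is not strictly inside any merged interval."""
    k = bisect_right(starts, coord) - 1  # last interval starting at or before coord
    if k < 0:
        return True
    lo, hi = merged[k]
    return not (lo < coord < hi)

# Vertical-axis intervals from the example rectangles.
merged_y = merge_intervals([(0, 3), (4, 6), (0, 4)])
starts_y = [lo for lo, _ in merged_y]
```

Here `merged_y` comes out as `[[0, 4], [4, 6]]`, so a horizontal line at 3 is rejected while lines at 0, 4 and 6 are accepted.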