views: 2685
answers: 9

I'm looking for an algorithm to be used in a racing game I'm making. The map/level/track is randomly generated, so I need to find two locations, start and goal, that make use of most of the map.

  • The algorithm is to work inside a two dimensional space
  • From each point, one can only traverse to the next point in four directions: up, down, left, right
  • Points can only be either blocked or nonblocked, only nonblocked points can be traversed

Regarding the calculation of distance, it should not be the "bird path", for lack of a better word. The path between A and B should be longer if there is a wall (or other blocking area) between them.

I'm unsure where to start; comments are very welcome and proposed solutions are preferably in pseudo code.

Edit: Right. After looking through gs's code I gave it another shot. Instead of Python, this time I wrote it in C++. But still, even after reading up on Dijkstra's algorithm, the flood fill and Hosam Aly's solution, I fail to spot any crucial difference. My code still works, but not as fast as you seem to be getting yours to run. Full source is on pastie. The only interesting lines (I guess) are the Dijkstra variant itself on lines 78-118.

But speed is not the main issue here. I would really appreciate it if someone would be kind enough to point out the differences between the algorithms.

  • In Hosam Aly's algorithm, is the only difference that he scans from the borders instead of from every node?
  • In Dijkstra's you keep track of and overwrite the distance walked, but not in flood fill; but that's about it?
A: 

Your description sounds to me like a maze routing problem. Check out the Lee Algorithm. Books about place-and-route problems in VLSI design may help you - Sherwani's "Algorithms for VLSI Physical Design Automation" is good, and you may find VLSI Physical Design Automation by Sait and Youssef useful (and cheaper in its Google version...)

Yuval F
Doesn't maze routing require that both start and target are defined? Or do you suggest that I run the Lee Algorithm from every node to every other node, similar to the proposed Dijkstra solution?
mizipzor
This would be too slow for your problem. This algorithm finds the way, but it's not optimized to find the distance.
Georg
Maze routing requires that a start and target(s) are defined. Its strong point is that it does consider obstacles. I suggest you use heuristics - start with point pairs with maximal "birds-fly" distance, and then run Lee's algorithm for these pairs.
Yuval F
+3  A: 

Assuming the map is rectangular, you can loop over all border points, and start a flood fill to find the most distant point from the starting point:

bestSolution = { start: (0,0), end: (0,0), distance: 0 };
for each point p on the border
    flood-fill the map from p to find the most distant point distantP and its distance newDistance
    if newDistance > bestSolution.distance
        bestSolution = { p, distantP, newDistance }
    end if
end loop

I guess this would be in O(n^2). If I am not mistaken, it's (L+W) * 2 * (L*W) * 4, where L is the length and W is the width of the map, (L+W) * 2 represents the number of border points over the perimeter, (L*W) is the number of points, and 4 is the assumption that flood-fill would access a point a maximum of 4 times (from all directions). Since n is equivalent to the number of points, this is equivalent to (L + W) * 8 * n, which should be better than O(n^2). (If the map is square, the order would be O(16 * n^1.5).)

Update: as per the comments, since the map is more of a maze (than one with simple obstacles as I was thinking initially), you could apply the same logic as above, but checking all points in the map (as opposed to border points only). This should be on the order of O(4n^2), which is still better than both F-W and Dijkstra's.

Note: Flood filling is more suitable for this problem, since every vertex is directly connected to at most 4 neighbours. A breadth first traversal of the map can yield results relatively quickly (in just O(n)). I am assuming that each point may be checked in the flood fill from each of its 4 neighbors, hence the coefficient in the formulas above.
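
To make this concrete, here is a minimal sketch of the idea in C++ (illustrative only; the Grid type and the bfsFarthest/findStartAndGoal names are mine, not code from this thread): a single-source BFS flood fill that returns the most distant reachable cell, driven by the border loop from the pseudocode above.

#include <queue>
#include <vector>

// Hypothetical grid type: grid[y][x] is true when the cell is traversable.
using Grid = std::vector<std::vector<bool>>;

struct Cell { int x, y; };
struct Farthest { Cell cell; int distance; };

// BFS "flood fill" from start: returns the reachable cell with the largest
// shortest-path distance from start (4-directional moves only).
Farthest bfsFarthest(const Grid& grid, Cell start) {
    const int h = (int)grid.size(), w = (int)grid[0].size();
    std::vector<std::vector<int>> dist(h, std::vector<int>(w, -1));
    std::queue<Cell> q;
    dist[start.y][start.x] = 0;
    q.push(start);
    Farthest best{start, 0};
    const int dx[4] = {1, -1, 0, 0}, dy[4] = {0, 0, 1, -1};
    while (!q.empty()) {
        Cell c = q.front(); q.pop();
        if (dist[c.y][c.x] > best.distance) best = {c, dist[c.y][c.x]};
        for (int i = 0; i < 4; ++i) {
            int nx = c.x + dx[i], ny = c.y + dy[i];
            if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
            if (!grid[ny][nx] || dist[ny][nx] != -1) continue;
            dist[ny][nx] = dist[c.y][c.x] + 1;
            q.push({nx, ny});
        }
    }
    return best;
}

// The border loop from the pseudocode above: try every open border cell as a
// start and keep the pair with the greatest distance.
void findStartAndGoal(const Grid& grid, Cell& bestStart, Cell& bestEnd) {
    const int h = (int)grid.size(), w = (int)grid[0].size();
    int bestDistance = -1;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            bool onBorder = (x == 0 || y == 0 || x == w - 1 || y == h - 1);
            if (!onBorder || !grid[y][x]) continue;
            Farthest f = bfsFarthest(grid, {x, y});
            if (f.distance > bestDistance) {
                bestDistance = f.distance;
                bestStart = {x, y};
                bestEnd = f.cell;
            }
        }
}

Each cell is visited at most once per BFS, and a BFS is started from border cells only, which is where the (L+W)*2 factor in the estimate above comes from.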

Update 2: I am thankful for all the positive feedback I have received regarding this algorithm. Special thanks to @gs for his review.

P.S. Any comments or corrections are welcome.

Hosam Aly
It's not guaranteed that the endpoints of the longest path lie on the border. It's likely, but depending on the maze generator there could be other solutions.
Georg
You're right @gs. I didn't think of it as a "maze", but rather a simpler map (with obstacles). But if the algorithm is modified to check each point (rather than just border points), wouldn't this be O(n^2) (compared with FW's O(n^3))?
Hosam Aly
The map will not have as many obstacles as it will have complex edges. The edges themselves will rarely be straight.
mizipzor
You could consider the "edges" to be what I call "obstacles". It shouldn't really make a difference. However, your clarification reveals that it could be a maze, so I hope someone will answer my question to @gs soon.
Hosam Aly
I'm unsure… the flood fill is an optimized Dijkstra. I'll try to implement Floyd-Warshall and your method and try it. Expect results in an hour or so.
Georg
Thanks @gs. I'm at work now so I can't really try it. I'm looking forward to your results.
Hosam Aly
@gs: Nice. Working on a Dijkstra variant right now. Couldn't figure out how to implement F-W. Looking forward to a report from you. :)
mizipzor
@mizipzor: Implementing F-W should be fairly simple: for (int k = 0; k < nPoints; ++k) for (int i = 0; i < nPoints; ++i) for (int j = 0; j < nPoints; ++j) distance[i][j] = min(distance[i][j], distance[i][k] + distance[k][j]); (note that k, the intermediate point, must be the outermost loop)
Hosam Aly
You can see my results in the answer. As you said, it's about n^2.
Georg
Please elaborate on your flood fill; I'm not sure it's clear to everyone.
Georg
I will, just give me a few minutes to test something that may make the comparison more interesting. Thanks a lot @gs! :)
Hosam Aly
Just got time to implement this algorithm fully. On a 17x34 tile map with a fairly complex shape, the algorithm found the two points in just over three seconds.
mizipzor
@mizipzor, I'm glad you chose my algorithm. Could you please share your map (and algorithm implementation if possible)? I may be able to optimize it further.
Hosam Aly
A: 

If your objects (points) do not move frequently you can perform such a calculation in much less than O(n^3) time.

All you need to do is break the space into large grid cells and pre-calculate the inter-cell distances. Then selecting point pairs that occupy the most distant cells is a matter of a simple table lookup. In the average case you will need to pair-wise check only a small set of objects.

This solution works if the distance metric is continuous. Thus if, for example, there are many barriers in the map (as in mazes), this method might fail.

bgbg
May I ask how this is possible when the map is like a maze?
Hosam Aly
For each grid you'd need to tally the entry/exit points (for passthroughs) and the distance from any points of interest within the sub-grid to the exits for that grid. Not the worst approach listed here either.
clintp
+1  A: 

Finished a Python mockup of the Dijkstra solution to the problem. The code got a bit long so I posted it somewhere else: http://refactormycode.com/codes/717-dijkstra-to-find-two-points-furthest-away-from-each-other

At the size I set, it takes about 1.5 seconds to run the algorithm for one node. Running it for every node takes a few minutes.

Doesn't seem to work though; it always reports the top-left and bottom-right corners as the longest path: 58 tiles. Which of course is true when you don't have obstacles. But even after adding a couple of randomly placed ones, the program still finds that pair the longest. Maybe it's still true; it's hard to tell without more advanced shapes.

But maybe it can at least show my ambition.

mizipzor
This is O(n^3), and therefore not a very good algorithm. Use Hosam Aly's algorithm instead.
Georg
Seems I at least got mine working now, but it takes a long time. I will look into your code @gs tomorrow; hopefully I will be able to make it faster. http://refactormycode.com/codes/717-dijkstra-to-find-two-points-furthest-away-from-each-other#refactor_144521
mizipzor
+6  A: 

Follow-up to the question about Floyd-Warshall vs. the simple algorithm of Hosam Aly:

I created a test program which can use both methods. Those are the files:

In all test cases Floyd-Warshall was slower by a great margin; this is probably because of the very limited number of edges (each vertex has only 4), so the density that normally works in this algorithm's favour is missing here.

These were the times; each time the field size was quadrupled, and 3 out of 10 cells were obstacles.

Size         Hosam Aly      Floyd-Warshall
(10x10)      0m0.002s       0m0.007s     
(20x20)      0m0.009s       0m0.307s
(40x40)      0m0.166s       0m22.052s
(80x80)      0m2.753s       -
(160x160)    0m48.028s      -

The running time of Hosam Aly's method seems to be quadratic, therefore I'd recommend using that algorithm. Also, the memory consumption of Floyd-Warshall is O(n^2), clearly more than needed. If you have any idea why Floyd-Warshall is so slow, please leave a comment or edit this post.

PS: I haven't written C or C++ in a long time, I hope I haven't made too many mistakes.

Georg
Just glancing at your timings, the H-A algorithm as you coded it is O(n^2), whereas the F-W algorithm you wrote is O(n^3). Possibly you mislabeled something?
casualcoder
@casualcoder: No, that's actually correct as far as I can tell. "H-A" is just a breadth-first search, which is O(n), so running it n times is O(n^2). Seems F-W is actually pretty slow on sparse graphs (as we have here -- each vertex has only 4 edges), but it's much faster for dense graphs.
j_random_hacker
@gs: Your code looks fine to me but for one small bug: the H-A code stores distances in the same place you check for the "." character that indicates a position is "unblocked" -- you need separate arrays for these or your code will fail whenever distance can be >= 46 (the ASCII code for '.')!
j_random_hacker
@j_random_hacker: That's why the distances are negative. If I were to write the code once more I'd use separate arrays. That would also be faster than copying the old map every time.
Georg
@gs: You're quite right, I missed that sorry. :)
j_random_hacker
Thanks a lot @gs for your review. I still have something that I want to discuss about Floyd's algorithm. How about if you make a small modification, by making the loop over `k` count 4 times only, and changing the if conditions in the inner loop to check the 4 directions explicitly?
Hosam Aly
... This should make it much faster, and it would be somehow similar to flood fill in running time. Could you please test it to make sure it would produce the same (correct) results? (I don't have gcc, and it has been a long time since I last coded C++, so I couldn't compile your code. Sorry.)
Hosam Aly
I haven't been successful yet, but I doubt it works, because in this case only 4 edges would be tried instead of all edges between the start and the end.
Georg
+4  A: 

I deleted my original post recommending the Floyd-Warshall algorithm. :(

gs did a realistic benchmark and guess what, F-W is substantially slower than Hosam Aly's "flood fill" algorithm for typical map sizes! So even though F-W is a cool algorithm and much faster than Dijkstra's for dense graphs, I can't recommend it anymore for the OP's problem, which involves very sparse graphs (each vertex has only 4 edges).

For the record:

  • An efficient implementation of Dijkstra's algorithm takes O(E log V) time for a graph with E edges and V vertices.
  • Hosam Aly's "flood fill" is a breadth first search, which is O(V + E); here that is O(V), since each vertex has at most 4 edges. This can be thought of as a special case of Dijkstra's algorithm in which no vertex can have its distance estimate revised.
  • The Floyd-Warshall algorithm takes O(V^3) time, is very easy to code, and is still the fastest for dense graphs (those graphs where vertices are typically connected to many other vertices). But it's not the right choice for the OP's task, which involves very sparse graphs. (A minimal sketch follows this list.)
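
For completeness, here is a minimal Floyd-Warshall sketch for such a grid (my own illustrative code, not from any of the posted programs; it assumes the open cells have been flattened into indices 0..n-1 and the matrix initialised with 0 on the diagonal, 1 for adjacent open cells and a large INF sentinel elsewhere):

#include <algorithm>
#include <vector>

// After the triple loop, dist[i][j] holds the shortest path length between
// cells i and j (or stays >= INF if no path exists).
void floydWarshall(std::vector<std::vector<int>>& dist) {
    const int n = (int)dist.size();
    const int INF = 1 << 29;                  // "no path yet" sentinel
    for (int k = 0; k < n; ++k)               // k (the intermediate) is outermost
        for (int i = 0; i < n; ++i) {
            if (dist[i][k] >= INF) continue;  // nothing to relax via k
            for (int j = 0; j < n; ++j)
                dist[i][j] = std::min(dist[i][j], dist[i][k] + dist[k][j]);
        }
}

With V = W*H cells this needs O(V^3) time and an O(V^2) matrix, which is exactly why it loses to one BFS per node on these sparse grids.
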
j_random_hacker
I support this, deleted my post about F-W too.
Georg
Why delete? Isn't a downvote better?
mizipzor
It had 13 upvotes, and I don't think enough people would have downvoted it.
Georg
Well, I can't downvote my own post, and with 7 upvotes it would have taken a while to get downvoted. Just getting rid of it avoids misleading people.
j_random_hacker
Thanks gs, good on you by the way.
j_random_hacker
Thank you everyone.
Hosam Aly
+4  A: 

It sounds like what you want are endpoints separated by the graph diameter. A fairly good and easy-to-compute approximation is to pick a random point, find the farthest point from it, and then find the farthest point from there. These last two points should be close to maximally separated.

For a rectangular maze, this means that two flood fills should get you a pretty good pair of starting and ending points.
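
A rough sketch of that heuristic (illustrative only; the adjacency-list Graph and farthestFrom are my own stand-ins for whatever flood fill you already have):

#include <queue>
#include <utility>
#include <vector>

// Adjacency list over the open cells (indices 0..n-1), assumed connected.
using Graph = std::vector<std::vector<int>>;

// Single-source BFS: returns the node with the greatest shortest-path
// distance from src (ties broken arbitrarily).
int farthestFrom(const Graph& g, int src) {
    std::vector<int> dist(g.size(), -1);
    std::queue<int> q;
    dist[src] = 0;
    q.push(src);
    int best = src;
    while (!q.empty()) {
        int u = q.front(); q.pop();
        if (dist[u] > dist[best]) best = u;
        for (int v : g[u])
            if (dist[v] == -1) { dist[v] = dist[u] + 1; q.push(v); }
    }
    return best;
}

// Two flood fills: from any node find the farthest node a, then from a find
// the farthest node b; (a, b) approximates the two ends of the diameter.
std::pair<int, int> approximateDiameterEndpoints(const Graph& g) {
    int a = farthestFrom(g, 0);
    int b = farthestFrom(g, a);
    return {a, b};
}

This is only two BFS passes, so it is O(n) overall, at the cost of not guaranteeing the true maximum.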

Boojum
This can yield incredibly bad results if, for example, the point first picked is enclosed by walls and can only reach 3 other points.
Georg
True. This is assuming a fully connected graph, of course.
Boojum
Which is ok in this case since all nodes will be connected. Guaranteed from the way the map is generated.
mizipzor
+3  A: 

Raimund Seidel gives a simple method using matrix multiplication to compute the all-pairs distance matrix on an unweighted, undirected graph (which is exactly what you want) in the first section of his paper On the All-Pairs-Shortest-Path Problem in Unweighted Undirected Graphs [pdf].

The input is the adjacency matrix and the output is the all-pairs shortest-path distance matrix. The run-time is O(M(n)*log(n)) for n points where M(n) is the run-time of your matrix multiplication algorithm.

The paper also gives the method for computing the actual paths (in the same run-time) if you need this too.

Seidel's algorithm is cool because the run-time is independent of the number of edges, but we actually don't care here because our graph is sparse. However, this may still be a good choice (despite the slightly-worse-than n^2 run-time) if you want the all pairs distance matrix, and this might also be easier to implement and debug than floodfill on a maze.

Here is the pseudocode:

Let A be the nxn (0-1) adjacency matrix of an unweighted, undirected graph, G

All-Pairs-Distances(A)
    Z = A * A
    Let B be the nxn matrix s.t. b_ij = 1 iff i != j and (a_ij = 1 or z_ij > 0)
    if b_ij = 1 for all i != j return 2B - A //base case
    T = All-Pairs-Distances(B)
    X = T * A
    Let D be the nxn matrix s.t. d_ij = 2t_ij if x_ij >= t_ij * degree(j), otherwise d_ij = 2t_ij - 1
    return D

To get the pair of points with the greatest distance we just return argmax_ij(d_ij)
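
For those who prefer compilable code, here is a rough C++ transliteration of the pseudocode above (my own sketch, assuming a connected graph and using a naive O(n^3) multiply; as the comment below points out, without a genuinely fast matrix multiplication routine this will not beat Floyd-Warshall):

#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;

// Naive O(n^3) multiplication; Seidel's bound assumes a fast (sub-cubic)
// routine would be plugged in here instead.
static Matrix multiply(const Matrix& a, const Matrix& b) {
    const std::size_t n = a.size();
    Matrix c(n, std::vector<long long>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// All-pairs distances of a connected, unweighted, undirected graph given as
// a 0/1 adjacency matrix A, following the recursion above.
Matrix allPairsDistances(const Matrix& A) {
    const std::size_t n = A.size();
    Matrix Z = multiply(A, A);

    // B: adjacency matrix of the "squared" graph (distance 1 or 2 in G).
    Matrix B(n, std::vector<long long>(n, 0));
    bool complete = true;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            if (i != j && (A[i][j] == 1 || Z[i][j] > 0)) B[i][j] = 1;
            if (i != j && B[i][j] == 0) complete = false;
        }

    if (complete) {                      // base case: return 2B - A
        Matrix D(n, std::vector<long long>(n, 0));
        for (std::size_t i = 0; i < n; ++i)
            for (std::size_t j = 0; j < n; ++j)
                D[i][j] = 2 * B[i][j] - A[i][j];
        return D;
    }

    Matrix T = allPairsDistances(B);     // distances in the squared graph
    Matrix X = multiply(T, A);

    std::vector<long long> degree(n, 0); // degree(j) = sum_k A[k][j]
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t k = 0; k < n; ++k)
            degree[j] += A[k][j];

    Matrix D(n, std::vector<long long>(n, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            D[i][j] = (X[i][j] >= T[i][j] * degree[j]) ? 2 * T[i][j]
                                                       : 2 * T[i][j] - 1;
    return D;
}

The start/goal pair is then the argmax over d_ij as described above.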

Imran
Interesting, but the sub-O(n^3) matrix multiplication algorithms are tricky to code and have large overheads -- according to Wikipedia, the O(n^2.376) algorithm is "not practical for matrices that can fit on today's computers." So in practice you'd use the O(n^3) algo, making this slower than F-W.
j_random_hacker
+1  A: 

Ok, "Hosam's algorithm" is a breadth first search with a preselection on the nodes. Dijkstra's algorithm should NOT be applied here, because your edges don't have weights.

The difference is crucial, because if the weights of the edges vary, you need to keep a lot of options (alternate routes) open and check them with every step. This makes the algorithm more complex. With the breadth first search, you simply explore all edges once in a way that guarantees that you find the shortest path to each node, i.e. by exploring the edges in the order you find them.

So basically the difference is Dijkstra's has to 'backtrack' and look at edges it has explored before to make sure it is following the shortest route, while the breadth first search always knows it is following the shortest route.

Also, in a maze the points on the outer border are not guaranteed to be part of the longest route. For instance, if you have a maze in the shape of a giant spiral, but with the outer end going back to the middle, you could have two points one at the heart of the spiral and the other in the end of the spiral, both in the middle!

So, a good way to do this is to run a breadth first search from every point, but skip points you have already used as a starting point (you already know all the routes to and from them, and distances are symmetric). The complexity of breadth first search is O(n), where n = |V| + |E|. We do this once for every node in V, so it becomes O(n^2).