views:

25688

answers:

16

How do map providers (such as Google or Yahoo! Maps) suggest directions?

I mean, they probably have real-world data in some form, certainly including distances but also perhaps things like driving speeds, presence of sidewalks, train schedules, etc. But suppose the data were in a simpler format, say a very large directed graph with edge weights reflecting distances. I want to be able to quickly compute directions from one arbitrary point to another. Sometimes these points will be close together (within one city) while sometimes they will be far apart (cross-country).

Graph algorithms like Dijkstra's algorithm will not work because the graph is enormous. Luckily, heuristic algorithms like A* will probably work. However, our data is very structured, and perhaps some kind of tiered approach might work? (For example, store precomputed directions between certain "key" points far apart, as well as some local directions. Then directions for two far-away points will involve local directions to a key points, global directions to another key point, and then local directions again.)

What algorithms are actually used in practice?

PS. This question was motivated by finding quirks in online mapping directions. Contrary to the triangle inequality, sometimes Google Maps thinks that X-Z takes longer and is farther than using an intermediate point as in X-Y-Z. But maybe their walking directions optimize for another parameter, too?

PPS. Here's another violation of the triangle inequality that suggests (to me) that they use some kind of tiered approach: X-Z versus X-Y-Z. The former seems to use prominent Boulevard de Sebastopol even though it's slightly out of the way.

Edit: Neither of these examples seem to work anymore, but both did at the time of the original post.

+11  A: 

See this paper:

http://avglab.com/andrew/pub/alenex06.pdf

The Google Maps anomaly is really strange, I wonder what path is actually better.

jpalecek
Thanks for the paper. With regards to the anomaly, I have more than 20 more where that came from.
A. Rex
Abstract: We study the point-to-point shortest path prob. in a setting where preproc. is allowed. We improve the reach-based approach of Gutman [17] in several ways. In particular, we introduce a bidir. version of the alg. that uses implicit lower bounds and we add shortcut arcs to reduce vertex reaches. Our modifications greatly improve both preprocessing and query times. The resulting algorithm is as fast as the best previous method, due to Sanders and Schultes [28]. However, our algorithm is simpler and combines in a natural way with A∗ search, which yields significantly better query times.
Vebjorn Ljosa
+1  A: 

This is pure speculation on my part, but I suppose that they may use an influence map data structure overlaying the directed map in order to narrow the search domain. This would allow the search algorithm to direct the path to major routes when the desired trip is long.

Given that this is a Google app, it's also reasonable to suppose that a lot of the magic is done via extensive caching. :) I wouldn't be surprised if caching the top 5% most common Google Map route requests allowed for a large chunk (20%? 50%?) of requests to be answered by a simple look-up.

Parappa
Do you have a good reference for "an influence map data structure"? Thanks!
A. Rex
A: 

FWIW a bloke of my aquaintance did a toy route finding app with a genetic algorithm which was surprisingly quick and effective.

mr calendar
http://dx.doi.org/10.1109/CIG.2007.368081looks related. I've seen demos online that show how to use GAs for travelling salesman. I wonder if there's a similar page for this.
A. Rex
@ Mitch Wheat, sorry, a colleague knocked a demo up (some years ago now) and showed it to me - I can't give chapter and verse I'm afraid.
mr calendar
+6  A: 

Graph algorithms like Dijkstra's algorithm will not work because the graph is enormous.

This argument doesn't necessarily hold because Dijkstra will not usually look at the complete graph but rather just a very small subset (the better interconnected the graph, the smaller this subset).

Dijkstra may actually perform rather well for well-behaved graphs. On the other hand, with careful parametrization A* will always perform just as good, or better. Have you already tried how it would perform on your data?

That said, I'd also be very interested to hear about other peoples' experiences. Of course, prominent examples like Google Map's search are particularly interesting. I could imagine something like a directed nearest neighbour heuristic.

Konrad Rudolph
Suppose you are trying to find directions from point A to B, the optimal distance for which is d. Dijkstra's algorithm will, at the very least, examine all points at distance at most d from A. If A is San Francisco and B is Boston, this means it examines most of the US. N'est-ce pas?
A. Rex
Yes, it is. What I wanted to get at is that A* can be used instead and that it always finds an optimal solution (even though it uses a heuristic)!
Konrad Rudolph
Yes, of course. After I wrote my original post, I thought about the word "heuristic" that I used. It's a bit inaccurate here, because it does find an optimal solution.
A. Rex
Well, the point is that A* *uses* a heuristic (as opposed to *being* one) to determine the next step. This one decision can indeed be suboptimal. But it only affects runtime, not the result, since it only determines how much better than Dijstra it guesses.
Konrad Rudolph
+4  A: 

I read somewhere that Google maps uses A* - the article was about how their super-fast A* implementation enabled the real-time dragging of route which they now offer.

Will Dean
source ?
borisCallens
I think this is the article he is referencing: http://googleblog.blogspot.com/2007/11/road-to-better-path-finding.html ... I don't see anything about A* though
FryGuy
The article actually implies that they use precomputed partial solutions that are stitched together, and that the precomputation was in itself a computationally complex project and took 10 months on a gigantic cluster network, using Google's MapReduce-API.
Konrad Rudolph
Yeah. I'd love to know what exactly they precomputed, and how they stitch things together ...
A. Rex
A: 

I see what's up with the maps in the OP:

Look at the route with the intermediate point specified: The route goes slightly backwards due to that road that isn't straight.

If their algorithm won't backtrack it won't see the shorter route.

Loren Pechtel
Interesting idea. I've added another violation in my PPS to the OP. Please take a look and see if you can see an explanation there.
A. Rex
Zoom *WAY* down on point A--one click from max. Note how the three-point route goes west, south, east. I think we are looking at an algorithm that doesn't like to backtrack unless it was necessary to go through a chokepoint.
Loren Pechtel
In my PPS example, change the starting address to "10 Avenue de Flandre, 75019 Paris". This removes the little backtrack that you're talking about but the problem is still there. I think the main issue is that it really wants to stay on that main Blvd ...
A. Rex
I think I found it in this case: Do those by car and the timings make sense. It probably sees the big road as faster and the walking route doesn't throttle it.
Loren Pechtel
P.S.: The initial problem also makes sense by this standard, it might not be the backtrack that caused it.
Loren Pechtel
+180  A: 

Speaking as someone who spent 18 months working at a mapping company, which included working on the routing algorithm... yes, Dijkstra's does work, with a couple of modifications:

  • Instead of doing Dijkstra's once from source to dest, you start at each end, and expand both sides until they meet in the middle. This eliminates roughly half the work (2*pi*(r/2)^2 vs pi*r^2).
  • To avoid exploring the back-alleys of every city between your source and destination, you can have several layers of map data: A 'highways' layer that contains only highways, a 'secondary' layer that contains only secondary streets, and so forth. Then, you explore only smaller sections of the more detailed layers, expanding as necessary. Obviously this description leaves out a lot of detail, but you get the idea.

With modifications along those lines, you can do even cross-country routing in a very reasonable timeframe.

Nick Johnson
Someone who worked on this in the real world, awesome! Do you have any idea how much it's possible to precompute, as in the article about Google mentioned in another comment?
A. Rex
We didn't do any preprocessing of that nature, but it certainly seems like an interesting optimisation.
Nick Johnson
Yeah, it seems Google does it. See the comments on Will Dean's answer.
A. Rex
Did you ever try A* instead of bidirectional Djikstra? Did you have coordinates for the vertices? If so, it sounds like you could do quite better with some heuristics.
Pål GD
A* is likely to be more efficient, but it's only guaranteed to produce _a_ solution, not necessarily the most efficient one. That can be a problem for users, where an inefficient solution translates into lost time or money, or the user concluding your software sucks. Still, you could probably build in some heuristic about when to abandon A* for Dijkstra's.
Nick Johnson
"it's only guaranteed to produce a solution, not necessarily the most efficient one"This is untrue; as long as the heuristic used is admissible, the A* algorithm produces the least-cost path. Admissible means that the cost is never over-estimated, but it may be underestimated (but poor estimates will slow the algorithm). Using data from pre-processing is likely to aid in making a better admissible heuristic, and therefore making A* work faster.
a1kmm
I don't see how A* with an estimator like that can be more efficient than dijkstra's, then. If it finds a path of cost n, in order to be certain there are no cheaper paths, it must have explored all paths of cost <n, which is exactly what Dijkstra's does.
Nick Johnson
Actually, on further consideration, you're entirely right. You could enhance an existing algorithm to make use of this by simply adding the Great Circle Distance between the target node and the destination to the cost of the edge being evaluated. I'm actually not sure if our algorithm did that - but it's a very sensible optimization.
Nick Johnson
A simple A* heuristic uses the geometric distance to the destination divided by the maximum possible travel speed. Thus, A* will proceed by searching paths that are headed in the right direction. Dijkstra's is terrible because it stupidly searches in all directions.
John C
A*, in the worst case (a heuristic that says all paths are equivalent), is exactly equal to Djikstra's.
Tordek
+44  A: 

This question has been an active area of research in the last years. The main idea is to do a preprocessing on the graph once, to speed up all following queries. With this additional information itineraries can be computed very fast. Still, Dijkstra's Algorithm is the basis for all optimisations.

Arachnid described the usage of bidirectional search and edge pruning based on hierarchical information. These speedup techniques work quite well, but the most recent algorithms outperform these techniques by all means. With current algorithms a shortest paths can be computed in considerable less time than one millisecond on a continental road network. A fast implementation of the unmodified algorithm of Dijkstra needs about 10 seconds.

The article Engineering Fast Route Planning Algorithms gives an overview of the progress of research in that field. See the references of that paper for further information.

The fastest known algorithms do not use information about the hierarchical status of the road in the data, i.e. if it is a highway or a local road. Instead, they compute in a preprocessing step an own hierarchy that optimised to speed up route planning. This precomputation can then be used to prune the search: Far away from start and destination slow roads need not be considered during Dijkstra's Algorithm. The benefits are very good performance and a correctness guarantee for the result.

The first optimised route planning algorithms dealt only with static road networks, that means an edge in the graph has a fixed cost value. This not true in practice, since we want to take dynamic information like traffic jams or vehicle dependent restrictrions into account. Latest algorithms can also deal with such issues, but there are still problems to solve and the research is going on.

If you need the shortest path distances to compute a solution for the TSP, then you are probably interested in matrices that contain all distances between your sources and destinations. For this you could consider Computing Many-to-Many Shortest Paths Using Highway Hierarchies. Note, that this has been improved by newer approaches in the last 2 years.

SebastianK
Enlightening, indeed. What are the newer approaches you are mentioning ?
Tomas Pajonk
In particular Contraction Hierarchies. You can find more about it here: http://algo2.iti.kit.edu/routeplanning.php. There is also a google tech talk about it: http://www.youtube.com/watch?v=-0ErpE8tQbw
SebastianK
+2  A: 

I had some more thoughts on this:

1) Remember that maps represent a physical organization. Store the latitude/longitude of every intersection. You don't need to check much beyond the points that lie in the direction of your target. Only if you find yourself blocked do you need to go beyond this. If you store an overlay of superior connections you can limit it even more--you will normally never go across one of those in a way that goes away from your final destination.

2) Divide up the world into a whole bunch of zones defined by limited connectivity, define all connectivity points between the zones. Find what zones your source and target are in, for the start and end zone route from your location to each connection point, for the zones between simply map between connection points. (I suspect a lot of the latter is already pre-calculated.)

Note that zones can be smaller than a metropolitan area. Any city with terrain features that divide it up (say, a river) would be multiple zones.

Loren Pechtel
+9  A: 

Just addressing the triangle inequality violations, hopefully the extra factor they're optimising for is common sense. You don't necessarily want the shortest or fastest route, as it can lead to chaos and destruction. If you want your directions to prefer the major routes that are truck-friendly and can cope with having every sat-nav-following driver sent down them, you quickly discard the triangle inequality[1].

If Y is a narrow residential street between X and Z, you probably do only want to use the shortcut via Y if the user explicitly asks for X-Y-Z. If they ask for X-Z, they should stick to major roads even if it's a bit further and takes a bit longer. It's similar to Braess's paradox - if everyone tries to take the shortest, fastest route, the resulting congestion means that it's not the fastest route for anyone any more. From here we stray from graph theory into game theory.

[1] In fact, any hope that the distances produced will be a distance function in the mathematical sense dies when you allow one-way roads and lose the symmetry requirement. Losing the triangle inequality too is just rubbing salt in the wound.

stevemegson
+7  A: 

I've not worked on Google or Microsoft or Yahoo Maps before, so I can't tell you how they work.

However, I did architect a custom supply chain optimization system for an energy company which included a scheduling and routing application for their fleet of trucks. However, our criteria on routing was far more business-specific than where is construction or traffic slows or lane closures.

We employed a technique called ACO (Ant colony optimization) to schedule and route trucks. This technique is an AI technique that was applied to the traveling salesman problem to solve routing problems. The trick with ACO is to build an error calculation based upon known facts of the routing so that the graph solving model knows when to quit (when is the error small enough).

You can google ACO or TSP to find more on this technique. I've not used any of the open-source AI tools for this however, so cannot suggest one (though I heard SWARM was pretty comprehensive).

Bill
+2  A: 

Probably similar to the answer on pre-computed routes between major locations and layered maps, but my understanding is that in games, to speed up A*, you have a map that is very coarse for macro navigation, and a fine-grained map for navigation to the boundary of macro directions. So you have 2 small paths to calculate, and hence your search space is much much smaller than simply doing a single path to the destination. And if you're in the business of doing this a lot, you'd have a lot of that data pre-computed so at least part of the search is a search for pre-computed data, rather than a search for a path.

+4  A: 

Please take a look of this article: http://research.microsoft.com/en-us/news/features/shortestpath-070709.aspx

+1  A: 

I've done this quite a lot of times, actually, trying several different methods. Depending on the size (geographical) of the map, you might want to consider using the haversine function as a heuristic.

The best solution I've made was using A* with a straight line distance as a heuristic function. But then you need some sort of coordinates for each point (intersection or vertex) on the map. You can also try different weightings for the heuristic function, i.e.

f(n) = k*h(n) + g(n)

where k is some constant greater than 0.

Pål GD
+4  A: 

I am a little suprised to not see Floyd Warshall's algorithm mentioned here. This algorithm work's very much like Dijkstra's. It also has one very nice features which is it allows you to compute as long as you would like to continue allowing more intermediate vertices. So it will naturally find the routes which use interstates or highways fairly quickly.

dennisjtaylor
+2  A: 

Here's the world's fastest routing algorithms compared and proven for correctness:

http://algo2.iti.uka.de/schultes/hwy/schultes_diss.pdf

Here's a google tech talk on the subject:

http://www.youtube.com/watch?v=-0ErpE8tQbw

Here's a implementation of the highway-hierarchies algorithm as discussed by schultes (currently in berlin only, I'm writing the interface and a mobile version is being developed as well):

http://tom.mapsforge.org/

eikes