views:

825

answers:

7

Problem: I have a large collection of points. Each of these points has a list with references to other points with the distance between them already calculated and stored. I need to determine the shortest route that begins from an origin and passes through a specific number of points to any destination.

Ex: I'm on vacation and I'm staying in a specific city. I'm making a ONE WAY trip to see ANY four cities and I want to travel the least distance possible. I cannot visit the same city more than once.

Current solution: Right now I'm just iterating through every possibility manually and storing the shortest path. This works but feels inefficient. Also, this problem will eventually be expanded to include searching from multiple origin points to multiple destination points, so I think that might explode the search space.

What is the better way to search for the shortest route?

A: 

This sounds Travelling Salesman-esque? One solution is to use an optimisation technique such as an evolutionary algorithm. Currently you are doing an exhaustive search, which will get very slow very quickly. But I think this is pretty much a travelling salesman problem and it has been tackled for several decades if not centuries, and such there are several possible ways of attack. Google is your friend.

zenna
its not TSP since you can pass in the same point more then once.
Shay Erlichmen
I thought the TSP involved knowing every point that needed to be touched? In my case there's no specific points that need to be included, just any points that follow the shortest path from an origin.
Chris Douglass
@Shay: there are lots of flavors of TSP, including those which allow a user to visit the same vertex multiple times.
Juliet
+2  A: 

In generally you should to strict bad variants... I think you should use some variations of Branch_and_bound method http://en.wikipedia.org/wiki/Branch%5Fand%5Fbound

Max
++ Yes. This is a tree-search optimization problem, so branch-and-bound applies to it. It can be done either breadth-first or depth-first.
Mike Dunlavey
+1  A: 

Either bredth first search as norheim.se said or Dijkstra's algorithm would be my suggestion as well.

Jeffrey Lee
+2  A: 

Answering to the updated post, your solution of checking every possibility is optimal (at least, noone has discovered better algorithms so far). Yes, that's a travelling salesman, whose essense is not touching every city, but touching every city once. If you don't want to search for best solution possible, you may find it useful to use heuristics that work faster, but allow for limited discrepancy from ideal solution.


For future answerers: Floyd-Warshall algorithm and all Floyd-like variations are inapplicable here.

Pavel Shved
Fun fact: O(N^5) > O(N!) up to N = 8 :)
Juliet
Thanks. I've optimized it somewhat by finding the immediate solution of finding the shortest individual routes all of the way down, then moving up one level and searching, then moving up another level (etc.) I guess that's a depth first search?
Chris Douglass
That can be called depth-first *traversal*, not a *search*. *Search* aims visiting each *node*, while your algorithm aims visiting each *path* and is therefore more costly.
Pavel Shved
There is no known polynomial time algorithm. However, this does not imply that checking every possibility is optimal. There are quite sophisticated algorithms for the traveling salesman problem and related problems that allow to solve large problems with e.g thousands of cities
Accipitridae
A: 

It appears that the edges of your graph are bidirectional. In this case, the algorithm you're looking for is Dijkstra's algorithm.

James Jones
+1  A: 
Martin Beckett
A: 

Perhaps this is what the original poster means by "iterating through each possibility manually and storing the shortest path", but I thought I'd like to make explicit what appears to be a baseline solution.

Assume you already have a two-point shortest path algorithm--this has classical solutions for various kinds of graphs. Assume all distances are nonnegative and d(A->B->C) = d(A->B) + d(B->C).

The essentials are that the path starts at S goes through one of intermediate cities "abcd" and ends with E:

e.g. SabcdE, SacbdE, etc...

With only 4 intermediate cities, you enumerate all 24 permutations. For each permutation use your shortest two-point algorithm to compute the path from head to tail, and its total distance.

Then given the start and ending point, there are 12 possibilities attaching to one of abcd and for each two possibilities for the interior. You've computed these distances already, so you add on the distance from S to the head and the tail to E. Choose minimum. So once you've precomputed the intermediate distances for a fixed set of interior cities you need to do 12 two point shortest path problems for any pair of start and end points.

This obviously scales poorly with increasing number of intermediate cities. It's not clear to me that it could do better unless you impose greater restrictions on the graph structure (is this in a physical Euclidenan space? Triangle inequality?).

My thought example: suppose all intermediate distances between cities are O(1). With no restriction on the graph, then the distance from S to any intermediate city might be 1000 except for one being 1. Same for the tail. So you can force the first city to be visited to be anything. Now, go one layer down, take the first city as the "start point". Apply the same argument: you can make the best path go to any of the following cities by manipulating the distances in the graph.

So it seems that the complexity can't be helped without additional assumptions.

Matt Kennel