views:

461

answers:

6

I've been tasked (coursework @ university) to implement a form of path-finding. Now, in-spec, I could just implement a brute force, since there's a limit on the number of nodes to search (begin, two in the middle, end), but I want to re-use this code and came to implement Dijkstra's algorithm.

I've seen the pseudo on Wikipedia and a friend wrote some for me as well, but it flat out doesn't make sense. The algorithm seems pretty simple and it's not a problem for me to understand it, but I just can't for the life of me visualize the code that would realize such a thing.

Any suggestions/tips?

Edit for some confusions:
Yes, there is a target node and a source node.
I'm looking to implement Dijkstra's in a general case, not the "only two intermediate stops" case, because I want to use the code again afterwards. Else, I'd just write a brute-force implementation.
The specific issue that I'm having a little trouble with is storing the sub-optimal half-formed paths, in case they may become optimal. When I'm visiting a given node, I just don't see how I'm going to update all the connections that go through it.
More edit:
Going through a couple of the answers now and having a go.

REALLY EDIT: I forgot to mention a serious complication, which is that any two vertices can have up to UINT_MAX different distances between them. Sorry. Infact, the fact that I forgot to deal with this is probably the cause of the damn problem in the first place, although the solution: pick the shortest is fortunately obvious to me. No wonder other people's pseudo for a distance variable didn't take into account my variable distance.

+1  A: 

The Wikipedia article link that I stuffed into the OP gives a very clear and concise description, along with animated graphics.

The key that may be missing(?) is that at each step of the algorithm you will possibly be updating the shortest path to each node connected to your "current" node. For your four-node "diamond" case, if A is the start and D is the end, first you compute distances to B and C, then from B you compute the distance from A to D then you do it also through C. If the path through C is shorter, then the "shortest path from A to D" is through C. If the path through C is longer, then the shortest path goes through B. This should obviously(?) extend to more complex networks.

On the OP's Really Edit: It doesn't matter how many connections you have between two nodes. Indeed, that's the point of the algorithm, check all of the connections through all of the possible paths. If node A and node B are connected by two roads, and you want the shortest road, don't worry about the longer road, just throw it away. Always try to discard data that you know is not relevant to your problem.

dash-tom-bang
+1  A: 

The first thing you need is a way of representing a graph. Typically this is a collection of vertex objects, which each contain an adjacency list. In c++ this could be a list of pointers to other vertices. Make sure that vertices are LessThanComparable. You could, for example, give each a unique ID number.

Then, the code on Wikipedia should make more sense. Every time you have pseudocode like dist[v], you want to use a map<VertexIdNumber, double>.

rlbond
+3  A: 

Here's a high level breakdown of Dijkstra's algorithm:

You stick all of the vertices in a priority queue where all of the vertices have a priority (distance) of infinity except for the source vertex, which has a distance of zero (the source vertex is zero units of distance away from itself, right?).

Pop the priority queue. The vertex removed is the vertex with the shortest distance from the source. Obviously the first vertex popped from the queue is the source. Well call that popped vertex v.

Look at each of the neighbors of v. All of them will have a distance greater than v's distance (otherwise we would have already popped them from the queue, right?). Let's say v has a distance of 3, and v has 3 neighbors x (dist-from-source: 5), y (dist-from-source: 10) and z (dist-from-source: infinity).

Now we look at each neighbors distance from v. Let's say they are: x - 3, y - 2, z - 4. This means that the path from the source to x that goes through v has a distance of |v| + 3 (3 + 3 = 6), y has a distance of 5 (3 + 2 = 5) and z has a distance of 7 (3 + 4 = 7).

The path to x through v is longer than the current shortest path to x so we ignore it. However the paths to y and z that go through v are shorter than the previous known shortest path so we update them.

You keep doing this until you have gone through all the vertices. At each point, when you pop the min from the priority queue you know you have found the shortest path to that vertex because any possible shorter path must pass through a vertex with a distance less than v's, which means we would have already popped that from the queue.

Niki Yoshiuchi
Maybe not obvious, but you don't have to go through all of the vertices. You just have to go through all of the vertices until you pop the target vertex off the pile. There may still be vertices in the queue that are further away from the source than the target vertex, but those vertices don't matter since there's obviously not going to be a shorter path through them!
dash-tom-bang
DeadMG didn't say that he had a target so I assumed that he wanted the shortest path for all vertices. If he is searching for the shortest path from source to target then you are correct, although in that case I would recomment A*.
Niki Yoshiuchi
Niki, DeadMG said there were four nodes: "begin, two in the middle, end." I think it's safe to assume the goal is the shortest path from begin to end, so the target is end.
Rob Kennedy
It wouldn't matter for the OP's specific case since there are only four nodes, but if he wants a general reusable algorithm, sticking all the vertices in the priority queue upfront won't scale. There's not much point anyway, since you're going to update them before you pop them - so it's easier to just insert them when first reached.
Peter
+2  A: 

Implementing Dijksta's Algorithm in C++ if you've never written anything like it is a pretty intense problem. Having read the Wikipedia page, here's a some ideas to get you started.

struct Vertex {
    int id;
    std::vector<int> neighbors;
    int dist_from_source;  // need some sentry value for "infinity"
    int prev;  // need some sentry value for "undefined"
};

This is based on the first few lines of pseudocode. A Graph could be std::vector<Vertex> along with some way to identify the distance between two vertices.

8     u := vertex in Q with smallest dist[]

This line indicates a need for std::make_heap (not priority_queue as we'll see later). Make a separate vector for this because we'll need to add and remove things from it. Look up how to provide a predicate that puts the Vertex with the lowest dist_from_source on the top of the heap.

12      for each neighbor v of u: // where v has not yet been removed from Q.

Here's why we don't use priority_queue for Q. You need to find out whether v is still in Q. A priority_queue doesn't let you do that.

13        alt := dist[u] + dist_between(u, v)

Now you need the distance function that came with Graph. You didn't say how the graph data is defined, so you're kinda on your own here.

17      return dist[]

This line just means return whatever data is necessary to produce the shortest paths. It's basically the set of Vertexes with all of them having prev and dist_from_source filled in.

Kristo
A: 

The specific issue that I'm having a little trouble with is storing the sub-optimal half-formed paths, in case they may become optimal. When I'm visiting a given node, I just don't see how I'm going to update all the connections that go through it.

I think maybe you're misunderstanding the algorithm a bit. Dijkstra works by exploring the nodes in increasing order of distance; so you are guaranteed to have found the minimum distance and optimal path to any node that has been permanently labelled.

You wouldn't normally store any paths explicitly when running the algorithm. Instead, consider that you are building a spanning tree on the graph - so there's only one way to get to each node on that tree. All you need to store against each node is a distance label and its parent. When you first see each node during the search, you'll label it tentatively; it's possible that later you'll find a better route to it - but all you have to update at that point are the distance and parent labels for that single node.

Once you permanently label your destination, you can stop, and then you can get the optimal route to it by walking up the parent labels back to the source.

Hope that helps :)

Peter
A: 

This response may be too late for you but I offer it in case it is of help to others.

Firstly you need to clarify whether

  1. the edges between nodes have different weights
  2. the graph is directed or undirected

It's been 25 years since I studied such algorithms at university, but from memory the Warshall algorithm is easier to implement by matrix method. You might look here: [http://www.ugrad.cs.ubc.ca/~cs490/Spring05/notes/graph_part3.pdf].