The topic of this blog is path finding using Python. I'll start with Dijkstra's shortest path first (SPF) algorithm and then follow up in a later blog with the A* algorithm. These algorithms find the shortest path in a traversal of a directed graph. The graph is a set of nodes, or vertices, which are interconnected with edges. The path along an edge has an assigned cost in terms of length or time of traversal. The cost depends on direction. The objective is to find the shortest path from the source node to the destination node.
For generality we can think of two different graphs. Figure 1 shows a graph with six vertices (nodes) connected with edges that can have arbitrary lengths. The number of edges at a given node can be 0 to N-1, where N is the number of nodes. In this example node 'A' is the start node and node 'F' is the destination. The edges are labeled with the 'cost' of that edge. This is drawn as a non-directed graph but the solution I show will address the directed case as well.
Figure 1 - General non-directed graph
Figure 2 shows another non-directed graph where the vertices are a uniform equally spaced grid. Each node has at most eight neighbors. The distance to a neighbor moving horizontally or vertically is unity and the distance to a diagonal neighbor is sqrt(2). In this configuration if a node is not accessible it can be labeled as not accessible. As example in Figure 2 the start cell is labeled '15' and the destination is cell '9'. Cells '12' and '17' are marked as inaccessible.
Figure 2 - Grid non-directed graph
Both algorithms can be applied to these type of graphs. The choice of the graph type is a use case decision. The general graph might be used for path planning on a map of cities interconnected with roadways. The grid graph might be used in path planning of a robot moving within a warehouse.
Dijkstra's Algorithm
Dijkstra's shortest path algorithm was developed in 1955 by Edsger Dijkstra and first published in 1959. It is well documented and described here as a background for the A* algorithm.
An explanation of the algorithm follows. Any two adjacent nodes (u,v) will have an edge with weight w(u,v). The algorithm begins by initializing three variables:
- dist() is an array of distances from the source node, s, to every other node in the graph. To start dist(s) is initialized to 0 and all other values are set to infinity. My implementation in Python used a dictionary (dict) referenced by node string instead of an array.
- Q is initialized as a queue of all nodes in the graph. At the end of the algorithm Q will be empty. My implementation in Python used a list of strings.
- S is initialized as an empty queue. At the end of the algorithm S will contain all nodes in the graph. My Python implementation used a list of strings.
The iterative algorithm follows:
- While Q is not empty, pop the node v, that is not already in S, from Q with the smallest dist(v). In the first pass the source node s will be chosen because the dist(s) was initialized to 0. In the next pass, the next node with the smallest  value is chosen.
- Add node v to S, to indicate that v has been visited.
- Update dist() values of adjacent nodes of the current node v as follows for each new adjacent node u:
if dist(v) + w(u,v)Â < dist(u), here is a new minimal distance found for u, so update dist(u) to the new minimal distance value, otherwise, no updates are made to dist(u).
Upon completion the algorithm has visited all nodes in the graph and found the smallest distance to each node from the start. The values are held in dist(). Once the graph is solved we can work backwards from the end and reveal the path taken from the start.
For v nodes the complexity of a complete graph is roughly O(v**2).
Python Code
The beginning of the code has comments that show the algorithm in pseudo code (lines 6-29). I tried to keep the actual Python code as close as possible to the pseudo code.
Listing 1 Pseudo Code
The full code listing is given below. The input is given as a list of edges with a cost or weight. Note ['A', 'C', 10] denotes travel from 'A' to 'C' with a cost of 10. For a non-directed graph we will add ['C', 'A', 10] to the edge list. In this way the code can be used for directed or non-directed graphs. Choosing data structures can make the problem solution clean and easy. For the Graph structure I use a dictionary with the keys being the node names for each node and the entries being the a dictionary of the nodes pointed to.
The heavy lifting is done in the function 'spf' (line 33) with inputs edges, start and stop.
Listing 1 Dijkstra_spf.py
The output window for example 1 shows the path. The path displays that the destination 'F' was reached from 'B' and 'B' was reached from the start at 'A'. The total length or weight is 23. The display weights shows the cost to get to each node from the start at 'A'. The Graph variable displays the network graph as a Python dictionary.
The code listing contains a second example that is a directed graph with different weights. The output window is shown below:
The path displays that 't' was reached from 'd' which was reached from 'b' starting from 's'. The total cost from 's' to 't' is 8. The output window also displays the graph dictionary.
Closing
As mentioned I will cover the A* Search algorithm in a subsequent blog in coming weeks. Please leave your comments.
Peter Hart, Nils Nilsson and Bertram Raphael of Stanford Research Institute (now SRI International) first published the algorithm in 1968.
you guys are f – osum