A Brief Review of Shortest Path Algorithms

Posted on 2023-07-24 In Algorithm

Shortest Path

Given a graph find the shortest path from source vertex to end vertex. \(|V|\) denotes the number of vertices. \(|E|\) denotes the number of edges.

Dijkstra's Algorithm

Steps

Initialize a shortest path tree (spt) to keep tack of the vertices included in the shortest path. spt is an empty set in the beginning.
Assign distance value to all vertices, initially as infinity. Set distance value of source vertex to 0.
Always add the vertex u with the minimum distance value to spt then update the distance value of all adjacent vertices of u.

To keep track of the path, create a parental array and update parent of each vertex each time when new vertex is added to spt.
If end vertex is added to spt, break the loop and return the distance value of end vertex.

Complexity

Time Complexity: \(O(|V|^2)\); Space Complexity: \(O(|V|)\).

More you should know about Dijkstra's Algorithm

Dijkstra's Algorithm is a greedy algorithm.
Proof of correctness of Dijkstra's Algorithm.

Distance value of vertex v is denoted as d[v]; Distance of the shortest path from s to v is denoted as delta(s, v)
1. Lemma 1: Shortest paths are composed of shortest paths.
2. Lemma 2: If s->\(\cdots\)->u->v is a shortest path from s to v, then after u is added to spt, d[v] is updated to delta(s, v) and will not change afterwards.
As per Dijkstra's Algorithm, we assert that d[u] = delta(s, u) for all u in spt
1. Assume u is the first vertex added to spt where d[u] > delta(s, u). There must exist a shortest path from s to u: s->\(\cdots\)->x->y->u. Assume x is within spt and y is the first vertex not in spt.
2. d[x] = delta(s, x) then update d[y]: d[y]=delta(s, y)<=delta(s, u)<=d[u].
3. If y is added to spt before u, then when u is added to spt: d[u]=d[y]+d(y, u)=delta(u). Contradiction
4. Else u is added to spt before y, then d[u]<=d[y]<=delta(s, u). Contradiction
Dijkstra's Algorithm supports directed graphs as well as undirected graphs.
Dijkstra's Algorithm does not work for graphs with negative weight edges.
With priority queue/heap, time complexity of Dijkstra's Algorithm can be reduced to \(O(|V| + |E|log|V|)\).
1. Graph should be presented as adjacency matrix
2. Use priority queue to keep track of vertex with minimum distance value. Since items in priority queue can not be modified, push multiple copies of vertices to priority queue.
3. It takes \(O(log|V|)\) to find the the vertex with minimum distance value then it takes \(O(|E|)\) to update all adjacent vertices.
If weights of all edges are same, Dijkstra's Algorithm is same with BFS (Breath-First Search).

Bellman-Ford Algorithm

The basic idea is: for a graph without negative cycle, the shortest path between any two vertices consist of at most V-1 edges.

Steps

Initialize a shortest path tree (spt) to keep tack of the vertices included in the shortest path. spt is an empty set in the beginning.
Assign distance value to all vertices, initially as infinity. Set distance value of source vertex to 0.
Do following V-1 times:

For each edge u->v, update d[v] = min(d[v], d[u]+weight of u->v).
Traverse every edge u->v again: if d[v] > d[u]+weight of u->v, there exists negative weight cycle.

Complexity

Time Complexity: \(O(|E||V|)\); Space Complexity: \(O(|V|)\).

More you should know about Bellman-Ford Algorithm

Proof of correctness of Bellman-Ford Algorithm

Correctness of Bellman-Ford Algorithm is relatively intuitive. The basic idea is to prove by induction: after iteration \(i\) of Bellman-Ford, \(d[v] <= min\{w(p)\}\) where \(p\) is the possible path from \(u\) to \(v\) with at most \(i\) edges.
Bellman-Ford Algorithm works well for negative weight graphs without negative cycle and suits well for distributed systems.
Bellman-Ford Algorithm does not apply for undirected graph. (Undirected graph can be viewed as directed graph)
If no distance value is updated in one iteration, Bellman-Ford Algorithm can end in advance.
An improvement of Bellman-Ford Algorithm: Shortest Path Faster Algorithm (SPFA)
1. Initialize a priority queue Q and push source vertex in it.
2. Assign distance value to all vertices, initially as infinity. Set distance value of source vertex to 0.
3. While Q is not empty:
  1. For each edge u->v: update d[v] = min(d[v], d[u]+weight of u->v) and if d[v] is updated and v not in Q then push v in Q.

Floyd Warshall Algorithm

This algorithm is aimed to find every pair of vertices in a weighted directed graph.

Steps

Initialize distance[v][u] to keep tack of the distance between v and u. Initially if there is a edgeu->v, set distance[v][u] to the weight of edge, else as infinity.

do following

for k in range(V):
	for i in range(V):
		for j in range(V):
			distance[i][j] = min(distance[i][j], distancep[i][k] + distance[k][j])

Complexity

Time Complexity: \(O(|V|^3)\); Space Complexity: \(O(|V|^2)\).

More you should know about Floyd Warshall Algorithm

The basic idea is to pick all vertices and update all shortest paths which include picked vertex as intermediate vertex.

Proof can be given via induction.
Floyd Warshall Algorithm works well for negative weight graphs without negative cycle and also suits well for distributed systems.
Floyd Warshall Algorithm does not apply for undirected graph. (Undirected graph can be viewed as directed graph)

A* Search Algorithm

A star algorithm is one of the most popular techniques used in path-finding especially games. Consider a square grid having obstacles and we want to reach to target cell from a given starting cell.

Let's introduce Heuristic Function of A* Algorithm: \[ f(n) = g(n)+h(n) \] \(f\) is the evaluation cost of node n; \(g\) is the movement cost from starting node to n; \(h\) is the estimated movement cost from n to goal node.

Steps

Initialize open list and close list. Put starting node on o and set f value of starting node to 0.
while open list is not empty:
1. Always find the node q with the least \(f\) and pop the node.
2. If q is the goal node, stop.
3. Else:
  1. Push q to close set
  2. Traverse all adjacent nodes m of q:
    1. Update \(f\); m.g = q.g + distance(q, m); m.h = distance(goal node, m)
    2. If m in closed list, skip
    3. If m in open list and m.g <= updated m.g, skip
    4. Else: update or push m to open list and set parent of m to q

Complexity

Complexity of A* Algorithm should be same with Dijkstra Algorithm (in the worst case).

But for A* Algorithm, measuring complexity in terms of \(|V|\) and \(|E|\) could be meaningless: \(|V|\) may be infinite and in most cases we won't examine the entire graph.

In the domain of AI, we measure complexity as:

Time Complexity: \(O(b^d)\) where \(b\) is the branching factor and \(d\) is the depth of goal node (typically the number of nodes on the resulting path.)

More you should know about A* Search Algorithm

A* Algorithm is a heuristic algorithm, starting with a good guess.
Performance of A* Algorithm largely depends on a good heuristic function.
1. If \(h(n)=0\), A* Algorithm is exactly same with Dijkstra Algorithm.
2. If \(h(n)\) is same with the \(distance(goal, n)\), great but time consuming.
3. If \(h(n)<distance(goal, n)\), A* Algorithm can always find the shortest path.
4. If \(h(n)>distance(goal, n)\), solution of A* Algorithm is not guaranteed to be the shortest path.
5. If \(h(n) >> distance(goal, n)\) which makes \(f(n)\) approximately equals to \(h(n)\), A* Algorithm will be same with Best-First Search.
A heuristic approximation way is used to calculate \(h(n)\):
1. If agent is just allowed to move in four directions only (up, down, left, right), use Manhattan Distance. \[ dx = abs(u.x-v.x) \\ dy = abs(u.y-v.y) \\ Manhatten(u, v) = D*(dx+dy) \] where \(D\) is the length of each node.
2. If agent is allowed to move in eight directions, use Diagonal Distance. \[ dx = abs(u.x-v.x) \\ dy = abs(u.y-v.y) \\ Diagonal(u, v) = D*(dx+dy) + (D2-2*D)*min(dx, dy) \] where \(D\) is the length of each node, \(D2\) is the diagonal distance of each node.
3. If agent is allowed to move in any direction, use Euclidean Distance. \[ Euclidean(u, v) = D*\sqrt{(u.x-v.x)^2+(u.y-v.y)^2} \] where \(D\) is the length of each node.
Correctness of A* Algorithm. Only provide an intuitive proof here.

If the heuristic function used by A* Algorithm is admissible then A* Algorithm is guaranteed to return an optimal solution.

Suppose the distance of the shortest path from u to v is delta(u, v) and p is a node in the shortest path and q is another node selected via A* algorithm but not in the shortest path.

The key idea is \(h(n)\) will be an accurate approximation when n is close enough to goal node:

\(f(q) = g(q)+h(q) > g(goal)\) and \(f(p) = g(p)+h(p) <= g(p)+distance(goal, p) = g(goal) < f(q)\)