A Brief Review of Shortest Path Algorithms
Shortest Path
Given a graph find the shortest path from source vertex to end vertex. \(|V|\) denotes the number of vertices. \(|E|\) denotes the number of edges.
Dijkstra's Algorithm
Steps
Initialize a shortest path tree (
spt
) to keep tack of the vertices included in the shortest path.spt
is an empty set in the beginning.Assign distance value to all vertices, initially as infinity. Set distance value of source vertex to 0.
Always add the vertex
u
with the minimum distance value tospt
then update the distance value of all adjacent vertices ofu
.To keep track of the path, create a parental array and update parent of each vertex each time when new vertex is added to
spt
.If end vertex is added to
spt
, break the loop and return the distance value of end vertex.
Complexity
Time Complexity: \(O(|V|^2)\); Space Complexity: \(O(|V|)\).
More you should know about Dijkstra's Algorithm
Dijkstra's Algorithm is a greedy algorithm.
Proof of correctness of Dijkstra's Algorithm.
Distance value of vertex
v
is denoted asd[v]
; Distance of the shortest path froms
tov
is denoted asdelta(s, v)
- Lemma 1: Shortest paths are composed of shortest paths.
- Lemma 2: If
s
->\(\cdots\)->u
->v
is a shortest path froms
tov
, then afteru
is added tospt
,d[v]
is updated todelta(s, v)
and will not change afterwards.
As per Dijkstra's Algorithm, we assert that
d[u] = delta(s, u)
for allu
inspt
- Assume
u
is the first vertex added tospt
whered[u] > delta(s, u)
. There must exist a shortest path froms
tou
:s
->\(\cdots\)->x
->y
->u
. Assumex
is withinspt
andy
is the first vertex not inspt
. d[x] = delta(s, x)
then updated[y]
:d[y]=delta(s, y)<=delta(s, u)<=d[u]
.- If
y
is added tospt
beforeu
, then whenu
is added tospt
:d[u]=d[y]+d(y, u)=delta(u)
. Contradiction - Else
u
is added tospt
beforey
, thend[u]<=d[y]<=delta(s, u)
. Contradiction
Dijkstra's Algorithm supports directed graphs as well as undirected graphs.
Dijkstra's Algorithm does not work for graphs with negative weight edges.
With priority queue/heap, time complexity of Dijkstra's Algorithm can be reduced to \(O(|V| + |E|log|V|)\).
- Graph should be presented as adjacency matrix
- Use priority queue to keep track of vertex with minimum distance value. Since items in priority queue can not be modified, push multiple copies of vertices to priority queue.
- It takes \(O(log|V|)\) to find the the vertex with minimum distance value then it takes \(O(|E|)\) to update all adjacent vertices.
If weights of all edges are same, Dijkstra's Algorithm is same with BFS (Breath-First Search).
Bellman-Ford Algorithm
The basic idea is: for a graph without negative cycle, the shortest
path between any two vertices consist of at most V-1
edges.
Steps
Initialize a shortest path tree (
spt
) to keep tack of the vertices included in the shortest path.spt
is an empty set in the beginning.Assign distance value to all vertices, initially as infinity. Set distance value of source vertex to 0.
Do following
V-1
times:For each edge
u
->v
, updated[v] = min(d[v], d[u]+weight of u->v)
.Traverse every edge
u
->v
again: ifd[v] > d[u]+weight of u->v
, there exists negative weight cycle.
Complexity
Time Complexity: \(O(|E||V|)\); Space Complexity: \(O(|V|)\).
More you should know about Bellman-Ford Algorithm
Proof of correctness of Bellman-Ford Algorithm
Correctness of Bellman-Ford Algorithm is relatively intuitive. The basic idea is to prove by induction: after iteration \(i\) of Bellman-Ford, \(d[v] <= min\{w(p)\}\) where \(p\) is the possible path from \(u\) to \(v\) with at most \(i\) edges.
Bellman-Ford Algorithm works well for negative weight graphs without negative cycle and suits well for distributed systems.
Bellman-Ford Algorithm does not apply for undirected graph. (Undirected graph can be viewed as directed graph)
If no distance value is updated in one iteration, Bellman-Ford Algorithm can end in advance.
An improvement of Bellman-Ford Algorithm: Shortest Path Faster Algorithm (SPFA)
- Initialize a priority queue
Q
and push source vertex in it. - Assign distance value to all vertices, initially as infinity. Set distance value of source vertex to 0.
- While
Q
is not empty:- For each edge
u
->v
: updated[v] = min(d[v], d[u]+weight of u->v)
and ifd[v]
is updated andv
not inQ
then pushv
inQ
.
- For each edge
- Initialize a priority queue
Floyd Warshall Algorithm
This algorithm is aimed to find every pair of vertices in a weighted directed graph.
Steps
Initialize
distance[v][u]
to keep tack of the distance betweenv
andu
. Initially if there is a edgeu
->v
, setdistance[v][u]
to the weight of edge, else as infinity.do following
1
2
3
4for k in range(V):
for i in range(V):
for j in range(V):
distance[i][j] = min(distance[i][j], distancep[i][k] + distance[k][j])
Complexity
Time Complexity: \(O(|V|^3)\); Space Complexity: \(O(|V|^2)\).
More you should know about Floyd Warshall Algorithm
The basic idea is to pick all vertices and update all shortest paths which include picked vertex as intermediate vertex.
Proof can be given via induction.
Floyd Warshall Algorithm works well for negative weight graphs without negative cycle and also suits well for distributed systems.
Floyd Warshall Algorithm does not apply for undirected graph. (Undirected graph can be viewed as directed graph)
A* Search Algorithm
A star algorithm is one of the most popular techniques used in path-finding especially games. Consider a square grid having obstacles and we want to reach to target cell from a given starting cell.
Let's introduce Heuristic Function of A* Algorithm:
\[
f(n) = g(n)+h(n)
\] \(f\) is the
evaluation cost of node n
; \(g\) is the movement cost from starting node
to n
; \(h\) is the
estimated movement cost from n
to goal node.
Steps
- Initialize open list and close list. Put starting
node on
o
and setf
value of starting node to 0. - while open list is not empty:
- Always find the node
q
with the least \(f\) and pop the node. - If
q
is the goal node, stop. - Else:
- Push
q
to close set - Traverse all adjacent nodes
m
ofq
:- Update \(f\);
m.g = q.g + distance(q, m)
;m.h = distance(goal node, m)
- If
m
in closed list, skip - If
m
in open list andm.g <= updated m.g
, skip - Else: update or push
m
to open list and set parent ofm
toq
- Update \(f\);
- Push
- Always find the node
Complexity
Complexity of A* Algorithm should be same with Dijkstra Algorithm (in the worst case).
But for A* Algorithm, measuring complexity in terms of \(|V|\) and \(|E|\) could be meaningless: \(|V|\) may be infinite and in most cases we won't examine the entire graph.
In the domain of AI, we measure complexity as:
Time Complexity: \(O(b^d)\) where \(b\) is the branching factor and \(d\) is the depth of goal node (typically the number of nodes on the resulting path.)
More you should know about A* Search Algorithm
A* Algorithm is a heuristic algorithm, starting with a good guess.
Performance of A* Algorithm largely depends on a good heuristic function.
- If \(h(n)=0\), A* Algorithm is exactly same with Dijkstra Algorithm.
- If \(h(n)\) is same with the \(distance(goal, n)\), great but time consuming.
- If \(h(n)<distance(goal, n)\), A* Algorithm can always find the shortest path.
- If \(h(n)>distance(goal, n)\), solution of A* Algorithm is not guaranteed to be the shortest path.
- If \(h(n) >> distance(goal, n)\) which makes \(f(n)\) approximately equals to \(h(n)\), A* Algorithm will be same with Best-First Search.
A heuristic approximation way is used to calculate \(h(n)\):
If agent is just allowed to move in four directions only (up, down, left, right), use Manhattan Distance. \[ dx = abs(u.x-v.x) \\ dy = abs(u.y-v.y) \\ Manhatten(u, v) = D*(dx+dy) \] where \(D\) is the length of each node.
If agent is allowed to move in eight directions, use Diagonal Distance. \[ dx = abs(u.x-v.x) \\ dy = abs(u.y-v.y) \\ Diagonal(u, v) = D*(dx+dy) + (D2-2*D)*min(dx, dy) \] where \(D\) is the length of each node, \(D2\) is the diagonal distance of each node.
If agent is allowed to move in any direction, use Euclidean Distance. \[ Euclidean(u, v) = D*\sqrt{(u.x-v.x)^2+(u.y-v.y)^2} \] where \(D\) is the length of each node.
Correctness of A* Algorithm. Only provide an intuitive proof here.
If the heuristic function used by A* Algorithm is admissible then A* Algorithm is guaranteed to return an optimal solution.
Suppose the distance of the shortest path from
u
tov
isdelta(u, v)
andp
is a node in the shortest path andq
is another node selected via A* algorithm but not in the shortest path.The key idea is \(h(n)\) will be an accurate approximation when
n
is close enough to goal node:\(f(q) = g(q)+h(q) > g(goal)\) and \(f(p) = g(p)+h(p) <= g(p)+distance(goal, p) = g(goal) < f(q)\)