Discovered in 1956 by Joseph B. Kruskal. This algorithm has a close association with clustering algorithms. Clustering is a form of unsupervised learning that extracts patterns from unlabeled data. While Prim’s algorithm is a bit easier to implement, Kruskal’s algorithm has the added benefit of being able to calculate the MST for a graph that is too large to fit in a single memory space.
Asymptotic Complexity
$O((m+n)\log n)$
Pseudo Code
Naive Implementation
The pseudo code below is a particularly slow implementation ($O(mn)$) of Kruskal’s algorithm that utilizes a graph search algorithm. However, this simple implementation makes the algorithm easy to understand.
G = connected, undirected graph
MST = empty graph (will contain edges in MST)
E = array to hold sorted edges
All add vertices in G to MST
Add all edges in G to E
Sort E by weight in ascending order
// see BFS and DFS in Graph Search section
creates_cycle:
G = graph to search
v = starting vertex
w = end vertex
if graph search from v-> w == not found:
return false
else
return true
for each edge in E:
// Once we have n-1 edges, the MST is complete
if MST->m == (G->n - 1):
break
if creates_cycle MST e->tail e->head:
continue to next edge
add e to MST
return MST
Union-Find-Based Implementation
This version is more complex, but results in a considerably faster run time ($O((m+n)\log n)$)
Invariants:
- Union-Find (
U
) data structure represents all connected components that have been added to the MST (T
)
G = connected, undirected graph
T = empty set (will contain edges in MST)
E = array to hold sorted edges
U = Empty Union-Find Data Structure
Initialize U with all vertices in G
Add all edges in G to E
Sort E by weight in ascending order
for each edge in E:
// Once we have n-1 edges, the MST is complete
if |T| == (G->n - 1):
break
// If a cycle is detected, move the next edge
if U->find e->tail == U->find e->head:
continue to next edge
else
U->union e->head e->tail
add e to T
return T
Source Code
Relevant Files:
Click here for build and run instructions