Discovered in 1956 by Joseph B. Kruskal. This algorithm has a close association with clustering algorithms. Clustering is a form of unsupervised learning that extracts patterns from unlabeled data. While Prim’s algorithm is a bit easier to implement, Kruskal’s algorithm has the added benefit of being able to calculate the MST for a graph that is too large to fit in a single memory space.
The pseudo code below is a particularly slow implementation ($O(mn)$) of Kruskal’s algorithm that utilizes a graph search algorithm. However, this simple implementation makes the algorithm easy to understand.
G = connected, undirected graph MST = empty graph (will contain edges in MST) E = array to hold sorted edges All add vertices in G to MST Add all edges in G to E Sort E by weight in ascending order // see BFS and DFS in Graph Search section creates_cycle: G = graph to search v = starting vertex w = end vertex if graph search from v-> w == not found: return false else return true for each edge in E: // Once we have n-1 edges, the MST is complete if MST->m == (G->n - 1): break if creates_cycle MST e->tail e->head: continue to next edge add e to MST return MST
This version is more complex, but results in a considerably faster run time ($O((m+n)\log n)$)
- Union-Find (
U) data structure represents all connected components that have been added to the MST (
G = connected, undirected graph T = empty set (will contain edges in MST) E = array to hold sorted edges U = Empty Union-Find Data Structure Initialize U with all vertices in G Add all edges in G to E Sort E by weight in ascending order for each edge in E: // Once we have n-1 edges, the MST is complete if |T| == (G->n - 1): break // If a cycle is detected, move the next edge if U->find e->tail == U->find e->head: continue to next edge else U->union e->head e->tail add e to T return T
Click here for build and run instructions