# Kruskal's Minimum Spanning Tree

Discovered in 1956 by Joseph B. Kruskal. This algorithm has a close association with clustering algorithms. Clustering is a form of unsupervised learning that extracts patterns from unlabeled data. While Prim’s algorithm is a bit easier to implement, Kruskal’s algorithm has the added benefit of being able to calculate the MST for a graph that is too large to fit in a single memory space.

## Asymptotic Complexity

$O((m+n)\log n)$

## Pseudo Code

### Naive Implementation

The pseudo code below is a particularly slow implementation ($O(mn)$) of Kruskal’s algorithm that utilizes a graph search algorithm. However, this simple implementation makes the algorithm easy to understand.

G = connected, undirected graph
MST = empty graph (will contain edges in MST)
E = array to hold sorted edges

All add vertices in G to MST
Add all edges in G to E
Sort E by weight in ascending order

// see BFS and DFS in Graph Search section
creates_cycle:
G = graph to search
v = starting vertex
w = end vertex

return false
else
return true

for each edge in E:
// Once we have n-1 edges, the MST is complete
if MST->m == (G->n - 1):
break
continue to next edge

return MST


### Union-Find-Based Implementation

This version is more complex, but results in a considerably faster run time ($O((m+n)\log n)$)

Invariants:

• Union-Find (U) data structure represents all connected components that have been added to the MST (T)
G = connected, undirected graph
T = empty set (will contain edges in MST)
E = array to hold sorted edges
U = Empty Union-Find Data Structure

Initialize U with all vertices in G
Add all edges in G to E
Sort E by weight in ascending order

for each edge in E:
// Once we have n-1 edges, the MST is complete
if |T| == (G->n - 1):
break
// If a cycle is detected, move the next edge
if U->find e->tail == U->find e->head:
continue to next edge
else