Kruskal's Minimum Spanning Tree

Discovered in 1956 by Joseph B. Kruskal. This algorithm has a close association with clustering algorithms. Clustering is a form of unsupervised learning that extracts patterns from unlabeled data. While Prim’s algorithm is a bit easier to implement, Kruskal’s algorithm has the added benefit of being able to calculate the MST for a graph that is too large to fit in a single memory space.

Asymptotic Complexity

$O((m+n)\log n)$

Pseudo Code

Naive Implementation

The pseudo code below is a particularly slow implementation ($O(mn)$) of Kruskal’s algorithm that utilizes a graph search algorithm. However, this simple implementation makes the algorithm easy to understand.

G = connected, undirected graph
MST = empty graph (will contain edges in MST)
E = array to hold sorted edges

All add vertices in G to MST
Add all edges in G to E
Sort E by weight in ascending order

// see BFS and DFS in Graph Search section
creates_cycle:
    G = graph to search
    v = starting vertex
    w = end vertex

    if graph search from v-> w == not found:
        return false
    else
        return true

for each edge in E:
    // Once we have n-1 edges, the MST is complete
    if MST->m == (G->n - 1):
        break
    if creates_cycle MST e->tail e->head:
        continue to next edge
    add e to MST

return MST

Union-Find-Based Implementation

This version is more complex, but results in a considerably faster run time ($O((m+n)\log n)$)

Invariants:

Union-Find (U) data structure represents all connected components that have been added to the MST (T)

G = connected, undirected graph
T = empty set (will contain edges in MST)
E = array to hold sorted edges
U = Empty Union-Find Data Structure

Initialize U with all vertices in G
Add all edges in G to E
Sort E by weight in ascending order

for each edge in E:
    // Once we have n-1 edges, the MST is complete
    if |T| == (G->n - 1):
        break
    // If a cycle is detected, move the next edge
    if U->find e->tail == U->find e->head:
        continue to next edge
    else
        U->union e->head e->tail
        add e to T

return T

Source Code

Full Repo

Relevant Files:

Click here for build and run instructions