Prerequisites (click to view)
graph LR
ALG(["fas:fatrophy Red Black Tree fas:fatrophy #160;"])
ASY_ANA(["fas:facheck Asymptotic Analysis #160;"])
click ASY_ANA "./mathasymptoticanalysis"
BT(["fas:facheck Binary Tree #160;"])
click BT "./listdatastructbinarytree"
SBT(["fas:facheck Self Balancing Binary Tree #160;"])
click SBT "./listdatastructselfbaltree"
ASY_ANA>ALG
BT>SBT>ALG
Redblack trees are one of the many strains of Self Balancing Binary Trees. It’s not practical to cover all variations owing to the breadth of topic. However, redblack balancing techniques are representative of the whole. A conception of this data structure will equip one with the mental tools necessary to pursue other alternatives.
A redblack tree is an enhanced binary search tree designed to approximately minimize the height of the tree. The changes required to transform a binary search tree into a redblack tree are strictly additive. That is to say, a redblack tree is a standard binary search tree that maintains additional invariants. Before diving into these invariants, it’s prudent to define what the colors red and black denote.
As one might infer from the name, every redblack tree node has a red or black color designation. This is often confusing upon first encounter because it’s nonsensical to literally color a tree node. However, red or black is nothing more than a binary state designation^{1}. It’s possible to replace red with pirate and black with ninja and achieve the same result. The intent is to store a bit on every node indicating one of two possible states. To solidify this concept, the image below illustrates the memory layout of a node from a redblack tree of 64bit integers^{2}.
History
The Xerox PARC company is famous for many groundbreaking inventions^{3}, the most notable of which is the Graphical User Interface (GUI). It’s not surprising that many algorithms also find their origins there. Two Xerox employees, Robert Sedgewick and Leonidas J. Guibas, published a paper in 1978 entitled A Dichromatic Framework for Balanced Trees^{4} which introduced redblack trees to the world. To borrow an aphorism from Issac Newton, Sedgewick an Guibas were “standing on the shoulders of giants”. They attribute much of their work to Rudolf Bayer’s 1972 paper entitled Symmetric binary BTrees: Data structure and maintenance algorithms^{5}.
RedBlack Invariants
As mentioned above, a redblack tree is a standard binary search tree that maintains additional invariants, four to be exact. They are listed below.
 Every node is designated as either red or black
 The root node is always black
 Red nodes must have black children and parents^{6}
 Every path starting at the root node and ending in a
NULL
pointer^{7} must pass through the same number of black nodes
It’s not entirely obvious how maintaining these invariants minimizes tree height. Consider the simplest possible case: a tree with three nodes. It’s not possible to arrange the nodes in such a way that it is both unbalanced and also a valid redblack tree as illustrated below^{8}.
Every insert/delete operation has the potential to invalidate the invariants; therefore, they must be evaluated and restored. The sections below outline the exact algorithms is more detail, but at a high level invariants are restored via a series of recolor and rotation procedures. As an example, consider the insert operation depicted below.
As the number of nodes increase, maintaining the invariants does NOT guarantee perfect balance. This may seems like a oversight but it’s actually a feature.
Close Enough Balance
An interesting feature of redblack trees is that they do not attempt to maintain perfect balance. For instance, consider the two valid redblack trees depicted below. Both trees contain the exact same values and satisfy all invariants yet the bottom tree is more balanced than the top.
The relaxed balancing constraint results in slightly less efficient search operations and considerably more efficient insert/delete operations owning to fewer requite rotations. In practice, the search performance deficit in negligible. A perfectly balanced tree has $\text{height} = \log_2{n}$ while redblack trees guarantee $\text{height} \leq 2\log_2{n + 1}$. To provide a sense of the height difference, the table below compares the two functions.
Tree size $n$  $\log_2{n}$  $2\log_2{n+1}$ 

10  4  7 
100  7  14 
1000  10  20 
10000  14  27 
100000  17  34 
1000000  20  40 
10000000  24  47 
Stated a bit more intuitively, the maximum difference between the depth of any two leaves is one for a perfectly balanced tree and two for a red black tree. Next, we’ll examine an insert operation.
Insert Operations
One could examine the pseudo or source code to understand redblack tree insert operations. However, they are a bit abstruse without some additional commentary. At a high level, the insertion algorithm has four steps:
 Insert using the Binary Tree insert algorithm
 Color the new node red
 If the new node’s parent is black, no invariants are broken so terminate
 Restore redblack tree invariants
The first three steps are selfexplanatory while the fourth is fairly complex; therefore, it is the focus for the remainder of this section. Restoring the redblack invariants is a recursive process starting at the inserted node. For each node the algorithm encounters, there are three possible scenarios:
 The parent node is black
 The uncle node is red
 The uncle node is black or
NULL
The actions taken for each scenario are outlined below.
 The parent node is black
 if root node is red, recolor it to black
 terminate
 The uncle node is red
 recolor parent node black
 recolor uncle node black
 recolor grandparent node red
 recurse on grandparent node
 The uncle node is black or
NULL
 parent node is a left branch
 node is a right branch
 rotate parent left
 recurse on parent
 node is a left branch
 change parent node color to black
 change grandparent node color to red
 rotate grandparent node right
 recurse on node
 node is a right branch
 parent node is a right branch or
NULL
 node is a left branch
 rotate parent node right
 recurse on parent node
 node is a right branch
 change parent node color to black
 change grandparent node color to red
 rotate grandparent node left
 recurse on node
 node is a left branch
 parent node is a left branch
Don’t worry if this seems a bit overwhelming. It’s normal to review redblack tree insertions several times before comprehension sets in. It’s also difficult to fathom without concrete examples. The image below steps through the process graphically.
In order to keep these concepts as accessible as possible, a few arcane details were omitted. However, the pseudo code as well as accompanying source code are rigorously complete and should be studied assiduously. With a firm conceptual understanding, the code should be much more accessible.
Applications
A redblack tree is direct replacement for any standard binary search tree; therefore, they have the same applications. Below are a few notable instances of redblack trees.
 The Linux Completely Fair Scheduler (CFS)
 C++ STL Class map
Pseudo Code
As mentioned above, the changes required to transform a binary search tree into a redblack tree are strictly additive. All of the binary tree pseudo code as well as the rotation pseudo code apply here and is not repeated. The additive code responsible for maintaining the invariants after an insert operation is shown below. The delete operation is an exercise.
\begin{algorithm}
\caption{RedBlack Tree}
\begin{algorithmic}
\REQUIRE T = redblack tree
\REQUIRE $\text{new_node}$ = node to insert into the tree
\REQUIRE $\text{NULL_NODE}$ = global variable representing a black leaf node (see implementation details)
\FUNCTION{insert}{T, $\text{new_node}$}
\STATE \CALL{binaryTreeInsert}{T, $\text{new_node}$}
\STATE $\text{new_node}$.left $\gets \text{NULL_NODE}$
\STATE $\text{new_node}$.right $\gets \text{NULL_NODE}$
\STATE $\text{new_node}$.color $\gets$ RED
\STATE
\STATE node $\gets \text{new_node}$
\WHILE{node.parent.color = RED}
\IF{node.parent = node.grandparent.left}
\STATE uncle $\gets$ node.grandparent.right
\IF{uncle.color = RED}
\STATE uncle.color $\gets$ BLACK
\STATE node.parent.color $\gets$ BLACK
\STATE node.grandparent.color $\gets$ RED
\STATE node $\gets$ node.grandparent
\ELSEIF{node = node.parent.right}
\STATE node $\gets$ node.parent
\STATE \CALL{rotateLeft}{node}
\ELSE
\STATE node.parent.color = BLACK
\STATE node.grandparent.color = RED
\STATE \CALL{rotateRight}{node.grandparent}
\ENDIF
\ELSE
\STATE uncle $\gets$ node.grandparent.left
\IF{uncle.color = RED}
\STATE uncle.color $\gets$ BLACK
\STATE node.parent.color $\gets$ BLACK
\STATE node.grandparent.color $\gets$ RED
\STATE node $\gets$ node.grandparent
\ELSEIF{node = node.parent.left}
\STATE node $\gets$ node.parent
\STATE \CALL{rotateRight}{node}
\ELSE
\STATE node.parent.color $\gets$ BLACK
\STATE node.grandparent.color $\gets$ RED
\STATE \CALL{rotateLeft}{node}
\ENDIF
\ENDIF
\ENDWHILE
\STATE
\STATE T.root.color $\gets$ BLACK
\ENDFUNCTION
\end{algorithmic}
\end{algorithm}
Implementation Details
Typically, trees terminate with a
NULL
pointer. Replacing them with a BLACK node representing aNULL
value simplifies the logic. Without this, every node reference (left, right, parent, grandparent, uncle) must have branches to accommodateNULL
values. The pseudo code above, as well as the supplied source code, implement a fifth invariant: allNULL
pointers are BLACK nodes.An added optimization is to create a single global
NULL
node to avoid reallocating every requisiteNULL
node. This is depicted in the image below.
Asymptotic Complexity
Please refer to Binary Tree Asymptotic Complexity. The additive changes required to transform a binary tree to a redblack tree, while not free, do not change the asymptotic complexity. It’s common to question how is it that rebalancing does not increase the complexity of the insert operation. The balance algorithm is $O(\log_2{n})$. Adding this to the insert operation yields $O(\log_2{n} + \log_2{n})$ or more simply $O(2\log_2{n})$. Eliminating the constants results in $O(\log_2{n})$. The search operation is similar, while it’s technically $O(2\log_2{n + 1})$, removing the constants leave us with $O(\log_2{n})$.
Advantages
 Binary Tree: Because a redblack tree is essentially an enhanced binary tree, it has the same advantages.
 Search: Guaranteed to remain balanced regardless of insertion order.
Disadvantages
 Binary Tree: Because a redblack tree is essentially an enhanced binary tree, it has the same disadvantages.
 Complexity: Added complexity.
Source Code
Relevant Files:
Click here for build and run instructions
Exercises

Answer True or False for each of the following statements:
a. Redblack tree invariants ensure that a tree is perfectly balanced.
b. A binary tree can be transformed into a redblack without modifying existing code.
c. The colors red and black are intrinsic to the data structure.Answers (click to expand)
 False: Redblack trees invariants guarantee approximate balance where $\text{height} \leq 2\log_2{n + 1}$.
 True: The changes required to transform a binary tree to a redblack tree are strictly additive.
 False: Any binary state designation would suffice.

Identify the violated invariant, if any, in each of the following redblack trees.
Answers (click to expand)
 The forth invariant, same number of black per path, is violated. There are three black nodes on the path from the `6` to the `5` while there are only two black nodes on every other path.
 The third invariant, no consecutive red nodes, is violated. `5` is a red node and it has a red child.
 The second invariant, root is black, is violated.
 There are no invariant violations: this is a valid redblack tree.

Specify the steps required to insert the value 40 into the redblack tree depicted below.
Answers (click to expand)

Specify the pseudo code for rebalancing a redblack tree after a delete operation^{9}.
For formatting purposes, the answer to question 4 is located here.

The inventors, Robert Sedgewick and Leonidas J. Guibas, choose red and black to designate nodes for no other reason than that seventiesera printers produced ascetically pleasing prints using those colors. ↩

Modern machines optimize data structures by aligning values on word boundaries resulting in padding (sometimes referred to as slack) as illustrated in the graphic. While this consumes more space, it provides a notable speed optimization. The details are beyond the scope of this work. Please consult section 3.9.3 of Computer Systems: A Programmer’s Perspective (3rd Edition) by Randal E. Bryant and David R. O’Hallaron for a full treatment of the topic. ↩

…or discoveries depending on your philosophical predilection. Do scientific concepts exist before they are documented? If you believe they spring into existence in the human mind, then they are inventions. If you believe they are transcendent truths, they are discoveries. In practice, the distinction isn’t significant. ↩

Available online at https://ieeexplore.ieee.org/document/4567957 ↩

Available online at https://link.springer.com/article/10.1007%2FBF00289509 ↩

Otherwise stated, consecutive red nodes are prohibited. ↩

The path that traverse operations take through the tree when searching for a value not in the tree. ↩

The image does not provide an exhaustive account of every invalid variation, the reader is invited to extrapolate. ↩

This question is intentionally difficult. Do not be discouraged if you are unable to come up with a correct answer even after many hours. The intent is to force the reader into deep contemplation. This is the only way to truly master the content. As is typical in life, the journey is more valuable than reaching the destination. Be patient, and try to come up with a solution before looking at the answer. ↩