Prerequisites (click to view)
graph LR
ALG(["fas:fatrophy Self Balancing Binary Tree fas:fatrophy #160;"])
ASY_ANA(["fas:facheck Asymptotic Analysis #160;"])
click ASY_ANA "./mathasymptoticanalysis"
BT(["fas:facheck Binary Tree #160;"])
click BT "./listdatastructbinarytree"
ASY_ANA>ALG
BT>ALG
The previous section on Binary Trees took a few liberties concerning asymptotic complexity. The purpose was to meter the amount of new concepts introduced; however, with the introductory concepts out of the way it’s time to correct the oversight. While most operations were advertised as $O(\log_{2}n)$, in reality, this is only true if the tree is perfectly balanced. A balanced tree has an equal number of children to the left and right of each node^{1} as illustrated in the binary search tree of integers below. A more accurate measure of a tree’s asymptotic complexity is derived from the tree’s height.
History
After the invention of binary trees, it didn’t take long for selfbalancing trees to evolve. Two Russian mathematicians, Georgy AdelsonVelsky and Evgenii Landis, published a paper entitled An Algorithm for the Organization of Information in 1962^{2} which outlined a technique for maintaining balance. The resulting data structure was named AVL tree after the initials of the inventors’ last names. AVL trees are still commonly used today.
Tree Height
A more accurate measure of a tree’s asymptotic complexity is derived from the tree’s height. The height of a tree is the number of nodes from the root to the most distant leaf. See the image below for a graphical reference.
Using the previously defined search operation, calculate the worst case number of comparisons required to locate a node in both trees shown above. Take a few minutes to come up with an answer before reading on.
Even though the balanced tree has three times more items, there are fewer requite comparisons (4 for the balanced tree and 5 for the unbalanced tree). The disparity compounds with more nodes. Stated more rigorously, the search operation has a runtime of $O(\text{height})$. Although this seems in opposition to the assertion that search is an $O(\log_{2}n)$ operation for a balanced search tree, consider the maximum height. Notice how the bottom row of nodes has exactly twice as many items as the row above it. This patterns continues all the way to the root. A number repeatedly divided by 2 is $\log_2$. Therefore, the height of a balanced search tree is $\log_2{n}$.
Conversely, the height of an completely unbalanced tree is $n$. $O(\text{height})$ works out to logarithmic ($O(\log_{2}n)$) for a balanced tree and linear ($O(n)$) for an unbalanced tree. As a reminder of the significant difference between the two complexity classes, see the comparison table below.
Input size $n$  $O(n)$ linear  $O(log_2{n})$ logarithmic 

10  10  4 
100  100  7 
1000  1000  10 
10000  10000  14 
100000  100000  17 
1000000  1000000  20 
10000000  10000000  24 
Perusal of the binary search tree pseudo code reveals that all the operations have a traversal mechanism similar to search. This means that all operations become linear for an unbalanced tree. As is hopefully clear by now, balance is a major concern with trees.
Maintaining Balance
Luckily, there are many welldefined tree data structures with builtin selfbalance mechanisms. These are know as self balancing search trees. Examples of such trees include, but are not limited to:
 AVL trees
 Redblack trees
 Splay trees
 AA trees
 Btrees
 23 trees
 234 trees
 Scapegoat trees
 Treaps
 WeightBalanced trees
 etc…
Many of the examples listed above are simply enhanced binary trees that cleverly rearrange nodes in order to maintain balance. However, some are more exotic, such as Btrees which accommodate many keys per node. Each variation has slightly different runtime characteristics and utility. It’s not practical to cover all variations owing to the breadth of the topic. One could write a book solely on tree data structures. In the event that the reader has a performance critical application for trees^{3} it’s highly recommended that they delve deeper into this topic.
This begs the question, if there are trees capable of self balancing, why would anyone use a standard Binary Tree? The answer is that there is overhead associated with maintaining balance in the form of both complexity and clock cycles^{4}. Self balancing ensures optimal tree traversals at the cost of less than optimal delete and insert operations^{5}. Therefore, the ratio of insert/delete to traversal operations is a consideration. In cases where data is inserted in random order, statistically, a standard binary tree will be balanced regardless^{6}. As with most cases in computer science, the particular circumstances determine the optimal solution. There is no one right answer.
The next obvious question is how exactly do self balancing trees maintain balance? While each implementation is unique, rotations are a prominent technique.
Rotations
Conceptually, a rotation is a procedure for rewriting node pointers so that a parent node trades places with one of it’s children in order to to reduce the overall height of the tree. A left rotation swaps the parent with its right sibling while a right rotation swaps the parent with its left sibling. Rotations are difficult to put in writing but easy to represent graphically. Consider the simplest possible rotation scenario: a tree with only three nodes as depicted below.
At a minimum, a rotation consists of rewriting two pointers. For instance, the left rotation shown above sets the rotate node’s right pointer to NULL and changes the right child’s left pointer to the address of the rotated node^{7}. Of course, things get more complicated when there are child nodes to contend with. Consider a left rotation as illustrated below.
Take a minute to examine the image. In order to rotate the root node (50) to the left, three things must happen:
 The parent of the rotation node (50) becomes the parent of the right node (75). In this case, NULL.
 The rotation node’s (50) right pointer is updated to the right node’s left pointer
 The right node’s (75) left pointer changes to the rotation node (50)
A similar process occurs when rotating right. In this example, a branch node (75) is rotated. Consult the image below.
The steps for right rotation are similar to that of the left rotation.
 The parent (50) of the rotation node (75) becomes the parent of the left node (61).
 The rotation node’s (75) left pointer is updated to the left node’s right pointer (NULL)
 The left node’s (61) right pointer changes to the rotation node (75)
Self balancing isn’t entirely intuitive and it takes time to internalize the concept. The reader is highly encouraged to master rotations before advancing as it is assumed knowledge in the next section on redblack trees. As previously stated, a though treatment of selfbalancing implementations is out of scope, However, redblack trees are a representative sample of self balancing implementations. They are popular and the concepts are easily translatable to other tree variations.
Exercises

Assuming you have a binary search tree with 1074 nodes, what is height given:
a. A selfbalancing tree having items inserted in sequential order
b. A nonselfbalancing tree having items inserted in random order
c. A nonselfbalancing tree having items inserted in sequential orderAnswers (click to expand)
 The tree will remain balanced so the height is $\log_{2}1074 \approx10.06877$ which rounds up to $11$.
 Inserting nodes in random order statitically produces a balanced tree. The height will be close to $\log_{2}1074 \approx10.06877$ which rounds up to $11$.
 Inserting items sequentially with no balancing mechanism produces a tree where all the left or right pointers are NULL. The height is equal to the number of nodes, in this case 1074.

Calculate the height of a nonselfbalancing binary search tree of integers given that nodes are inserted in the order specified below. The first node added to the tree becomes the root.
a. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
b. 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1
c. 8, 4, 2, 1, 3, 6, 5, 7, 12, 10, 9, 11, 14, 13, 15
d. 12, 4, 1, 2, 3, 6, 5, 11, 7, 10, 8, 9, 15, 14, 13Answers (click to expand)

Starting with the binary search tree of integers depicted in the image below, describe a single rotation operation that will reduce the tree’s height to 3.
Answers (click to expand)
Left rotate the 61 node.

Define pseudo code for both a left and right binary tree node rotation. Assume the functions accept an existing binary tree node and the binary tree is tracking parent pointers^{8}^{9}.
For formatting purposes, the answers for question 4 are located here.

More accurately, it’s not possible to reduce the height of a balanced tree. This concept is explained in the succeeding section. ↩

Available online at https://zhjwpku.com/assets/pdf/AED210avlpaper.pdf ↩

For example, implementing a database. ↩

Recall the Principal of Parsimony ↩

Many selfbalancing trees only add only constant time to insert and delete operations rending an identical asymptotic complexity. However, recall from Real World Asymptotic Complexity that constants do matter. Evaluate your use case carefully to determine if it matters for your particular context. ↩

Assuming a uniform distribution ↩

Depending on the implementation, parent nodes may also need to be rewired along with the pointer that specifies root. ↩

The purpose of parent pointers is explained on the Binary Tree page. This exercise is meant to reinforce that knowledge. ↩

Don’t fret if the answer isn’t entirely obvious; a neophyte can expect to invest upwards of three hours in a complete solution. This is by design. The only way to fully comprehend rotations is via experimentation and contemplation. A mountain of prose wont make the concept any more clear.
Many find it helpful to draw a visual representation of the operation before specifying pseudo code. Instead of starting from scratch, try adding parent pointers to the supplied drawings. ↩