Skip to main content

Binary Search Trees

TL;DR

Binary Search Trees maintain left < parent < right invariant enabling O(log n) average search, insert, and delete. Inorder traversal yields sorted order. Balanced BSTs (AVL, Red-Black) guarantee O(log n) worst-case via rotations. Unbalanced BSTs degrade to O(n) linked lists. Key operations: search recursively, insert at leaf maintaining property, delete by replacing with successor/predecessor. Understanding BST structure reveals why balanced variants matter for guaranteed performance.

Core Concepts

BST Property and Validation

BST Property: For every node, all left descendants < node < all right descendants. Critical: less-than must hold for entire subtree, not just children.

Validation recursively checks property. For each node, verify that all left subtree values < node.val and all right subtree values > node.val. Track min/max bounds during traversal.

Inorder traversal of valid BST yields sorted sequence. This property enables efficient sorted queries without additional sorting.

Uniqueness: Multiple BSTs can contain same data in different structures. Balanced vs unbalanced trees demonstrate this.

BST Operations: Search, Insert, Delete

Search recursively narrows search space. At node, if target < node, go left; if target > node, go right; if equal, found. O(log n) average on balanced tree, O(n) worst case on skewed tree.

Insert starts at root, recursively finds position where new node belongs. Inserted as leaf. Updates parent's pointer. Maintains BST property without reordering existing nodes.

Delete handles three cases:

  1. Leaf: simply remove
  2. One child: replace node with child
  3. Two children: find inorder successor (minimum of right subtree), replace node's value with successor's value, delete successor

Successor/Predecessor: inorder successor is minimum value in right subtree; inorder predecessor is maximum value in left subtree. Used for deletion and range queries.

Balancing and Tree Rotations

Balanced tree maintains height O(log n) through rotations. AVL trees self-balance via single/double rotations. Red-Black trees use color-based balancing.

Left rotation moves right child up; parent becomes left child. Opposite for right rotation. Rotations preserve BST property.

Why balance matters: unbalanced insertion (sorted sequence into empty BST) creates linked list structure, degrading search to O(n). Balancing prevents this.

Key Algorithms and Techniques

Compare target to node, recurse left or right.

Insertion Maintaining Property

Find leaf position recursively, insert there.

Deletion with Successor Replacement

For two-child case, find successor, swap values, recurse delete on right subtree.

Inorder Traversal for Sorted Output

Left, node, right visits in sorted order.

Practical Example

class BST:
def __init__(self, val=None):
self.val = val
self.left = None
self.right = None

def search(self, target):
if self.val is None:
return False
if target == self.val:
return True
if target < self.val:
return self.left.search(target) if self.left else False
return self.right.search(target) if self.right else False

def insert(self, val):
if self.val is None:
self.val = val
return
if val < self.val:
if self.left:
self.left.insert(val)
else:
self.left = BST(val)
else:
if self.right:
self.right.insert(val)
else:
self.right = BST(val)

def find_min(self):
return self.val if not self.left else self.left.find_min()

def find_max(self):
return self.val if not self.right else self.right.find_max()

def inorder(self, result=None):
if result is None:
result = []
if self.val is not None:
if self.left:
self.left.inorder(result)
result.append(self.val)
if self.right:
self.right.inorder(result)
return result

def delete(self, val):
if self.val is None:
return None
if val < self.val:
if self.left:
self.left = self.left.delete(val)
elif val > self.val:
if self.right:
self.right = self.right.delete(val)
else: # Found node to delete
if not self.left:
return self.right
if not self.right:
return self.left
# Two children: find successor
self.val = self.right.find_min()
self.right = self.right.delete(self.val)
return self

# Test
bst = BST()
for val in [5, 3, 7, 2, 4, 6, 8]:
bst.insert(val)
print(f"Inorder: {bst.inorder()}")
print(f"Search 4: {bst.search(4)}")
print(f"Min: {bst.find_min()}, Max: {bst.find_max()}")

Common Pitfalls

warning

Validation missing entire subtree check: Must verify all descendants < parent, not just children. Use min/max tracking.

Inorder successor confusion: Successor is minimum of right subtree, not right child. Missing this causes incorrect deletion.

Not maintaining BST property on insert: Inserting at wrong position breaks property. Always recurse left for < or right for ≥.

Forgetting pointer updates on deletion: Deleting node means updating parent's pointer to deleted node's child. Missing this disconnects tree.

Assuming BSTs are balanced: O(log n) is average case. Sorted insertions create linked lists. Use balanced variants (AVL, RB) for guaranteed O(log n).

Advanced BST Topics

Self-Balancing Trees: AVL Trees

AVL trees maintain balance through rotations:

class AVLNode:
def __init__(self, val):
self.val = val
self.left = None
self.right = None
self.height = 1

class AVLTree:
def insert(self, node, val):
"""Insert with automatic balancing."""
if not node:
return AVLNode(val)

if val < node.val:
node.left = self.insert(node.left, val)
else:
node.right = self.insert(node.right, val)

# Update height
node.height = 1 + max(self.height(node.left), self.height(node.right))

# Check balance factor
balance = self.balance_factor(node)

# Left heavy
if balance > 1:
if val < node.left.val:
return self.rotate_right(node)
else:
node.left = self.rotate_left(node.left)
return self.rotate_right(node)

# Right heavy
if balance < -1:
if val > node.right.val:
return self.rotate_left(node)
else:
node.right = self.rotate_right(node.right)
return self.rotate_left(node)

return node

def rotate_right(self, y):
"""Right rotation: y becomes right child of x."""
x = y.left
y.left = x.right
x.right = y
y.height = 1 + max(self.height(y.left), self.height(y.right))
x.height = 1 + max(self.height(x.left), self.height(x.right))
return x

def rotate_left(self, x):
"""Left rotation: x becomes left child of y."""
y = x.right
x.right = y.left
y.left = x
x.height = 1 + max(self.height(x.left), self.height(x.right))
y.height = 1 + max(self.height(y.left), self.height(y.right))
return y

def balance_factor(self, node):
if not node:
return 0
return self.height(node.left) - self.height(node.right)

def height(self, node):
if not node:
return 0
return node.height

BST Use Cases: Range Queries

Find all elements between two values:

def range_query(root, low, high):
"""Find all values in range [low, high]."""
result = []

def inorder_range(node):
if not node:
return

# Explore left if node value > low
if node.val > low:
inorder_range(node.left)

# Include if in range
if low <= node.val <= high:
result.append(node.val)

# Explore right if node value < high
if node.val < high:
inorder_range(node.right)

inorder_range(root)
return result

# Example: Find all values between 30 and 70
# 50
# / \
# 30 70
# / \ / \
# 20 40 60 80
#
# range_query(root, 30, 70) → [30, 40, 50, 60, 70]

K-th Smallest Element

Find the k-th smallest efficiently:

class BSTWithRank:
"""BST that tracks subtree sizes for O(log n) k-th element."""

def __init__(self, val):
self.val = val
self.left = None
self.right = None
self.size = 1 # Size of subtree

def insert(self, val):
"""Insert and maintain size."""
if val < self.val:
if not self.left:
self.left = BSTWithRank(val)
else:
self.left.insert(val)
else:
if not self.right:
self.right = BSTWithRank(val)
else:
self.right.insert(val)
self.size += 1

def kth_smallest(self, k):
"""Find k-th smallest element in O(log n)."""
# How many elements in left subtree?
left_size = self.left.size if self.left else 0

if k <= left_size:
# k-th is in left subtree
return self.left.kth_smallest(k)
elif k == left_size + 1:
# Current node is k-th
return self.val
else:
# k-th is in right subtree
return self.right.kth_smallest(k - left_size - 1)

# Build tree
bst = BSTWithRank(50)
for val in [30, 70, 20, 40, 60, 80]:
bst.insert(val)

print(bst.kth_smallest(3)) # 40 (sorted: 20, 30, 40, 50, 60, 70, 80)

Comparing BST Variants

                 Insertion   Search   Deletion   Space
Unbalanced BST: O(log n)* O(log n)* O(log n)* O(n)
O(n) worst O(n) worst O(n) worst
AVL Tree: O(log n) O(log n) O(log n) O(n)
(rotations)
Red-Black Tree: O(log n) O(log n) O(log n) O(n)
(color rebalancing)
B-Tree (databases): O(log n) O(log n) O(log n) O(n)
(optimized for disk)

*: Average case. Worst case is O(n) for unbalanced.

Practical Performance Analysis

For 1 million elements:

Unbalanced BST (worst case, sorted insertion):
- Search: ~1,000,000 comparisons
- Memory cache misses: High (pointer chasing)

Balanced BST (AVL, Red-Black):
- Search: ~20 comparisons
- Memory cache misses: Medium (still tree structure)

Hash Table:
- Search: ~1 comparison (average)
- Memory cache misses: High (hash computation)

B-Tree (database index):
- Search: ~3-4 disk accesses
- Memory cache misses: Low (branching factor ~100)

Self-Check

  1. Trace insertion sequence into empty BST. Why does order matter for balance? Order determines tree shape. Sorted insertion creates skewed tree (O(n) search); random insertion creates balanced tree (O(log n) search).

  2. Explain inorder successor finding. Why specifically minimum of right subtree? Inorder successor is smallest element larger than current. For two-child deletion, replacing with successor maintains BST property.

  3. How does BST property enable search efficiency? When does this break? BST property eliminates half of remaining elements per comparison (binary search). Breaks when tree is unbalanced (skewed like linked list).

  4. When should you use AVL vs. Red-Black trees? AVL is stricter (more balanced, fewer rotations needed); Red-Black is more practical (fewer rebalancing operations). Both O(log n) worst-case.

  5. How do you find k-th smallest efficiently? Augment each node with subtree size. Compare k with left subtree size to navigate directly. O(log n) instead of O(n).

One Takeaway

info

Binary search trees elegantly organize data for efficient searching through recursive division. Inorder traversal yields sorted order directly. Understanding deletion via successor replacement and why balancing matters reveals the depth of tree algorithm design. For production systems, use self-balancing variants (AVL, Red-Black, B-Trees) to guarantee O(log n) performance.

References

  • Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press.
  • McDowell, G. L. (2015). Cracking the Coding Interview (6th ed.). CareerCup.