The substitution method is a technique for solving recurrences. It works in two steps:
- Guess the solution
- Use mathematical induction to verify the guess
This can work very well, especially if we already have some intuition about the problem. Let’s start with a simple example: The Tower of Hanoi. In this classic puzzle, we have three pegs and a number of disks of different sizes which can slide onto any peg. The puzzle starts with the disks in a neat stack in ascending order of size on one peg, with the smallest disk on top. The objective is to move the entire stack to another peg, obeying the following rules:
- Only one disk can be moved at a time
- Each move consists of taking the top disk from one of the stacks and placing it on top of another stack
- No disk may be placed on top of a smaller disk
An algorithm to solve the puzzle goes like this:
- Move \(n-1\) disks from peg 1 to peg 2 using peg 3 as a temporary holding area
- Move the $n$th disk from peg 1 to peg 3
- Move the \(n-1\) disks from peg 2 to peg 3 using peg 1 as a temporary holding area
The number of moves required to solve the Tower of Hanoi puzzle is given by the recurrence relation \(T(n) = 2T(n-1) + 1\) with the initial condition \(T(1) = 1\). We can solve this recurrence using the substitution method.
Our hypothesis might be that \(T(n) \leq c2^n - 1\) for all \(n \geq n_0\), where \(c > 0\) and \(n_0 > 0\). For the base case, we have \(T(1) = c * 2^1 - 1 = c\), so \(c = 1\). Now we need to show that \(T(n) \leq c2^n - 1\) for all \(n \geq n_0\). Then we have:
\begin{align*} T(n) &\leq 2T(n-1) + 1 \\ &\leq 2\left(2^{n-1} - 1\right) + 1 \\ &= 2^n - 2 + 1 \\ &= 2^n - 1 \\ \end{align*}
What if we made a bad guess? Let’s try \(T(n) \leq cn\) for all \(n \geq n_0\). We have \(T(1) = c = 1\), so \(c = 1\). Now we need to show that \(T(n) \leq cn\) for all \(n \geq n_0\). We assume that \(T(k) \leq ck\) for all \(k < n\). Then we have:
\begin{align*} T(n) &\leq 2T(n-1) + 1 \\ &\leq 2c(n-1) + 1 \\ &= 2cn - 2c + 1 \\ \end{align*}
This does not work because \(2cn - 2c + 1 > cn\) for all \(c > 1\). Therefore, our guess was wrong.
Example from CLRS
Determine an asymptotic upper bound for
\[ T(n) = 2T(\lfloor n/2 \rfloor) + \Theta(n). \]
Guess: \(T(n) = O(n \lg n)\)
Inductive hypothesis: \(T(n) \leq cn \lg n\) for all \(n \geq n_0\).
Inductive step: Assume \(T(n) \leq cn \lg n\) for all \(n_0 \leq k < n\). For \(T(\lfloor n/2 \rfloor) \leq c\lfloor n/2 \rfloor \lg \lfloor n/2 \rfloor\), it holds when \(n \geq 2\).
\begin{align*} T(n) &\leq 2T(\lfloor n/2 \rfloor) + \Theta(n) \\ &\leq 2c\lfloor n/2 \rfloor \lg \lfloor n/2 \rfloor + \Theta(n) \\ &= cn \lg (n / 2) + \Theta(n) \\ &= cn \lg n - cn\lg 2 + \Theta(n) \\ &= cn \lg n - cn + \Theta(n) \\ &\leq cn \lg n \end{align*}
Making the Wrong Guess
What if we took the same recurrence and guessed that \(T(n) = O(n)\)?
Guess: \(T(n) = O(n)\)
Inductive hypothesis: \(T(n) \leq cn\) for all \(n \geq n_0\).
Inductive step: Assume \(T(n) \leq cn\) for all \(n \geq n_0\).
\begin{align*} T(n) &\leq 2c\lfloor n/2 \rfloor + \Theta(n) \\ &\leq cn + \Theta(n) \\ \end{align*}
This does not work because \(cn + \Theta(n) > cn\). Therefore, our guess was wrong.