Reference: LeetCode EPI 15.9
Difficulty: Medium

My Post: Summary of All Solutions in Java with Explanations

Problem

Given $n$, how many structurally unique BST’s (binary search trees) that store values 1 ... n?

Example:

1
2
3
4
5
6
7
8
9
10
Input: 3
Output: 5
Explanation:
Given n = 3, there are a total of 5 unique BST's:

1 3 3 2 1
\ / / / \ \
3 2 1 1 3 2
/ / \ \
2 1 2 3

Analysis

DP (Recursion)

Reference: LeetCode Solution

Given a sequence $1, 2, \ldots, n$, we enumerate each number i in the sequence and take it as the root to form binary trees.

We define two functions:

  • $G(n)$: the number of unique BST for a sequence of length n (number of nodes).
  • $F(i, n)$: the number of unique BST, where the number i (1 <= i <= n) is the root.

We construct $G(n)$ by the sum of $F(i, n)$:

$$G(n) = \sum^n_{i=1}F(i, n) = F(1, n) + F(2, n) + \ldots + F(n, n)$$

Notice that when we select i as a root i.e. $F(i, n)$, we have i - 1 nodes which can be used to form a left subtree; similarly we have n - i nodes to form a right subtree.

$$F(i,n) = G(i - 1) \times G(n - i)$$

Thus, $F(i, n)$ can be calculated by the product of the number of unique BST with i - 1 nodes and the number of unique BST with n - i nodes. Uniqueness is guaranteed by the sizes of the left subtree and the right subtree.

Particularly, consider two base cases when i = 1 and i = 2:

  • i = 1: $F(1, n) = G(0) \times G(n - 1)$. The empty left subtree is still a subtree, so $G(0) = 1$.
  • i = 2: $F(2, n) = G(1) \times G(n - 2)$. With one node we can only construct one unique left subtree, so $G(1) = 1$.

Finally, we have the recurrence:

$$G(n) = \sum^n_{i=1}F(i,n) = \sum^n_{i=1} G(i - 1) \times G(n - i)$$
$$\text{where}\ \ G(0)=1, G(1)=1$$

Here is the code without memoization.

1
2
3
4
5
6
7
8
9
10
11
// numTress(n) is G(n)
public int numTrees(int n) {
if (n == 0 || n == 1) {
return 1;
}
int sum = 0;
for (int i = 1; i <= n; ++i) {
sum += numTrees(i - 1) * numTrees(n - i);
}
return sum;
}

Here is the recurrence tree:

1
2
3
4
5
6
7
                               G(4)
/ | | \
G(0)G(3) G(1)G(2) G(2)G(1) G(3)G(0) // 4
/ | \
G(0)G(2) G(1)G(1) G(2)G(0) // 4 x 3
/ \
G(0)G(1) G(1)G(0) // base case // 4 x 3 x 2

$C(N) = N \times N!$

Time: It is at most bounded by $O(N \times N!)$. A tighter bound would be Catalan number times $N$ since we’ve done $N$ times, which is $N \times G_N = O(N\times\frac{4^N}{N^{3/2}}) = O(\frac{4^N}{N^{1/2}})$.
Space: $O(N)$

Examining the recurrence carefully, we find that there are repeated calculations.

1
2
3
4
5
6
7
8
G(3) = G(0) x G(2)  // i = 1
= G(1) x G(1) // i = 2
= G(2) x G(0) // i = 3

G(4) = G(0) x G(3) // i = 1
= G(1) x G(2) // i = 2
= G(2) x G(1) // i = 3
= G(3) x G(0) // i = 4

Therefore, we can use a hash map or an integer array to store calculated $G(n)$. Here is the hash map version.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
Map<Integer, Integer> map = new HashMap<>();
public int numTrees(int n) {
if (n == 0 || n == 1) {
return 1;
}
int sum = 0;
for (int i = 1; i <= n; ++i) {
if (!map.containsKey(i - 1)) {
map.put(i - 1, numTrees(i - 1));
}
if (!map.containsKey(n - i)) {
map.put(n - i, numTrees(n - i));
}
sum += map.get(i - 1) * map.get(n - i);
}
return sum;
}

Here is the recurrence tree:

1
2
3
4
5
6
7
                               G(4)
/ | | \
G(0)G(3) G(1)G(2) G(2)G(1) G(3)G(0)
/ | \
G(0)G(2) G(1)G(1) G(2)G(0)
/ \
G(0)G(1) G(1)G(0) // base case

Note: Without memoization, the time complexity is upper bounded by $O(N \times N!)$.

By calculating the leftmost nodes, we have $G(0), G(1), \ldots, G(N)$, which takes $O(N)$ time. Besides, we have to do product computations at each level, which takes $2 + 3 + 4 + \ldots + N = O(N^2)$ time in total.

Time: $O(N^2)$
Space: $O(N)$ because of call stacks.

DP (Iteration)

By observation, we can construct our solution by a bottom-up approach.

The recurrence formula: $G(n) = \sum^n_i G(i - 1) \times G(n - i)$

For example, $G(4) = G(0) \times G(3) + G(1) \times G(2) + G(2) \times G(1) + G(3) \times G(0)$

1
2
3
4
5
6
G(0) G(1)
// With these initially two base-case values, we can calculate G(2)
G(0) G(1) G(2)
// With these three values we have G(3)
G(0) G(1) G(2) G(3)
// ...

Note: Notice that it is G[i - j] instead of G[n - j] and it is j <= i instead of j <= n.

1
2
3
4
5
6
7
8
9
10
public int numTrees(int n) {
int[] G = new int[n + 1];
G[0] = 1; G[1] = 1; // init
for (int i = 2; i <= n; ++i) {
for (int j = 1; j <= i; ++j) { // sum
G[i] += G[j - 1] * G[i - j];
}
}
return G[n];
}

Time: $O(N^2)$
Space: $O(N)$

Mathematical Deduction

The sequence of $G(n)$ function results is known as Catelan number $C_n$.

$$C_0 = 1,\ \ \ C_{n+1} = \frac{2(2n+1)}{n+2}C_n$$

Note: Use long type.

1
C = 2 * (2 * i + 1) / (i + 2) * C;

If we put C at the end of the statement, the result is not correct. Do all multiplications first! For example, when i = 2 and C_2 = 2, we would have:

  • C = 2 * (2 * 2 + 1) / (2 + 2) * C_2 = 2 * (5) / 4 * 2 = 10 / 4 * 2 = 4
  • instead of C = 2 * 10 / 4 = 5.
1
2
3
4
5
6
7
public int numTrees(int n) {
long C = 1;
for (int i = 0; i < n; ++i) { // i stops at n - 1
C = C * 2 * (2 * i + 1) / (i + 2); // calculate C_i+1
}
return (int) C;
}

Time: $O(N)$
Space: $O(1)$