Reference: LeetCode
Difficulty: Medium
My Post: Java Solutions Backtracking (Memoization) + DP with Detailed Exp.
Problem
Given a non-empty string
s
and a dictionarywordDict
containing a list of non-empty words, determine ifs
can be segmented into a space-separated sequence of one or more dictionary words.
Note:
- The same word in the dictionary may be reused multiple times in the segmentation.
- You may assume the dictionary does not contain duplicate words.
Example:
1 | Input: s = "leetcode", wordDict = ["leet", "code"] |
Analysis
Backtracking
First, we put wordSet
into a hash set for quick contains
examination.
For each character S[depth]
, we consider substrings S[depth, i]
including S[depth]
, S[depth, depth + 1]
, …, S[depth, n - 1]
. If one of them is in wordDict
, we then redo the task on character S[i + 1]
; otherwise, return false
.
One of the most difficult part is to write the reject & accept
code. Think about it carefully, we notice that if the recursive call goes into a situation where n == s.length()
, it means the string s
can be successfully segmented; otherwise, it won’t go into that situation. We don’t write reject case in this backtracking function.
1 | public boolean wordBreak(String s, List<String> wordDict) { |
Time: $O(N^N)$ since each time it has at most $N$ choices and the depth (problem size) is $N$. (this is an upper bound
)
Space: $O(N)$ (string length and call stack depth)
Marked For the time complexity, if we count substring
operation ($O(N)$), it would be $O(N \times N^N) = O(N^(N + 1))$.
Backtracking (Memoization)
Let’s use an example to see if we can optimize the above method.
1 | // String: "abcde" | wordDict: ["a", ...] |
For the first substring "a"
, it is in wordDict
, so depth
becomes 1 and we would examine if "bcde"
is breakable.
In the process of checking if "bcde"
is breakable, we may check if "cde", "de", "e"
are breakable. Once we know the answers, we can cache them for future usage no matter they are true or false.
In future when we’ve done processing the first substring "a"
, we will examine "b"
, and you will see there could be a lot of repeated computation for "cde", "de", "e"
.
Difficulty: Using memoization in a backtracking-style recursive function is quite uncommon than other DP memoization. This is the new form I learned. There are three places that we need to set and get memo[]
.
Note: We use Boolean
since initially we want values to be null
.
1 | public boolean wordBreak(String s, List<String> wordDict) { |
Time: $O(N^2)$
Space: $O(N)$
DP
The idea is that given a problem(s) we can divide it into two subproblems s1
and s2
. If both of them are breakable, s
is breakable (by saying breakable I mean it satisfies the required conditions).
Note: Substring s(i, j)
(character i
to j
) in Java is denoted by s.substring(i, j + 1)
.
First, we define our dp[]
array, where dp[i]
is true
if the substring s(0, i - 1)
or s.substring(0, i)
is breakable; otherwise, it should be false
.
Then, we process string length from 1
to n
in dp[i]
. For each substring s(0, i - 1)
, we examine each combination of substrings s(0, j - 1)
(s.substring(0, j)
) and s(j, i - 1)
(s.substring(j, i)
). The first subproblem can be calculated directly by dp[j]
while the second one can be checked by wordDict
set. Question: Why don’t we break the second substring and examine it further? (e.g. abc
is not in wordDict
, but ab
and c
could be in wordDict
)
1 | // String: a b c d e f |
The question is: what happen if "cdef"
is not in wordDict while "cd"
and "ef"
are both in wordDict
?
It is handled previously! If "cd"
is in wordDict
, in the previously fourth round for substring "abcd"
, we would examine "ab"
and "cd"
. If dp("ab")
is true and "cd"
is in wordDict
, we would mark dp("abcd")
as true!
Then in the final round, dp("abcd")
is true and "ef"
is in wordDict
, so we have the whole string breakable.
Note: In addition, please think about the initialization step
and the break statement
in the code.
1 | public boolean wordBreak(String s, List<String> wordDict) { |
Time: $O(N^3)$ since substring
takes $O(N)$.
Space: $O(N)$
BFS
Go To: LeetCode Solution