Description
Given an input string (s) and a pattern (p), implement regular expression matching with support for ‘.’ and ‘*’.
|
|
Note:
- s could be empty and contains only lowercase letters a-z.
- p could be empty and contains only lowercase letters a-z, and characters like . or *.
Example :
|
|
Backtracking Solution
We define search(si, pi)
as the search function, which means if s[si:]
is matched by p[pi:]
.
First we need to check if p[pi]
is a wildcard: p[pi+1] == "*"
, and don’t forget the corner case: pi+1 < len(p)
- Not wildcard: Just check if
s[si]
andp[pi]
is same character:si < len(s) and s[si] == p[pi]
- If it is, search next level:
search(si+1, pi+1)
- If not, just return
False
, search ends here.
- If it is, search next level:
- Wildcard: Same as step 1, check if it’s the same character:
- If it is, that’s a valid wildcard match. But we could still choose to match or skip this wildcard:
search(si+1, pi) or search(si, pi+2)
- If not, the only choice is to skip this wildcard:
search(si, pi+2)
- If it is, that’s a valid wildcard match. But we could still choose to match or skip this wildcard:
|
|
Some conditional branches could be merged by and
operator.
|
|
- The code could be easily optimized using a cache to avoid duplicate calculation. See Optimize Recursion.
- Always pay attention to corner cases in recursion.
- Below is an example for matching
aab
andc*a*ab
, with the dicision tree of the backtracking search path.
|
|
Dyamic Programming
First we can construct state transision from the backtracking method. Jut let search(si, pi)
to be dp[si][pi]
|
|
- We have a
(len(s) + 1, len(p) +1)
matrix, the iteration order is described below. - Starting from
dp[len(s)][len(p)-1]
, we need to getdp[0][0]
- Set
dp[len(s)][len(p)] = True
, which indicates our backtracking terminal condition(find a match).
s\p | c | * | a | * | a | b | |
---|---|---|---|---|---|---|---|
a | end | F | |||||
a | F | ||||||
b | F | ||||||
start | T |
The core logic is pretty straightforward like backtracking solution:
|
|
Notes:
- This is similar to Distinct Subsequences