The full power of regular expressions and finite state machines are not needed to solve this problem. As an alternative there is a relatively simple dynamic programming solution.
Let match(i, j) be 1 if it is possible to match the the sub-string string[i..n-1] with the sub-pattern pattern[j, m - 1], where n and m are the lengths of string and pattern respectively. Otherwise let match(i, j) be 0.
The base cases are:
match(n, m) = 1, you can match an empty string with an empty pattern;
match(i, m) = 0, you can't match a non-empty string with an empty pattern;
The transition is divided into 3 cases depending on whether the current sub-pattern starts with a character followed by a '*', or a character followed by a '?' or just starts with a character with no special symbol after it.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int is_match(char* pattern, char* string)
{
int n = strlen(string);
int m = strlen(pattern);
int i, j;
int **match;
match = (int **) malloc((n + 1) * sizeof(int *));
for(i = 0; i <= n; i++) {
match[i] = (int *) malloc((m + 1) * sizeof(int));
}
for(i = n; i >= 0; i--) {
for(j = m; j >= 0; j--) {
if(i == n && j == m) {
match[i][j] = 1;
}
else if(i < n && j == m) {
match[i][j] = 0;
}
else {
match[i][j] = 0;
if(pattern[j + 1] == '*') {
if(match[i][j + 2]) match[i][j] = 1;
if(i < n && pattern[j] == string[i] && match[i + 1][j]) match[i][j] = 1;
}
else if(pattern[j + 1] == '?') {
if(match[i][j + 2]) match[i][j] = 1;
if(i < n && pattern[j] == string[i] && match[i + 1][j + 2]) match[i][j] = 1;
}
else if(i < n && pattern[j] == string[i] && match[i + 1][j + 1]) {
match[i][j] = 1;
}
}
}
}
int result = match[0][0];
for(i = 0; i <= n; i++) {
free(match[i]);
}
free(match);
return result;
}
int main(void)
{
printf("is_match(dummy, dummy) = %d\n", is_match("dummy","dummy"));
printf("is_match(dumm?y, dummy) = %d\n", is_match("dumm?y","dummy"));
printf("is_match(dum?y, dummy) = %d\n", is_match("dum?y","dummy"));
printf("is_match(dum*y, dummy) = %d\n", is_match("dum*y","dummy"));
system("pause");
return 0;
}
The time complexity of this approach is O(n * m). The memory complexity is also O(n * m) but with a simple modification can be reduced to O(m).