As part of a project I have assigned myself as a way of improving my knowledge of F# and functional programming in general, I am attempting to write a string pattern-matching algorithm from scratch without using any loops or variables (or regular expressions, or String.Replace and friends). As this is purely a learning project, I'm not interested in the best possible way to do it, just the best functional way to do it.
I'm trying to write a function that accepts a wildcard character, a pattern string, and an input string as parameters. If the pattern does not match the input, the function returns None
. If the pattern does match the input, the function returns Some(str)
where str
is whatever part of the input string matched any wildcards that might have been present in the pattern string.
I have this mostly working, and I'll include the code in a moment. I've written a generic pattern-matching function that works on any generic list of anything that supports equality, and then a helper function that takes strings and passes lists of characters to the generic function. This all works, except for one thing: the support for multiple wildcards in the pattern string isn't very good - it takes the matches for each wildcard and concatenates them together into a single string in the output.
For example:
> strMatch '*' "foo" "bar";;
val it : string option = None
> strMatch '*' "test" "test";;
val it : string option = Some ""
> strMatch '*' "functional programming is *" "functional programming is fun";;
val it : string option = Some "fun"
> strMatch '*' "* and *" "you and me";;
val it : string option = Some "youme"
It's the last one that I'm trying to fix. Ideally I'd like to return a list of strings rather than a single string, with each element in the list being the string that matched one wildcard. Failing that, I can probably make do with a version that returns only the match for the first wildcard - it's the concatenated values from both wildcards that I need to get rid of. I'm just not quite sure how to approach it.
So if anyone can suggest how I can group my return values by which wildcard they matched, I would be grateful. I'm also interested in any other improvements to my code that you might want to suggest.
let rec doMatch (wildcard:'a) (pat:'a list) (input:'a list) : 'a list option =
let singleMatch p i =
match (p, i) with
| phd :: ptl, ihd :: itl ->
if phd = wildcard then
match doMatch wildcard ptl itl with
| None -> None
| Some x -> Some(ihd :: x)
else None
| _ -> None
let longerMatch p i =
match (p, i) with
| phd :: ptl, ihd :: itl ->
if phd = wildcard then
match doMatch wildcard p itl with
| None -> None
| Some x -> Some(ihd :: x)
else None
| _ -> None
match (pat, input) with
| [], [] -> Some([])
| [], _::_ -> None
| _::_, [] -> None
| phd :: ptl, ihd :: itl ->
if phd <> wildcard then
if phd = ihd then doMatch wildcard ptl itl
else None
else
match singleMatch pat input with
| Some x -> Some(x)
| None -> longerMatch pat input
let strMatch (wildcard:char) (pat:string) (input:string) =
match doMatch wildcard (List.ofSeq pat) (List.ofSeq input) with
| None -> None
| Some x -> Some(new string(Array.ofList x))
You've probably guessed, but this is part of an Eliza chat-bot implementation in F#.