I need to find all the regex matches in a list of strings. For example, I need to be able to take the string "This foo is a foobar" and match every instance of either "foo" or "bar". What would the correct pattern be for this? Also, what input sanitization would I need to do to prevent the input text from breaking the pattern?
I'm a little unsure what your actual question is. To match "foo" or "bar", you'd simply use "foo|bar" for your pattern. If you want to do this to a list of strings, you'd likely want to check each string individually. You could join the strings first and check the result, but I'm not sure this would be of much use. If you want to get the exact text that matched your pattern, you should surround the pattern in parentheses, such as "([fg]oo|[bt]ar)", which would match "foo", "goo", "bar", or "tar", then use the Groups property of the Match object to retrieve these captures, so you can determine exactly which word matched. Groups[1] is the first captured value (that is, the value in the first set of parentheses in your pattern). Groups[0] is the entire match. You can also name your captures, as in "(?<word>[fg]oo|[bt]ar)", and refer to them by name with Groups["word"]. I would recommend reading through the documentation on regular expression language elements.
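To make that concrete, here is a minimal C# sketch, assuming the .NET System.Text.RegularExpressions classes that the Groups and Match names come from. The FindWords helper, the RegexDemo class, and the sample strings are made up for illustration; they are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexDemo
{
    // Hypothetical helper: collects every "foo" or "bar" found in each input string.
    public static List<string> FindWords(IEnumerable<string> inputs)
    {
        // The alternation is wrapped in a capture named "word".
        var pattern = new Regex("(?<word>foo|bar)");
        var found = new List<string>();

        // Check each string individually rather than joining them first.
        foreach (string input in inputs)
        {
            // Matches() returns all non-overlapping matches, scanning left to right.
            foreach (Match m in pattern.Matches(input))
            {
                // Groups[0] is the entire match. Groups["word"] and Groups[1]
                // refer to the same capture here, since it is the only group.
                found.Add(m.Groups["word"].Value);
            }
        }
        return found;
    }

    public static void Main()
    {
        var inputs = new[] { "This foo is a foobar", "nothing to see here" };
        foreach (string word in FindWords(inputs))
        {
            Console.WriteLine(word); // prints foo, foo, bar (one per line)
        }
    }
}
```

With a pattern this simple, Groups[0] and the named capture hold the same text. The capture only becomes necessary once the pattern contains more than the part you want to extract.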
As for sanitizing the input, there is no input text that will "break" the regex, because the text being searched is never interpreted as part of the pattern. It might prevent a match, but that's really kind of what regexes are all about, isn't it?
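Escaping can matter in a slightly different scenario from the one asked about: when the words to look for, rather than the text being searched, come from the user and are placed into the pattern itself. In that case characters such as "." or "+" would be read as regex syntax. A minimal sketch under that assumption, using .NET's Regex.Escape; the BuildPattern helper, the EscapeDemo class, and the sample words are hypothetical:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: builds an alternation that matches each word literally.
    public static string BuildPattern(params string[] words)
    {
        // Regex.Escape backslash-escapes metacharacters such as . + * ( ) [ |
        return string.Join("|", words.Select(w => Regex.Escape(w)));
    }

    public static void Main()
    {
        string pattern = BuildPattern("foo", "a.b", "c+");
        Console.WriteLine(pattern); // prints foo|a\.b|c\+

        // "axb" and "cc" are not matched, because "." and "+" are now literal.
        foreach (Match m in Regex.Matches("foo axb a.b c+ cc", pattern))
        {
            Console.WriteLine(m.Value); // prints foo, a.b, c+ (one per line)
        }
    }
}
```

Without the Regex.Escape call, the "a.b" alternative would also match "axb", since an unescaped "." matches any character.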