Hi there,
I am currently iterating over between 7000 and 10000 text definitions, each between 0 and 5000 characters long, and I want to check whether a particular string occurs in any of them. I need to do this for roughly 5000 different string definitions.
In most cases I just want to know whether there is an exact case-insensitive match; however, sometimes a regex is required to be more specific. I was wondering whether it would be quicker to use another "search" technique when the regex isn't required.
A slimmed-down version of the code looks something like this:
foreach (string find in stringsiWantToFind)
{
    Regex rx = new Regex(find, RegexOptions.IgnoreCase);
    foreach (string s in listOfText)
        if (rx.IsMatch(s))
            find.FoundIn(s);
}
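For the cases where no regex is needed, the kind of alternative I have in mind is roughly the sketch below. It is only an illustration: `PlainSearch` and `FindMatches` are names made up for this post, and it assumes an ordinal (culture-insensitive) comparison is acceptable, which may not behave exactly like `RegexOptions.IgnoreCase` for every character.

```csharp
using System;
using System.Collections.Generic;

static class PlainSearch
{
    // Hypothetical helper: returns every text that contains 'find',
    // ignoring case, without constructing a Regex.
    public static List<string> FindMatches(string find, IEnumerable<string> listOfText)
    {
        var matches = new List<string>();
        foreach (string s in listOfText)
        {
            // Assumption: ordinal, case-insensitive matching is good enough here.
            // 'find' is treated literally, so regex metacharacters have no special meaning.
            if (s.IndexOf(find, StringComparison.OrdinalIgnoreCase) >= 0)
                matches.Add(s);
        }
        return matches;
    }
}
```

Whether this actually beats the regex loop for my data is exactly what I would need to measure.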
I've read around a bit to see whether I'm missing anything obvious. There are a number of suggestions for using compiled regexes; however, I can't see how that would help given the "dynamic" nature of the regex.
I also read an interesting article on CodeProject, so I'm about to try the "FastIndexOf" approach to see how it compares in performance.
I just wondered if anybody had any advice for this kind of problem and how performance can potentially be optimized?
Thanks