This is just a question to satisfy my curiosity, but I find it interesting.
I wrote this simple little benchmark. It calls 3 variants of Regex execution in a random order a few thousand times.
Basically, I use the same pattern but in different ways:
1. The ordinary way, without any RegexOptions. Starting with .NET 2.0 these do not get cached. But it should be "cached" because it is held in a pretty global scope and not reset.
2. With RegexOptions.Compiled
3. With a call to the static Regex.Match(input, pattern), which does get cached in .NET 2.0
Here is the code:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text.RegularExpressions;

class Program
{
    static List<string> Strings = new List<string>();
    static string pattern = ".*_([0-9]+)\\.([^\\.])$";
    // Variant 1: plain instance, constructed once.
    static Regex Rex = new Regex(pattern);
    // Variant 2: instance constructed once with RegexOptions.Compiled.
    static Regex RexCompiled = new Regex(pattern, RegexOptions.Compiled);
    static Random Rand = new Random(123);
    // One stopwatch per variant; each accumulates across rounds.
    static Stopwatch S1 = new Stopwatch();
    static Stopwatch S2 = new Stopwatch();
    static Stopwatch S3 = new Stopwatch();

    static void Main()
    {
        int k = 0;
        int c = 0;
        int c1 = 0;
        int c2 = 0;
        int c3 = 0;

        // Build 50 inputs of the form "file_<number>.ext".
        for (int i = 0; i < 50; i++)
        {
            Strings.Add("file_" + Rand.Next().ToString() + ".ext");
        }

        int m = 10000;
        for (int j = 0; j < m; j++)
        {
            // Pick one of the three variants at random (upper bound is exclusive).
            c = Rand.Next(1, 4);
            if (c == 1)
            {
                c1++;
                k = 0;
                S1.Start();
                foreach (var item in Strings)
                {
                    var m1 = Rex.Match(item);
                    if (m1.Success) { k++; }
                }
                S1.Stop();
            }
            else if (c == 2)
            {
                c2++;
                k = 0;
                S2.Start();
                foreach (var item in Strings)
                {
                    var m2 = RexCompiled.Match(item);
                    if (m2.Success) { k++; }
                }
                S2.Stop();
            }
            else if (c == 3)
            {
                c3++;
                k = 0;
                S3.Start();
                foreach (var item in Strings)
                {
                    // Variant 3: static call; the argument order is (input, pattern).
                    var m3 = Regex.Match(item, pattern);
                    if (m3.Success) { k++; }
                }
                S3.Stop();
            }
        }

        // Totals per variant; the "adjusted" value is scaled relative to the round count of variant 1.
        Console.WriteLine("c: {0}", c1);
        Console.WriteLine("Total milliseconds: " + (S1.Elapsed.TotalMilliseconds).ToString());
        Console.WriteLine("Adjusted milliseconds: " + (S1.Elapsed.TotalMilliseconds).ToString());
        Console.WriteLine("c: {0}", c2);
        Console.WriteLine("Total milliseconds: " + (S2.Elapsed.TotalMilliseconds).ToString());
        Console.WriteLine("Adjusted milliseconds: " + (S2.Elapsed.TotalMilliseconds*((float)c2/(float)c1)).ToString());
        Console.WriteLine("c: {0}", c3);
        Console.WriteLine("Total milliseconds: " + (S3.Elapsed.TotalMilliseconds).ToString());
        Console.WriteLine("Adjusted milliseconds: " + (S3.Elapsed.TotalMilliseconds*((float)c3/(float)c1)).ToString());
    }
}
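A side note on the listing above: each branch counts successes in k but never prints it, so it may be worth checking what the pattern does with one of the generated strings. The following is a minimal sketch; PatternCheck, Matches and the WithPlus variant are hypothetical names and are not part of the benchmark.

```csharp
using System;
using System.Text.RegularExpressions;

static class PatternCheck
{
    // Pattern copied from the listing: exactly ONE non-dot character after the last dot.
    public const string Original = ".*_([0-9]+)\\.([^\\.])$";

    // Hypothetical variant (not from the listing): one or more non-dot characters.
    public const string WithPlus = ".*_([0-9]+)\\.([^\\.]+)$";

    public static bool Matches(string pattern, string input)
    {
        // Static overload: input first, pattern second.
        return Regex.IsMatch(input, pattern);
    }

    static void Main()
    {
        string sample = "file_12345.ext"; // same shape as the generated strings
        Console.WriteLine(Matches(Original, sample)); // False
        Console.WriteLine(Matches(WithPlus, sample)); // True
    }
}
```

With a three-letter extension such as ".ext", the original pattern does not match, so the loops in the listing would be timing failed matches rather than successful ones.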
Every time I run it, the result is along the lines of:
Not compiled and not automatically cached:
Total milliseconds: 6185,2704
Adjusted milliseconds: 6185,2704
Compiled and not automatically cached:
Total milliseconds: 2562,2519
Adjusted milliseconds: 2551,56949184038
Not compiled and automatically cached:
Total milliseconds: 2378,823
Adjusted milliseconds: 2336,3187176891
So there you have it. Not much, but the compiled variant is about 7-8% slower than the static, automatically cached call.
That is not the only mystery. I cannot explain why the first variant would be that much slower, because the Regex is never rebuilt but held in a global static variable.
By the way, this is on .NET 3.5 and Mono 2.2, both on Windows, and they behave exactly the same.
So, any ideas why the compiled variant would even fall behind?
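On the adjusted figures: because each variant is picked a random number of times, the totals are easiest to compare as time per round, or rescaled to a common number of rounds. A minimal sketch of that arithmetic follows, assuming two hypothetical helpers, PerRound and ScaleTo, that are not part of the benchmark. Note that ScaleTo multiplies by referenceRounds / rounds (c1 / c2 in the listing's terms), whereas the listing multiplies by c2 / c1.

```csharp
using System;

static class TimingMath
{
    // Hypothetical helper: average milliseconds per round for one variant.
    public static double PerRound(double totalMilliseconds, int rounds)
    {
        if (rounds <= 0) throw new ArgumentOutOfRangeException("rounds");
        return totalMilliseconds / rounds;
    }

    // Hypothetical helper: rescale a total measured over `rounds` rounds
    // to the number of rounds the reference variant ran.
    public static double ScaleTo(double totalMilliseconds, int rounds, int referenceRounds)
    {
        return PerRound(totalMilliseconds, rounds) * referenceRounds;
    }

    static void Main()
    {
        // Illustrative numbers only, not measurements.
        Console.WriteLine(PerRound(1000.0, 4));   // 250
        Console.WriteLine(ScaleTo(1000.0, 4, 5)); // 1250
    }
}
```

In the benchmark this would be called as, for example, ScaleTo(S2.Elapsed.TotalMilliseconds, c2, c1).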
EDIT1:
After fixing the code the results now look like this:
Not compiled and not automatically cached:
Total milliseconds: 6456,5711
Adjusted milliseconds: 6456,5711
Compiled and not automatically cached:
Total milliseconds: 2668,9028
Adjusted milliseconds: 2657,77574842168
Not compiled and automatically cached:
Total milliseconds: 6637,5472
Adjusted milliseconds: 6518,94897724836
Which pretty much obsoletes all of the other questions as well.
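The edit does not say what the fix was. One thing that is easy to get wrong with the static overload is the argument order, which is Regex.Match(input, pattern); whether that was the issue here is an assumption. The sketch below uses hypothetical names and a simplified pattern to show how a swapped call still compiles and runs, but silently matches something else.

```csharp
using System;
using System.Text.RegularExpressions;

static class ArgumentOrder
{
    public static bool Correct(string input, string pattern)
    {
        // Static overload: input first, pattern second.
        return Regex.Match(input, pattern).Success;
    }

    public static bool Swapped(string input, string pattern)
    {
        // Arguments reversed on purpose: the input is now treated as the pattern.
        return Regex.Match(pattern, input).Success;
    }

    static void Main()
    {
        string input = "file_123.ext";
        string pattern = "_([0-9]+)\\."; // simplified, illustrative pattern
        Console.WriteLine(Correct(input, pattern)); // True
        Console.WriteLine(Swapped(input, pattern)); // False
    }
}
```

A swapped call does different work than the other two variants, so its timing would not be comparable with theirs.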
Thanks for the answers.