Hi,

Does anyone know why String.StartsWith in C# (.NET) is considerably slower than CompareInfo.IsPrefix?

+4  A: 

StartsWith calls IsPrefix internally. It first obtains the culture info and only then calls IsPrefix, so it does extra work on every call.

dommer
+2  A: 

Good question. In a quick test I get the following (StartsWith first, then IsPrefix):

9156ms; chk: 50000000
6887ms; chk: 50000000

Test rig:

using System;
using System.Diagnostics;
using System.Globalization;

class Program
{
    static void Main()
    {
        string s1 = "abcdefghijklmnopqrstuvwxyz", s2 = "abcdefg";

        const int LOOP = 50000000;
        int chk = 0; // counts matches, so the loop body can't be optimized away

        // Test 1: String.StartsWith(string), which uses the current culture
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < LOOP; i++)
        {
            if (s1.StartsWith(s2)) chk++;
        }
        watch.Stop();
        Console.WriteLine(watch.ElapsedMilliseconds + "ms; chk: " + chk);

        // Test 2: CompareInfo.IsPrefix, with the CompareInfo fetched once up front
        chk = 0;
        watch = Stopwatch.StartNew();

        CompareInfo ci = CultureInfo.CurrentCulture.CompareInfo;
        for (int i = 0; i < LOOP; i++)
        {
            if (ci.IsPrefix(s1, s2)) chk++;
        }
        watch.Stop();
        Console.WriteLine(watch.ElapsedMilliseconds + "ms; chk: " + chk);
    }
}
Marc Gravell
+6  A: 

I think most of the difference is the cost of fetching the thread's current culture on every call.

If you change Marc's test to use this form of String.StartsWith:

    Stopwatch watch = Stopwatch.StartNew();
    CultureInfo cc = CultureInfo.CurrentCulture;
    for (int i = 0; i < LOOP; i++)
    {
        if (s1.StartsWith(s2, false, cc)) chk++;
    }
    watch.Stop();
    Console.WriteLine(watch.ElapsedMilliseconds + "ms; chk: " + chk);

it comes a lot closer.

If you use s1.StartsWith(s2, StringComparison.Ordinal), it's a lot faster than using CompareInfo.IsPrefix (depending on the CompareInfo, of course). On my box the results (not a rigorous benchmark) are:

  • s1.StartsWith(s2): 6914ms
  • s1.StartsWith(s2, false, culture): 5568ms
  • compare.IsPrefix(s1, s2): 5200ms
  • s1.StartsWith(s2, StringComparison.Ordinal): 1393ms

The ordinal overload is so much faster because it's really just comparing 16-bit integers (the UTF-16 code units of each string) at each position, which is pretty cheap. If you don't want culture-sensitive checking, and performance is particularly important to you, that's the overload I'd use.
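To make the "comparing 16-bit integers" point concrete, here is a minimal sketch of what an ordinal prefix check amounts to. StartsWithOrdinal is a hypothetical helper written for illustration only; it is not the framework's actual implementation, which may be optimized differently.

```csharp
using System;

public static class OrdinalPrefixSketch
{
    // Hypothetical helper (an assumption for illustration, not the BCL code):
    // compares the strings one UTF-16 code unit at a time, with no culture rules.
    public static bool StartsWithOrdinal(string text, string prefix)
    {
        if (prefix.Length > text.Length) return false;

        for (int i = 0; i < prefix.Length; i++)
        {
            // char is a 16-bit value, so this is a plain integer comparison
            if (text[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        string s1 = "abcdefghijklmnopqrstuvwxyz", s2 = "abcdefg";

        Console.WriteLine(StartsWithOrdinal(s1, s2));                     // True
        Console.WriteLine(s1.StartsWith(s2, StringComparison.Ordinal));   // True
    }
}
```

In real code I'd still call the built-in StringComparison.Ordinal overload rather than hand-rolling the loop; the sketch is only meant to show why there is so little work per character compared with a culture-sensitive comparison.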

Jon Skeet