ansaurus

Question

What algorithm .Net use for searching a pattern in a string?

Answer 1

+2 A:

Why using reflector when the source code is available? http://weblogs.asp.net/scottgu/archive/2008/01/16/net-framework-library-source-code-now-available.aspx

Stephane 2010-04-06 10:45:29

That still wont show you any native/internal code, as this method does.

leppie 2010-04-06 10:46:29

That's true, only the .Net code is available, erf

Stephane 2010-04-06 10:48:01

I was trying to use the source code but it doesn't show me the algorithm. Am I doing something wrong or there is no way to see the real implementation?

Hun1Ahpu 2010-04-06 11:34:09

Answer 2

+2 A:

I’m guessing of course but I’d say that it’s just the naive string search implementation, having O(n*m) performance.

Now, this might be phenomenally wrong but the MSDN doesn’t specify the performance of this method so it’s not safe to assume a better performance.

Furthermore, most advanced pattern searching methods are quite specialized for certain string types and while there are better general-purpose search algorithms, implementing it in String.IndexOf is somewhat of an unnecessary optimization.

The reason is simple: if you require an efficient pattern search then you’ll implement your own anyway, custom-tailored to fit your particular data. So there’s no need to implement something fancy in the general-purpose library.

Konrad Rudolph 2010-04-06 10:48:57

Thank you for reasonable answer.

Hun1Ahpu 2010-04-06 14:17:07

I could not agree less; why would you implement your own general string search if there is already one in the library?Furthermore, string searching algorithms are exceptions since there is Boyer-Moore which is so nice (all pro no con) that is deemed standard benchmark. What I mean to say is that usually choosing an algorithm for a library can be really hard, but with strings I would not expect anything less to be used in any library worth anything. Please do correct me if I am wrong.

Unreason 2010-04-06 17:05:30

@Unreason: in most applications it simply doesn’t matter. Using `Find` for anything but trivial tasks results in complicated logic and code bloat anyway. Better use regular expressions or a parser. Where you *do* need to use pattern finding and where it matters, you rarely deal with Unicode. Often it’s highly specialized like DNA sequences. And *of course* you don’t program that yourself – you use ready-made libraries. But these are specialized and don’t belong in a general framework like the .NET framework.

Konrad Rudolph 2010-04-06 18:07:18

@Unreason: and Boyer-Moore depends critically on the size of the alphabet which is forbidding for Unicode. It’s possible to modify the algorithm to use hashing instead but already the parameters are getting murky. I think few people who work on a general framework like .NET know how to get this right (more efficient than the naive search in the general case and 100% bug-free) without making the implementation expensive.

Konrad Rudolph 2010-04-06 18:11:31

(cont’d) Finally, Boyer-Moore is by no means the last word; Horspool is often (most of the time?) more efficient (not to mention easier to implement). I’m sure there are modern variants which tweak Horspool even more. And, without having performed any benchmarks, I’ll wager that for most applications (i.e. *relatively* small Unicode strings) naive search outperforms Horspool hands down.

Konrad Rudolph 2010-04-06 18:17:26

+ for comment - makes sense :) For completeness sake here's another discussion that coincides with your reasoning http://yedda.com/questions/String_IndexOf_search_algorithm_net_1499176185184/

Unreason 2010-04-06 18:56:44

Answer 3

A:

I couldn't find that method stub, but my best guess is that it local-specific case-folding for case-insensitive searches.

leppie 2010-04-06 10:48:58

ansaurus

tags:

views:

answers:

What algorithm .Net use for searching a pattern in a string?

related questions