Hello, everyone! I'm quite new to regular expressions, but I like them, A LOT!

Call me nitpicky if you will, but I'd really like to know if I should avoid using lookaheads and lookbehinds if I have an option.

For example, the two commands below do the same thing; the first doesn't use a lookbehind and the second does.

the_str = Regex.Replace(the_str, @"(;|!|\?) \.{3}", "$1...");

the_str = Regex.Replace(the_str, @"(?<=(;|!|\?)) \.{3}", "...");

Which one would you use? Which is more efficient?

Thanks for your answers!

+3  A: 

I tested both locally and the method using a lookbehind was about 25% slower.

Another variation I tested, using a lookahead instead of a lookbehind, was only 10% slower:

s = Regex.Replace(s, @"(;|!|\?) (?=\.{3})", "$1");

I don't think there's enough of a performance difference to advise always avoiding lookarounds. If you think they make the code more readable, then do use them. Only optimize for performance if profiling shows that you have a performance problem and the regular expression is the bottleneck.

For information, the string I tested on was "blah; ... foo ...; bar bar ? ..." repeated 1000 times and I repeated each test 100 times.

0.944s   No lookarounds    Regex.Replace(s, @"(;|!|\?) \.{3}", "$1...") 
1.027s   Look ahead        Regex.Replace(s, @"(;|!|\?) (?=\.{3})", "$1")
1.210s   Look behind       Regex.Replace(s, @"(?<=(;|!|\?)) \.{3}", "...")
1.124s   Both              Regex.Replace(s, @"(?<=(;|!|\?)) (?=\.{3})", "")
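
For anyone who wants to reproduce numbers like these, here is a minimal sketch of how such a timing could be set up. It is not the harness used for the results above: the class name `LookaroundBenchmark`, the helpers `BuildInput` and `Time`, and the choice of `Stopwatch` are all assumptions made for illustration. Only the test string, the 1000 repetitions, the 100 iterations and the four pattern/replacement pairs come from the answer. Absolute timings will differ by machine and .NET version.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical harness; names are illustrative, not from the original answer.
public static class LookaroundBenchmark
{
    // The four pattern/replacement pairs from the results table.
    public static readonly (string Name, string Pattern, string Replacement)[] Cases =
    {
        ("No lookarounds", @"(;|!|\?) \.{3}", "$1..."),
        ("Look ahead",     @"(;|!|\?) (?=\.{3})", "$1"),
        ("Look behind",    @"(?<=(;|!|\?)) \.{3}", "..."),
        ("Both",           @"(?<=(;|!|\?)) (?=\.{3})", ""),
    };

    // Builds the test input: the sample string repeated `copies` times.
    public static string BuildInput(int copies)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < copies; i++)
            sb.Append("blah; ... foo ...; bar bar ? ...");
        return sb.ToString();
    }

    // Runs one replacement `iterations` times and returns the elapsed time.
    // Assumption: Stopwatch-based wall-clock timing, with no warm-up pass.
    public static TimeSpan Time(string input, string pattern, string replacement, int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            Regex.Replace(input, pattern, replacement);
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        string s = BuildInput(1000);
        string expected = Regex.Replace(s, Cases[0].Pattern, Cases[0].Replacement);

        foreach (var c in Cases)
        {
            // Sanity check: every variant should produce the same output
            // as the version without lookarounds.
            bool same = Regex.Replace(s, c.Pattern, c.Replacement) == expected;
            TimeSpan t = Time(s, c.Pattern, c.Replacement, 100);
            Console.WriteLine($"{t.TotalSeconds:F3}s   {c.Name,-16}same output: {same}");
        }
    }
}
```

The equality check is there because a timing comparison only means something if all four variants really do the same work on the input. Adding a warm-up run before timing would probably give steadier numbers, since the first call to each pattern also pays for parsing it.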
Mark Byers