I'm reading Douglas Crockford's JavaScript: The Good Parts and have just finished the chapter on regular expressions. In it he calls JavaScript's \b, positive lookahead (?=), and negative lookahead (?!) "not a good part".

He explains why \b is not good (it relies on \w to find word boundaries, and \w only covers ASCII letters, digits, and underscore, so it fails for any language that uses other Unicode characters), and that looks like a very good reason to me.
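To make the \b problem concrete, here is a minimal sketch. The sample string and the helper name `asciiWords` are made up for illustration and do not come from the book.

```javascript
// \w in a plain JavaScript regex matches only [A-Za-z0-9_], so \b reports a
// "word boundary" between an ASCII letter and an accented letter.
// "caf\u00e9 \u00e9lan" is "café élan" written with escapes to avoid encoding issues.
const text = "caf\u00e9 \u00e9lan";

// Hypothetical helper: collect the substrings that /\b\w+\b/g treats as words.
function asciiWords(s) {
  return s.match(/\b\w+\b/g) || [];
}

const words = asciiWords(text);
console.log(words); // [ 'caf', 'lan' ]: the accented letters are cut off

// A word that starts with a non-ASCII letter has no leading \b at all,
// so this pattern does not even match the word itself.
const foundElan = /\b\u00e9lan\b/.test("\u00e9lan");
console.log(foundElan); // false
```

Both results follow from \b being defined in terms of \w, which is the objection Crockford raises.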

Unfortunately, the reason for positive and negative lookahead being not good is left out, and I cannot come up with one. Mastering Regular Expressions showed me the power that comes with lookahead (and of course explains the issues it brings with it), but I can't think of anything that would qualify it as "not a good part".

Can anyone explain why JavaScript's (positive|negative) lookahead, or (positive|negative) lookahead in general, should be considered "not good"?
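For readers who have not used them, this is roughly what the two constructs do. It is a small sketch with made-up sample strings, not an example from either book.

```javascript
// Positive lookahead (?=...): match "Java" only when "Script" follows.
// The lookahead itself consumes nothing, so the match is just "Java".
const positive = /Java(?=Script)/;

// Negative lookahead (?!...): match "Java" only when "Script" does NOT follow.
const negative = /Java(?!Script)/;

const posMatch = "JavaScript".match(positive);
console.log(posMatch[0]);                 // "Java"
console.log(positive.test("Java Beans")); // false
console.log(negative.test("Java Beans")); // true
console.log(negative.test("JavaScript")); // false
```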


It seems I'm not the only one with this question (outside Stack Overflow, fortunately): http://jjinux.blogspot.com/2009/03/books-javascript-good-parts-part-2.html
http://ourcraft.wordpress.com/2009/03/

+2  A: 

They're too hard for him?

Or: lookaheads and lookbehinds (the latter are not supported in JavaScript) increase regex times considerably. But one isn't typically regexing through huge amounts of data in JavaScript. So they're great; use them when they're useful.

BipedalShark
"Too hard" seems unlikely. Performance might be a reason, but that is more of an interpreter issue that could be solved than a language specification issue.
jilles de wit
I wasn't really being serious with the "too hard" bit. But interpreter issues seem like alright reasons to say a language feature ought to be avoided. But, again, I'd disagree with Crockford that there is any kind of issue that would make lookaheads worth avoiding. Lookarounds are fantastic.
BipedalShark
Crockford's a smart guy, and I figure he would know better than anyone why he concluded lookaheads are bad, so I shot him an email. If he replies, I will post it here (if he doesn't do so himself).
BipedalShark
No answer yet, I gather? I've tried to contact Mr. Crockford as well.
jilles de wit
I've yet to hear from him.
BipedalShark
+4  A: 

The only reason I can think of might be that they are only supported by about half of the popular regular expression engines, though if you limit yourself to universally supported syntax there are a lot of things you just cannot do.

By the way, (positive|negative)(lookahead|lookbehind) is sometimes collectively referred to as "lookaround", as on this page, which compares feature support among various implementations:

http://www.regular-expressions.info/refflavors.html

Tim Sylvester
I was thinking in that direction too, but lack of support in general doesn't really qualify it as a bad part of a specific language that actually supports it.
jilles de wit
+7  A: 

Maybe it's because of Internet Explorer's perpetually buggy implementation of lookaheads. For anyone authoring a book about JavaScript, any feature that doesn't work in IE might as well not exist.

Alan Moore
I don't know what Mr. Crockford was thinking when he wrote this, but this seems the best reason to be careful with lookahead in JavaScript. It feels a bit unfair to blame the language for buggy implementations, though.
jilles de wit