ansaurus

Question

Javascript - MultiLine RegExp: lastIndex stuck on newlines?

Answer 1

+5 A:

The problem is that the dot in

^(.*)$

does not match newline characters, but the "m" switch makes "^" and "$" match at every line break, not just at the start and end of the string. That means the "nothing" between two consecutive "\n" characters can be matched successfully by "(.*)".

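Since this match is of zero width, exec() leaves the lastIndex property where the match started, so the next call finds the same empty match again. To require at least one character per line, try: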

^(.+)$
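To see both behaviours side by side, here is a minimal sketch. The sample string and the variable names are made up for illustration; the middle line of the input is blank.

```javascript
// Hypothetical sample input: three lines, the middle one blank.
const text = "first\n\nthird";

// Original pattern: the blank line produces a zero-width match.
const stuck = /^(.*)$/gm;
stuck.exec(text);                         // matches "first"
const second = stuck.exec(text);          // empty match on the blank line
const indexAfterSecond = stuck.lastIndex; // same as second.index
const third = stuck.exec(text);           // the very same empty match again
const indexAfterThird = stuck.lastIndex;  // still has not moved

// Suggested pattern: .+ needs at least one character, so the loop ends.
const nonEmpty = /^(.+)$/gm;
const lines = [];
let m;
while ((m = nonEmpty.exec(text)) !== null) {
  lines.push(m[1]);
}
```

Here second and third report the same index, which is exactly the "stuck" symptom, while lines ends up holding only the two non-blank lines.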

EDIT: To match the blank lines as well, do this:

^(.*)\n?     // remove all \r characters beforehand

or

^(.*)(?:\r\n|\n\r|\n|\r)?  // all possible CR/LF combinations, but *once* at most

...and just go for match group 1.
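A rough sketch of how the second pattern could be used in an exec() loop follows. The helper name splitLines is hypothetical, and the early exit assumes the input contains only CR/LF-style line breaks. One caveat worth handling: if the text is empty or ends with a line break, the pattern still produces one empty match at the very end, so the loop stops there instead of spinning.

```javascript
// Sketch: collect every line, blank ones included, via capture group 1.
// splitLines is a hypothetical helper name, not part of any library.
function splitLines(text) {
  const re = /^(.*)(?:\r\n|\n\r|\n|\r)?/gm;
  const lines = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    lines.push(m[1]);
    // Assuming only CR/LF line breaks, an empty match can only occur at the
    // very end of the input (empty text or a trailing line break).
    // lastIndex would not move past it, so stop here.
    if (m[0].length === 0) break;
  }
  return lines;
}

const unixLines = splitLines("first\n\nthird");
const windowsLines = splitLines("first\r\n\r\nthird");
const trailingBreak = splitLines("first\n");
```

With this, the blank middle line shows up as an empty string in the result, and a trailing line break yields one final empty entry, much like String.prototype.split would.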

Tomalak 2008-12-19 15:13:32

This works, but effectively ignores the blank line. I want to match it, just not get stuck.

brad 2008-12-19 15:15:42

Changed my answer accordingly.

Tomalak 2008-12-19 15:22:52

Edited regex #3 again to account for the last line in the string which was previously not matched.

Tomalak 2008-12-19 15:51:14

Nice! I don't think \n\r is ever used, though. Wikipedia on newlines: http://en.wikipedia.org/wiki/Newline

brad 2008-12-19 16:42:15

Yes, I don't either. But with custom-generated strings/documents you never know. It might well be that someone got it wrong in his app. I have to look it up often enough when using Chr(13) and Chr(10); somehow it just doesn't stick.

Tomalak 2008-12-19 17:06:21

Answer 2

+1 A:

The problem with lastIndex is that a JavaScript implementation that follows the standard to the letter sets it to the offset of the next character after the match. For regular expressions, like yours, that allow zero-length matches, exec() will thus get stuck in an infinite loop when a zero-length match is found. The next match attempt will begin at the same position, where the same zero-length match is found.

Traditionally, regex engines deal with this by skipping one character when a zero-length match is found. Incidentally, Internet Explorer does this as well.
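The skip-one-character behaviour can be emulated by hand in an exec() loop. The following is only a sketch: execAll is a hypothetical helper name, and it assumes the regex it receives carries the "g" flag.

```javascript
// Sketch: run a global regex over the text, bumping lastIndex by one
// whenever a match is zero-length so the next attempt starts further on.
// execAll is a hypothetical helper name, not a built-in.
function execAll(re, text) {
  if (!re.global) {
    // Without "g", exec() ignores lastIndex and the loop would never end.
    throw new TypeError("execAll expects a regex with the g flag");
  }
  re.lastIndex = 0;
  const results = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    results.push({ index: m.index, text: m[0] });
    if (re.lastIndex === m.index) {
      re.lastIndex += 1; // zero-length match: step over one character
    }
  }
  return results;
}

// The pattern from the question now terminates and reports the blank line.
const found = execAll(/^(.*)$/gm, "first\n\nthird");
```

For this sample input, found contains three entries, with the middle one being the empty match on the blank line.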

I've blogged about this in detail in the past: Watch Out for Zero-Length Matches

Jan Goyvaerts 2008-12-25 13:44:33
