We're having a lot of trouble tracking down the source of \u2028 (Line Separator) in user-submitted data, which causes the 'unterminated string literal' error in Firefox.

As a result, we're looking at filtering it out before submitting it to the server (and then the database).

After extensive googling and reading of other people's problems, it's clear I have to filter these characters out before submitting to the database.

Before writing the filter, I attempted to search for the character just to ensure it can find it using:

var index = content.search("/\u2028/");
alert("Index: [" + index + "]");

I get -1 as the result every time, even when I know the character is in the content variable (I've confirmed this via a Java JUnit test on the server side).

Assuming that content.replace() would work the same way as search(), is there something I'm doing wrong or anything I'm missing in order to find and strip these line separators?

+3  A: 

Your regex syntax is incorrect. The two forward slashes are only used in a regex literal; inside a quoted string they are treated as literal characters to match. It should be just:

var index = content.search("\u2028");

or:

var index = content.search(/\u2028/); // regex literal

But this should really be done on the server, if anywhere. JavaScript sanitization can be trivially bypassed. It's only useful for user convenience, and I don't think accidentally entering a line separator is that common.
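To see why the quoted pattern from the question returns -1, here is a minimal sketch comparing the three forms. The sample string and the variable names are made up for illustration; only the `search()` calls come from the thread.

```javascript
// Hypothetical sample input: a line separator (U+2028) between two words.
const content = "first\u2028second";

// A string argument is converted to a regex, so the slashes inside the
// quotes are matched literally. The input has no "/" around the separator.
const quotedIndex = content.search("/\u2028/");

// A plain string without the slashes finds the character.
const stringIndex = content.search("\u2028");

// A regex literal (slashes, no quotes) finds it as well.
const literalIndex = content.search(/\u2028/);

console.log(quotedIndex, stringIndex, literalIndex); // -1 5 5
```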

Matthew Flaschen
@Cyntech: And if you really need that to be part of a RegExp: `var index = content.search(/\u2028/);` (with the slashes, but without the quotes).
T.J. Crowder
Thanks Matthew and TJ. That solved the riddle. I was also able to successfully replace \u2028 characters. However, I also read that I should probably filter out \u2029 as well. When I use `content.replace(/\u2028\u2029/g, " ");`, it doesn't replace anything (not even the \u2028 characters). Have I got the syntax wrong for multiple searches?
Cyntech
Actually, I think I just worked it out: `content.replace(/\u2028|\u2029/g, ' ');` Is this correct?
Cyntech
@Cyntech, typically, for individual characters, you use a character class: `/[\u2028\u2029]/g`. And don't forget you need to do this on the server (too).
Matthew Flaschen
Thanks Matthew! Much appreciated.
Cyntech
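
The character-class approach from the comments above can be sketched roughly as follows. The helper name `stripLineSeparators` and the sample string are assumptions for illustration, not something from the thread, and a filter like this on the client does not replace the server-side check the answer recommends.

```javascript
// Hypothetical helper: replaces every U+2028 (Line Separator) and
// U+2029 (Paragraph Separator) with a single space.
function stripLineSeparators(text) {
  return text.replace(/[\u2028\u2029]/g, " ");
}

// Made-up sample with separators alone and next to each other.
const sample = "a\u2028b\u2029c\u2028\u2029d";

console.log(stripLineSeparators(sample)); // a b c  d

// The pattern /\u2028\u2029/ only matches the two characters when they
// appear together in that order, so lone separators are left in place.
const sequenceOnly = sample.replace(/\u2028\u2029/g, " ");
console.log(sequenceOnly.includes("\u2028")); // true
```

The alternation `/\u2028|\u2029/g` gives the same result as the character class here; the class is just the more usual way to match one of several single characters.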