+2  Q: 

JavaScript regex

Hi all, I currently have a basic regex in JavaScript that replaces all whitespace in a string with a colon. Some parts of the string are enclosed in quotes. Ideally, I would like to replace whitespace with a colon, except for whitespace inside quotes.

var stringin = "\"james johnson\" joe \"wendy johnson\" tony";
var stringout = stringin.replace(/\s+/g, ":");
alert(stringout);

Thanks, Robin

+11  A: 

Try something like this:

var stringin = "\"james johnson\" joe \"wendy johnson\" tony";
var stringout = stringin.replace(/\s+(?=([^"]*"[^"]*")*[^"]*$)/g, ":");

Note that it will break when there are escaped quotes in your string:

"ab \" cd" ef "gh ij"
Bart Kiers
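For anyone wondering why the lookahead works: a run of whitespace is outside quotes exactly when an even number of quote characters follows it up to the end of the string. Below is a minimal sketch of that idea, assuming balanced, unescaped quotes. The helper name `replaceOutsideQuotes` and its `sep` parameter are made up for illustration and are not part of the answer.

```javascript
// Hypothetical helper: replaces whitespace runs that sit outside double quotes.
// Assumes quotes are balanced and never escaped.
function replaceOutsideQuotes(input, sep) {
  // The lookahead succeeds only when the rest of the string contains
  // an even number of quote characters (zero or more complete pairs).
  return input.replace(/\s+(?=(?:[^"]*"[^"]*")*[^"]*$)/g, () => sep);
}

const plain = replaceOutsideQuotes('"james johnson" joe "wendy johnson" tony', ":");
console.log(plain); // "james johnson":joe:"wendy johnson":tony

// An escaped quote throws the count off: the space after "ab" is
// followed by four quote characters, so it is replaced by mistake.
const broken = replaceOutsideQuotes('"ab \\" cd" ef "gh ij"', ":");
console.log(broken); // "ab:\" cd":ef:"gh ij"
```

The second call shows the failure mode mentioned above: the regex only counts quote characters and has no notion of escaping.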
Very clever. I like it. Have tried to break it with no success.
Tim Pietzcker
Thanks Tim. I added a case where it will break. But the OP didn't mention that escaped quotes could occur.
Bart Kiers
Good point. You could work around that like so: `\s+(?=((?:\\"|[^"])*(?<!\\)"(?:\\"|[^"])*(?<!\\)")*(?:\\"|[^"])*$)` But now that's *really* fugly.
Tim Pietzcker
Yes, but what about strings that have an escaped backslash right before the closing quote: `ab "cd ef\\" gh`? When escaped characters come into play, it is time to leave regex-land, IMO.
Bart Kiers
Oh, and BTW, JavaScript (before ES2018) does not support lookbehinds, only lookaheads.
Bart Kiers
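As a rough illustration of what leaving regex-land could look like, here is a minimal sketch of a hand-written scanner. It is not from the thread: the function name `joinOutsideQuotes` is hypothetical, and it assumes that inside quotes a backslash escapes whatever character follows it.

```javascript
// Hypothetical scanner: replaces whitespace runs outside double quotes with `sep`.
// Assumption: inside quotes, a backslash escapes the next character.
function joinOutsideQuotes(input, sep) {
  let out = "";
  let inQuotes = false;
  let i = 0;
  while (i < input.length) {
    const ch = input[i];
    if (inQuotes && ch === "\\" && i + 1 < input.length) {
      // Copy the backslash and the escaped character verbatim.
      out += ch + input[i + 1];
      i += 2;
    } else if (ch === '"') {
      // An unescaped quote toggles the quoted state.
      inQuotes = !inQuotes;
      out += ch;
      i += 1;
    } else if (!inQuotes && /\s/.test(ch)) {
      // Collapse a whole run of whitespace into one separator.
      while (i < input.length && /\s/.test(input[i])) i += 1;
      out += sep;
    } else {
      out += ch;
      i += 1;
    }
  }
  return out;
}

// The string below is: ab "cd ef\\" gh (an escaped backslash before the closing quote)
console.log(joinOutsideQuotes('ab "cd ef\\\\" gh', ":")); // ab:"cd ef\\":gh
```

Because the scanner consumes escape sequences two characters at a time, an escaped backslash right before the closing quote no longer confuses it.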
+2  A: 

In this case regexes alone are not the simplest way to do it:

<html><body><script>

var stringin = "\"james \\\"johnson\" joe \"wendy johnson\" tony";
var splitstring = stringin.match(/"(?:\\"|[^"])+"|\S+/g); // quoted sections first, then any other run of non-whitespace
var stringout = splitstring.join(":");
alert(stringout);

</script></body></html>

The more complicated regex containing \\" is there so that escaped quotes like \" inside the quoted strings also work. If you don't need that, the line that calls match can be simplified to

var splitstring = stringin.match(/"[^"]+"|\S+/g);
Kinopiko
You'll want to swap `[^"]` and `\\"` around. Otherwise the backslash which is meant as an escape is "eaten" by the `[^"]`. Or add the backslash to your negated character class. Other than that, your way is a more intuitive approach (and in case of escaped characters, a working solution). +1
Bart Kiers
Thanks for pointing that out.
Kinopiko
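To make the point about alternation order concrete, here is a small sketch (not from the thread) that runs both orderings on the input from the answer. The variable names are illustrative only.

```javascript
const input = '"james \\"johnson" joe "wendy johnson" tony';

// Escape alternative first: \" is consumed as a unit, so the quoted
// section runs up to the real closing quote.
const escapeFirst = input.match(/"(?:\\"|[^"])+"|\S+/g);

// Negated class first: [^"] consumes the backslash on its own, so the
// escaped quote is then taken as the closing quote.
const classFirst = input.match(/"(?:[^"]|\\")+"|\S+/g);

console.log(escapeFirst.join(":")); // "james \"johnson":joe:"wendy johnson":tony
console.log(classFirst.join(":"));  // "james \":johnson":joe:"wendy johnson":tony
```

With the negated class first, the first quoted name is split in two, which is the "eaten" backslash described in the comment above.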
+2  A: 

In JavaScript, you can easily make fancy replacements with callbacks:

 var str = '"james \\"bigboy\\" johnson" joe "wendy johnson" tony';

 // $1 is set only when a quoted section matched; otherwise the match is whitespace.
 alert(
  str.replace(/("(\\.|[^"])*")|\s+/g, function($0, $1) { return $1 || ":"; })
 );
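How the callback does its job: `replace` passes it the whole match first, then each capture group. When the quoted alternative matched, the first group holds the quoted text and is returned unchanged. Otherwise the group is `undefined` and the separator is returned. A minimal sketch of the same idea as a reusable function, where the name `replaceUnquotedWhitespace` and the `sep` parameter are hypothetical:

```javascript
// Hypothetical wrapper around the callback technique.
// Assumption: a backslash escapes the next character inside quotes.
function replaceUnquotedWhitespace(str, sep) {
  return str.replace(/("(?:\\.|[^"])*")|\s+/g, (match, quoted) =>
    // `quoted` is undefined when the \s+ alternative matched.
    quoted !== undefined ? quoted : sep
  );
}

const str = '"james \\"bigboy\\" johnson" joe "wendy johnson" tony';
console.log(replaceUnquotedWhitespace(str, ":"));
// "james \"bigboy\" johnson":joe:"wendy johnson":tony
```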
stereofrog