Hello,

I have the following regex in PHP:

/(?<=\')[^\'\s][^\']*+(?=\')|(?<=")[^"\s][^"]*+(?=")|[^\'",\s]+/

and I would like to port it to javascript like:

var regex = new RegExp('/(?<=\')[^\'\s][^\']*+(?=\')|(?<=")[^"\s][^"]*+(?=")|[^\'",\s]+/');

var match = regex.exec("hello,my,name,is,'mr jim'")

for( var z in match) alert(match[z]);

There is something that JavaScript doesn't like here, but I have no idea what it is. I've tried looking for differences between PHP and JS regex via regular-expressions.info, but I can't see anything obvious.

Any help would be greatly appreciated

Thank you again

EDIT: The problem seems to lie within the positive lookbehinds, but does this mean it cannot be ported?

+1  A: 

It's the (?<=) positive lookbehind that JavaScript doesn't support. Also be aware that JavaScript implementations vary significantly between browsers.

Edit: there is an SO question devoted to workarounds.

SilentGhost
Would you say it's not worth attempting to port it, and do all the work server side instead of client side?
Jamie Bicknell
+2  A: 

Correct - the positive lookbehinds will not work.

But, as some general information about regex in JavaScript, here are a couple of pointers for you.

You don't have to use the RegExp object - you can use pattern literals instead:

var regex = /^[a-z\d]+$/i;

But if you use the RegExp object, you have to escape your backslashes, since your pattern is now locked in a string:

var regex = new RegExp( '^[a-z\\d]+$', 'i' );

The primary benefit of the RegExp object shows when part of your pattern is dynamic, for example:

var max = 4;
var regex = new RegExp( '\\d{1,' + max + '}' ); // backslash doubled because the pattern is a string
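
To make the backslash-escaping rule concrete, here is a small sketch comparing a string pattern with and without the doubled backslash. The helper name digitRun and the ^ and $ anchors are assumptions added for illustration; they are not part of the answer above.

```javascript
// Hypothetical helper (not from the answer): builds a "1 to max digits"
// pattern from a string, so the backslash must be doubled.
function digitRun(max) {
    // '\\d' in the string literal becomes \d in the compiled pattern
    return new RegExp('^\\d{1,' + max + '}$');
}

// Without the doubled backslash the string literal drops it, so the
// pattern becomes ^d{1,4}$ and matches the letter "d" instead of digits.
var unescaped = new RegExp('^\d{1,4}$');
var escaped = digitRun(4);

var digitsOk = escaped.test('123');        // true: three digits
var tooLong = escaped.test('12345');       // false: five digits exceeds max
var digitsMissed = unescaped.test('123');  // false: pattern looks for "d"
var lettersHit = unescaped.test('ddd');    // true: literal "d" characters
```

The unescaped version does not throw an error, which is why this mistake tends to go unnoticed until the pattern fails to match.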
Peter Bailey
Thank you Peter, as you can probably tell I normally stick to PHP, and very rarely use regex in either language. Thank you for your advice, will note this down for future reference.
Jamie Bicknell
+1  A: 

You don't get lookbehind (and lookahead has problems in IE, so is best avoided too). But it's easy to just let those ' and " characters be part of the match, and throw them out afterwards:

var value= "hello,my,name,is,'mr jim'";
var match;
var r= /'[^'\s][^']*'|"[^"\s][^"]*"|[^'",\s]+/g;

while(match= r.exec(value)) {
    var text= match[0];
    if ('"\''.indexOf(text.charAt(0))!=-1) // starts with ' or "?
        text= text.substring(1, text.length-1);
    alert(text);
}

Or, use capturing parentheses to isolate the quotes from the text:

var r= /'([^'\s][^']*)'|"([^"\s][^"]*)"|([^'",\s]+)/g;

while (match= r.exec(value)) {
    var text= match[1] || match[2] || match[3];
    alert(text);
}

(I'm guessing your for(var z in match) was supposed to loop over each pattern match in the string. Unfortunately JavaScript doesn't quite work that easily.)
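
As a rough illustration of that point: exec() hands back a single match per call, so collecting every match needs either a loop or a call that returns them all at once. The sketch below assumes the same input string and the first regex from this answer; the variable names viaExec and viaMatch are made up for the example.

```javascript
var value = "hello,my,name,is,'mr jim'";
var r = /'[^'\s][^']*'|"[^"\s][^"]*"|[^'",\s]+/g;

// exec() returns one match per call; with the g flag each call resumes
// from r.lastIndex, so a loop is needed to visit every match.
var viaExec = [];
var m;
while ((m = r.exec(value)) !== null) {
    viaExec.push(m[0]);
}

// String.prototype.match with a g-flagged regex returns all full matches
// at once (capture groups are not included in this form).
var viaMatch = value.match(r);
// both arrays: ['hello', 'my', 'name', 'is', "'mr jim'"]
```

Note that the quotes are still part of the last match here; stripping them is what the two loops above take care of.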

This may not be the best way to parse a comma-separated list; it seems a bit ill-defined for cases where you have a space or quote in the middle of a field. A simple string-indexing parser might be a better bet.
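
One possible shape for such a string-indexing parser is sketched below. It is only an illustration under assumed rules, not a definitive implementation: commas separate fields, a ' or " opens a quoted run that ends at the next matching quote, and anything inside quotes (commas and spaces included) is kept. The function name splitFields is hypothetical.

```javascript
// Hypothetical helper, not from the original answer: walks the string one
// character at a time instead of using a regex.
function splitFields(input) {
    var fields = [];
    var current = '';
    var quote = null; // the quote character we are inside, or null

    for (var i = 0; i < input.length; i++) {
        var ch = input.charAt(i);
        if (quote !== null) {
            if (ch === quote) {
                quote = null;        // closing quote: leave quoted mode
            } else {
                current += ch;       // keep everything inside quotes, commas included
            }
        } else if (ch === '"' || ch === "'") {
            quote = ch;              // opening quote: enter quoted mode
        } else if (ch === ',') {
            fields.push(current);    // an unquoted comma ends the field
            current = '';
        } else {
            current += ch;
        }
    }
    fields.push(current);            // the last field has no trailing comma
    return fields;
}

var result = splitFields("hello,my,name,is,'mr jim'");
// result: ['hello', 'my', 'name', 'is', 'mr jim']
```

Under these assumed rules an apostrophe in the middle of an unquoted field (as in don't) would open a quoted run, so real data may need a stricter rule, such as only treating a quote as an opener at the start of a field.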

bobince
This is perfect. Earlier I stripped the lookbehinds out of the regex and then used a replace function to remove the quote marks, but your methods are far better! Thank you!
Jamie Bicknell