I'm trying to figure out how to split a string into searchable terms. I need it to

  • split on spaces and single quotes (ignoring single character, non-quoted results)
  • return quoted phrases without the quotes

So if I'm applying it to:

"quoted phrase" single words

it would return:

  • quoted phrase
  • single
  • words

Here's what I have so far (in JavaScript), but I have to strip the quotes out separately.

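var searchArray = temp.match(/"[^"]*"|[^\s']{2,}/g) || []; // match() returns null when nothing matches
for (var index = 0; index < searchArray.length; index++) {
    // Strip the surrounding double quotes from quoted phrases
    searchArray[index] = searchArray[index].replace(/"/g, '');
}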

Is there any way to do this using only one regular expression?

+3  A: 

This seems to work but I'm not sure I've covered all cases. I'm not sure it'll work in IE 5, but that may not worry you; it works in IE 6 and all other browsers I've tried. It also strips leading and trailing whitespace from matches inside quotes:

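// Group 1 captures an optional opening quote; the trailing \1 requires the same closing quote.
// Group 2 captures the term itself, bounded by the \b and \s* on either side.
// Note: inside a character class \1 is not a backreference, so [^\1] matches
// almost any character; the lazy quantifier and the final \1 end the match.
var regex = /("?)\s*\b(\S[^\1]*?)\b\s*\1/g;
var str = '"quoted phrase " single "quoted" words " yes "';
var res;

while ( (res = regex.exec(str)) ) {
    alert(res[2]);  // res[2] is the term without quotes or surrounding whitespace
}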
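
For comparison, here is a rough sketch of a different approach that stays closer to the pattern in the question: put a capture group inside the quotes and read whichever group matched. This is not the regex above, just one possible variant. The function name splitSearchTerms is made up for illustration, and excluding double quotes from the second alternative (so an unclosed quote is dropped rather than kept) is an assumption about the desired behaviour. Unlike the regex above, it does not trim whitespace inside quoted phrases.

```javascript
// Hypothetical helper; the name and the handling of unclosed quotes are assumptions.
function splitSearchTerms(input) {
    // Group 1: the text between a pair of double quotes.
    // Group 2: a run of two or more characters that are not whitespace or quotes,
    //          so single-character, non-quoted results are ignored.
    var pattern = /"([^"]*)"|([^\s'"]{2,})/g;
    var terms = [];
    var match;

    while ((match = pattern.exec(input)) !== null) {
        // Exactly one of the two groups participates in each match.
        var term = match[1] !== undefined ? match[1] : match[2];
        if (term) {          // skip empty quoted phrases such as ""
            terms.push(term);
        }
    }
    return terms;
}

var terms = splitSearchTerms('"quoted phrase" single words');
console.log(terms);
```

Both alternatives consume at least two characters per match, so the exec loop cannot get stuck on a zero-length match.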
Tim Down
That looks like it'll do what I was looking for. Thank you very much! I'd like to mark your answer as useful, but I don't have enough reputation yet.
Joe