tags:

views:

300

answers:

2

Consider the following function arguments (they have already been extracted from the function):

Monkey,"Blue Monkey", "Red, blue and \"Green'",  'Red, blue and "Green\''

Is there a way to extract the arguments into the following array using a regexp, stripping the surrounding whitespace?

[Monkey, "Blue Monkey", "Red, blue and \"Green'", 'Red, blue and "Green\'']

I'm stuck with this RegExp, which is not permissive enough:

/(("[^"]+"|[^\s,]+))/g
A: 

Not sure exactly what you're seeking, nor yet how to do this in SQL, but isn't something like this sufficient:

(Using python as an example)

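import re

# Backslashes are doubled so that they survive Python's string parsing.
x = '''Monkey, "Blue Monkey", "Red, blue and "Green\\"", 'Red, blue and "Green\\'\''''
# Split on each comma and any whitespace after it (including commas inside quotes).
parts = re.split(r',\s*', x)
print(x)
for part in parts:
    print(part)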
Brent.Longborough
This regexp won't work for items that are not separated by whitespace. Sorry, I forgot to mention it.
Just change the ',\s+' to ',\s*' to make the spacing optional
Chad Birch
OK, that's done. Thanks for the comment.
Brent.Longborough
A: 

This looks a little nasty but it works:

/(?:"(?:[^\x5C"]+|\x5C(?:\x5C\x5C)*[\x5C"])*"|'(?:[^\x5C']+|\x5C(?:\x5C\x5C)*[\x5C'])*'|[^"',]+)+/g

I used \x5C instead of the plain backslash character \ because too many of those can be confusing.

This regular expression consists of three parts:

  1. "(?:[^\x5C"]+|\x5C(?:\x5C\x5C)*[\x5C"])*" matches double quoted string declarations
  2. '(?:[^\x5C']+|\x5C(?:\x5C\x5C)*[\x5C'])*' matches single quoted string declarations
  3. [^"',]+ matches anything else (except commas).
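
As a rough sketch of how the three parts fit together, here they are assembled in Python (the language of the other answer) and run against the string from the question. The names DOUBLE_QUOTED, SINGLE_QUOTED, UNQUOTED and split_arguments are made up for this illustration, and dropping whitespace-only matches is my own assumption about the desired output:

```python
import re

# The three alternatives listed above, copied verbatim (\x5C is a backslash).
DOUBLE_QUOTED = r'"(?:[^\x5C"]+|\x5C(?:\x5C\x5C)*[\x5C"])*"'
SINGLE_QUOTED = r"'(?:[^\x5C']+|\x5C(?:\x5C\x5C)*[\x5C'])*'"
UNQUOTED = r"""[^"',]+"""

# One argument is a run of quoted strings and/or unquoted text.
ARGUMENT = re.compile(f"(?:{DOUBLE_QUOTED}|{SINGLE_QUOTED}|{UNQUOTED})+")


def split_arguments(text):
    """Return each argument with its surrounding whitespace stripped."""
    stripped = (m.group(0).strip() for m in ARGUMENT.finditer(text))
    # Assumption: matches that were only whitespace are not wanted.
    return [arg for arg in stripped if arg]


# The argument string from the question, kept raw so backslashes stay literal.
text = r"""Monkey,"Blue Monkey", "Red, blue and \"Green'",  'Red, blue and "Green\''"""

for arg in split_arguments(text):
    print(arg)
# Monkey
# "Blue Monkey"
# "Red, blue and \"Green'"
# 'Red, blue and "Green\''
```

The whitespace between arguments is picked up by the third alternative, which is why each match is stripped afterwards rather than inside the pattern.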

The parts of "(?:[^\x5C"]+|\x5C(?:\x5C\x5C)*[\x5C"])*" are:

  1. [^\x5C"]+ matches anything except the backslash and double-quote characters
  2. \x5C(?:\x5C\x5C)*[\x5C"] matches proper escape sequences like \", \\, \\\", \\\\, etc.
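
To check the escape-sequence part on its own, a small Python sketch like the following can be used. ESCAPE and is_escape are names invented for this example, and re.fullmatch is used so that the whole sample has to be one escape sequence:

```python
import re

# Sub-pattern 2 from the list above: an odd number of backslashes,
# followed by one more backslash or a double quote.
ESCAPE = re.compile(r'\x5C(?:\x5C\x5C)*[\x5C"]')


def is_escape(sample):
    """True if the whole sample is a single escape sequence."""
    return ESCAPE.fullmatch(sample) is not None


# Raw strings keep every backslash literal.
for sample in [r'\"', r'\\', r'\\\"', r'\\\\', r'\\"', r'\n']:
    print(sample, is_escape(sample))
# \" True
# \\ True
# \\\" True
# \\\\ True
# \\" False
# \n False
```

The \\" case is rejected as a single sequence because it is an escaped backslash followed by an unescaped quote, which is what lets the outer pattern treat that quote as the end of the string.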
Gumbo