Using a C# regex, I'm trying to match things in quotes that aren't also in brackets, while ignoring any whitespace:

"blah" - match
("blah") - no match
( "blah") - no match
(  "blah") - no match

I've got (unescaped):

"(?<=[^(]\s")(.*?)"

which works with the first three, but I can't work out how to deal with more than one space between the opening bracket and the quote. Putting a + after the \s gives the same result; using a * makes the last two both match. Any ideas?

+1  A: 

In PCRE as I know it, lookbehinds have to be fixed-width. If that remains true in C#'s PCRE engine, then you're not going to be able to do it the way you're trying to.

chaos
No, .NET (the regex engine is not specific to C#) does support variable-length lookbehinds.
Lucero
Ah, lovely then.
chaos
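Since .NET accepts variable-length lookbehinds, the whitespace can go inside the lookbehind itself. A minimal sketch, assuming a hypothetical helper class `LookbehindDemo`; the pattern `(?<!\(\s*)"([^"]*)"` is an illustration rather than something posted in this thread, and it only checks the opening side of the quote:

```csharp
using System;
using System.Text.RegularExpressions;

static class LookbehindDemo
{
    // (?<!\(\s*) rejects a quote preceded by '(' plus any amount of whitespace.
    // The \s* inside the lookbehind relies on .NET's variable-length lookbehind support.
    static readonly Regex Pattern = new Regex(@"(?<!\(\s*)""([^""]*)""");

    // Returns true and the quoted content when a quote outside an opening paren is found.
    public static bool TryExtract(string input, out string content)
    {
        Match m = Pattern.Match(input);
        content = m.Success ? m.Groups[1].Value : "";
        return m.Success;
    }

    static void Main()
    {
        foreach (string s in new[] { "\"blah\"", "(\"blah\")", "( \"blah\")", "(  \"blah\")" })
        {
            bool ok = TryExtract(s, out string content);
            Console.WriteLine(s + " -> " + (ok ? content : "no match"));
        }
    }
}
```

On the four inputs from the question, only the first one should match. Engines that require fixed-width lookbehinds will reject this pattern.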
+2  A: 

This should work:

/(?<![(\s])\s*"([^"]*)"\s*(?![\s)])/
  • The first (?<![(\s]) asserts that there is no whitespace or left parenthesis immediately before the match.

  • Then \s* will match any number of whitespace characters.

  • "([^"]*)" will match a quoted string, and capture its content.

  • \s* will match any number of whitespace characters.

  • Last, (?![\s)]) will assert that there is no whitespace or right-parenthesis following.

Together, the two assertions force each \s* to consume the whole run of whitespace around the quoted string, so a parenthesis on either side of that whitespace prevents the match.
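To try this from C#, drop the `/` delimiters and double the quotes inside a verbatim string. A minimal sketch, assuming a hypothetical helper class `QuoteOutsideParens`; the lookbehind is written with the class `[(\s]`, following the description above (no whitespace or left parenthesis before the match):

```csharp
using System;
using System.Text.RegularExpressions;

static class QuoteOutsideParens
{
    // The lookbehind stops a match from starting right after '(' or in the
    // middle of a run of whitespace; the lookahead does the same at the end
    // for whitespace and ')'.
    static readonly Regex Pattern =
        new Regex(@"(?<![(\s])\s*""([^""]*)""\s*(?![\s)])");

    // Returns true and the quoted content (group 1) when the pattern matches.
    public static bool TryExtract(string input, out string content)
    {
        Match m = Pattern.Match(input);
        content = m.Success ? m.Groups[1].Value : "";
        return m.Success;
    }

    static void Main()
    {
        foreach (string s in new[] { "\"blah\"", "(\"blah\")", "( \"blah\")", "(  \"blah\")" })
        {
            bool ok = TryExtract(s, out string content);
            Console.WriteLine(s + " -> " + (ok ? content : "no match"));
        }
    }
}
```

Only the first of the four inputs from the question should produce a match. Note that the match itself includes the surrounding whitespace; the text between the quotes is in group 1.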

MizardX
Great, thank you
Patrick
+1  A: 

Lookbehinds need a fixed width in many engines (PCRE and Perl, for example; .NET is an exception), but you might be able to get there without one using the expression below. This assumes no nested parentheses.

/\G                 # from the spot of the last match
  (?:               # GROUP OF: 
     [^("]*           # anything but open-paren and double quote.
     [(]              # an open-paren
     [^)]*            # anything but closing-paren
     [)]              # a closing-paren
  )*                # any number of times 
  [^("]*            # anything but open-paren and double quote

  "([^"]*)"         # quote, sequence of anything except quote, then ending quote
/x
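In .NET the `/x` flag corresponds to `RegexOptions.IgnorePatternWhitespace`, and `Regex.Matches` resumes each attempt where the previous match ended, which is what `\G` anchors to. A rough sketch, assuming a hypothetical helper class `SkipParenGroups`; the final skip class is written as `[^("]*` here so that backtracking cannot step into a parenthesized group, which is an adjustment made for this sketch:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class SkipParenGroups
{
    static readonly Regex Pattern = new Regex(@"
        \G                  # from the spot where the last match ended
        (?:                 # group of:
            [^(""]*         #   anything but open-paren and double quote
            [(] [^)]* [)]   #   one parenthesized run, no nesting
        )*                  # any number of times
        [^(""]*             # anything but open-paren and double quote
        ""([^""]*)""        # a quoted string; group 1 is its content
        ", RegexOptions.IgnorePatternWhitespace);

    // Collects the content of every quoted string found outside parentheses.
    public static List<string> QuotedOutsideParens(string input)
    {
        var found = new List<string>();
        foreach (Match m in Pattern.Matches(input))
            found.Add(m.Groups[1].Value);
        return found;
    }

    static void Main()
    {
        // Prints "a, c": the quoted "b" sits inside parentheses and is skipped.
        Console.WriteLine(string.Join(", ", QuotedOutsideParens("\"a\" (\"b\") \"c\"")));
    }
}
```

Because every match has to start exactly where the last one ended, an unbalanced opening parenthesis stops the scan at that point instead of being skipped.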
Axeman
If there is any closing paren within quotes, it will still be considered a closing paren.
MizardX