/^"((?:[^"]|\\.)*)"/
Against this string:
"quote\_with\\escaped\"characters" more
It only matches until the \", although I've clearly defined \ as an escape character (and it matches \_ and \\ fine...).
It works correctly if you flip the order of your two alternatives:
/^"((?:\\.|[^"])*)"/
The problem is that otherwise the important \ character gets eaten up by [^"] before the regex ever tries matching \" as a pair. It worked before for \\ and \_ only because each character in those pairs is matched individually by your [^"].
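The effect of the alternative order can be checked with a short sketch. It uses Python's re module purely as a convenient test bench (the question does not say which engine is in use), and the names `original` and `flipped` are just labels for the two patterns quoted above.

```python
import re

# The sample string from the question; a raw literal keeps the backslashes as typed.
s = r'"quote\_with\\escaped\"characters" more'

original = re.compile(r'^"((?:[^"]|\\.)*)"')  # [^"] is tried first
flipped = re.compile(r'^"((?:\\.|[^"])*)"')   # \\. is tried first

# With [^"] first, the backslash before the inner quote is consumed on its own,
# so the inner quote is taken as the closing delimiter.
print(original.match(s).group())

# With \\. first, each backslash is consumed together with the character after it.
print(flipped.match(s).group())
```

The first print stops right after the escaped quote, while the second returns the whole quoted string.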
Using Python with raw-string literals to ensure no further interpretation of escape sequences is taking place, the following variant does work:
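import re
# Exclude the backslash from the character class so only \\. can consume it.
x = re.compile(r'^"((?:[^"\\]|\\.)*)"')
# Raw string: the backslashes reach the regex engine exactly as typed.
s = r'"quote\_with\\escaped\"characters" more"'
mo = x.match(s)
print(mo.group())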
This emits "quote\_with\\escaped\"characters". I believe that in your version (which also cuts the match short if substituted in here) the "not a doublequote" subexpression ([^"]) is swallowing the backslashes that you intend to be taken as escaping the immediately-following characters. All I'm doing here is ensuring that such backslashes are NOT swallowed in this way, and, as I said, it seems to work with this change.
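One place where keeping the backslash out of the character class seems to matter is input whose closing quote is itself escaped. The sketch below is an illustration under that assumption: the input `bad` and the names `flipped` and `strict` are made up for the example, and the behaviour shown is that of Python's backtracking re engine.

```python
import re

flipped = re.compile(r'^"((?:\\.|[^"])*)"')     # alternatives reordered only
strict = re.compile(r'^"((?:[^"\\]|\\.)*)"')    # backslash excluded from the class

# Hypothetical malformed input: the final quote is escaped, so nothing closes the string.
bad = r'"abc\"'

# When \\. leads to a dead end, the engine backtracks and lets [^"] take the
# backslash, so the reordered pattern still reports a match.
print(flipped.match(bad))

# Here no alternative can consume a lone backslash, so the match fails.
print(strict.match(bad))
```

On well-formed input such as the string from the question, both patterns return the same match.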
I don't mean to confuse matters; this is just another approach I've played around with. The regexp below (PCRE) tries not to match malformed input (e.g. a string that ends with \") and works with either ' or " as the delimiter:
/('|").*\\\1.*?[^\\]\1/
To use it with PHP:
<?php if (preg_match('/(\'|").*\\\\\1.*?[^\\\\]\1/', $subject)) return true; ?>
Given these inputs:
"quote\_with\\escaped\"characters" "aaa"
'just \'another\' quote "example\"'
"Wrong syntax \"
"No escapes, no match here"
it only matches these two:
"quote\_with\\escaped\"characters"
'just \'another\' quote "example\"'
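The four sample lines can be run through the same pattern as a quick check. This sketch ports the pattern to Python's re module, which supports the same backreference syntax; the port and the names `pattern` and `samples` are assumptions made for illustration, not part of the PHP snippet.

```python
import re

# Port of the PCRE pattern; \1 refers back to whichever quote opened the string.
pattern = re.compile(r'''('|").*\\\1.*?[^\\]\1''')

samples = [
    r'''"quote\_with\\escaped\"characters" "aaa"''',
    r"""'just \'another\' quote "example\"'""",
    r'''"Wrong syntax \"''',
    r'''"No escapes, no match here"''',
]

for text in samples:
    m = pattern.search(text)
    # Print the matched substring, or None when the pattern does not apply.
    print(repr(text), '->', m.group() if m else None)
```

Because the pattern contains `\\\1`, it requires at least one escaped quote inside the string, which is why the last sample is not matched.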