The following Groovy commands illustrate my problem.
First of all, this works as expected (as seen on lotrepls.appspot.com); note that \u0061 is 'a'.
>>> print "a".matches(/\u0061/)
true
Now let's say that we want to match \n, using the Unicode escape \u000A. The following, using "pattern" as a string, behaves as expected:
>>> print "\n".matches("\u000A");
Interpreter exception: com.google.lotrepls.shared.InterpreterException:
org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed,
Script1.groovy: 1: expecting anything but ''\n''; got it anyway
@ line 1, column 21. 1 error
This is expected because in Java at least, Unicode escapes are processed early (JLS 3.3), so:
print "\n".matches("\u000A")
really is the same as:
print "\n".matches("
")
The fix is to escape the Unicode escape, and let the regex engine process it, as follows:
>>> print "\n".matches("\\u000A")
true
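To spell out why the doubled backslash works, here is a minimal sketch in plain Java rather than Groovy, assuming both treat this string literal the same way. The class and method names are made up for illustration. The compiler leaves the doubled backslash alone, so the string holds six ordinary characters, and only the regex engine turns them into a line feed.

```java
import java.util.regex.Pattern;

class UnicodeEscapeDemo {
    // Written with a doubled backslash, so the compiler does not translate it.
    // The string holds six characters: backslash, u, 0, 0, 0, A.
    static final String ESCAPED = "\\u000A";

    static boolean matchesNewline() {
        // The regex engine, not the compiler, reads the six characters as a line feed.
        return Pattern.matches(ESCAPED, "\n");
    }

    static boolean matchesItselfLiterally() {
        // As a pattern the text means "one line feed", so it does not match
        // its own six-character spelling.
        return Pattern.matches(ESCAPED, ESCAPED);
    }

    public static void main(String[] args) {
        System.out.println(ESCAPED.length());          // 6
        System.out.println(matchesNewline());          // true
        System.out.println(matchesItselfLiterally());  // false
    }
}
```

So what I need from the /pattern/ syntax is the same thing: the six characters have to reach the regex engine untouched.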
Now here's the question part: how can we get this to work with the Groovy /pattern/ syntax instead of a string literal?
Here are some failed attempts:
>>> print "\n".matches(/\u000A/)
Interpreter exception: com.google.lotrepls.shared.InterpreterException:
org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed,
Script1.groovy: 1: expecting EOF, found '(' @ line 1, column 19.
1 error
>>> print "\n".matches(/\\u000A/)
false
>>> print "\\u000A".matches(/\\u000A/);
true