I have two token types in my lexer defined like this:
    NUMBERVALUE
        : ( '0' .. '9' )+ ( '.' ( '0' .. '9' )+ )?
        ;

    DATEVALUE
        : ( '0' .. '9' ) ( '0' .. '9' ) ( '0' .. '9' ) ( '0' .. '9' ) '-'
          ( '0' .. '9' ) ( '0' .. '9' ) '-'
          ( '0' .. '9' ) ( '0' .. '9' )
        | ( '0' .. '9' ) ( '0' .. '9' ) '-'
          ( '0' .. '9' ) ( '0' .. '9' ) '-'
          ( '0' .. '9' ) ( '0' .. '9' )
        ;
Since a date must contain a hyphen within its first five characters, I would have thought that setting k=5 in the lexer options would be enough for the lexer to always tell the two apart. However, I get this warning when I run ANTLR:
warning:lexical nondeterminism between rules NUMBERVALUE and DATEVALUE upon
k==1:'0'..'9'
k==2:'0'..'9'
k==3:'0'..'9'
k==4:'0'..'9'
k==5:'0'..'9'
The parser also fails to recognise numbers with more than four digits. How do I resolve the lexical ambiguity?
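
To make my expectation concrete, here is a minimal Python sketch (not ANTLR; the name `classify` is made up purely for illustration) of the decision I assumed a lexer with k=5 could make. It looks only at the first five characters and checks where the hyphen falls:

```python
import re


def classify(text: str) -> str:
    """Hypothetical helper: guess the token type from the first five characters."""
    head = text[:5]  # mirrors a lookahead depth of k=5
    # A yyyy-mm-dd date has its hyphen at index 4; a yy-mm-dd date has it at index 2.
    if re.match(r"\d{4}-", head) or re.match(r"\d{2}-", head):
        return "DATEVALUE"
    # Anything else that starts with a digit would have to be a number.
    if re.match(r"\d", head):
        return "NUMBERVALUE"
    return "UNKNOWN"
```

With this, `12345` comes out as a number and `2024-01-15` as a date, which is the behaviour I expected from the generated lexer.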