To get around this issue I created a module that keeps track of the last token returned by the lexer and checks it against a list of tokens that cannot precede a regex, to decide whether a "/" is a division operator or the start of a regex literal.
The code is below:
// Tracks the most recent token returned by the lexer; EOF doubles as the
// initial "no token yet" state.
let mutable lastToken : token = EOF

// Record a token as the last one seen and return it unchanged, so lexer
// actions can wrap their result in a single call.
let setToken token =
    lastToken <- token
    token

// Decide whether "/" is division or the start of a regex literal, based on
// the case name of the token that preceded it.
let parseDivision (lexbuf : Lexing.lexbuf) (tokenizer : Lexing.LexBuffer<'a> -> JavascriptParser.token) regexer =
    match lastToken.GetType().Name with
    | x when invalidRegexPrefix |> List.contains x -> DIVIDE
    | _ ->
        let result = regexer lexbuf.StartPos "" lexbuf
        REGEX(result)
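parseDivision assumes an invalidRegexPrefix list is already in scope: the names of the token cases after which a "/" can only mean division (identifiers, literals, closing brackets, and so on). A minimal sketch with placeholder case names; the real list depends on your token type:

// Placeholder case names: substitute those of your own token type. In the
// module this definition sits above parseDivision, since F# requires
// definition before use.
let invalidRegexPrefix =
    [ "IDENT"; "NUMBER"; "STRING"; "RPAREN"; "RBRACKET" ]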
Then, inside the lexer, I call setToken on the result of each rule. For example:
| '(' { setToken LPAREN }
setToken both records the last token and returns the token that has just been set; returning it lets each rule wrap its existing result, keeping the change minimally intrusive in the actual lexer code.
The actual rule for the "/" character is:
| "/" { setToken (parseDivision lexbuf token regex) }
One also needs to reset lastToken to EOF once parsing is complete, or the next run may start in an inconsistent state (since the last token is, in effect, a static variable).
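A sketch of a driver that guarantees the reset, with JavascriptLexer.token and JavascriptParser.program standing in for the generated lexer rule and parser entry point:

// Hypothetical entry point: wrap the parse in try/finally so lastToken is
// restored even when parsing throws.
let parse (source : string) =
    let lexbuf = Lexing.LexBuffer<char>.FromString source
    try
        JavascriptParser.program JavascriptLexer.token lexbuf
    finally
        lastToken <- EOF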