To get around this issue I created a module that keeps track of the last token returned by the lexer and checks it against a list of tokens that cannot precede a regex, to decide whether a "/" is a division operator or the start of a regex literal.
The code is below:
// Tracks the most recent token returned by the lexer; EOF doubles as the
// initial "no token yet" state.
let mutable lastToken : token = EOF

// Record a token as the last one seen and return it unchanged, so lexer
// actions can wrap their result in a single call.
let setToken token =
    lastToken <- token
    token

// Decide whether "/" is division or the start of a regex literal, based on
// the case name of the token that preceded it.
let parseDivision (lexbuf : Lexing.lexbuf) (tokenizer : Lexing.LexBuffer<'a> -> JavascriptParser.token) regexer =
    match lastToken.GetType().Name with
    | x when invalidRegexPrefix |> List.contains x -> DIVIDE
    | _ ->
        let result = regexer lexbuf.StartPos "" lexbuf
        REGEX(result)
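parseDivision assumes an invalidRegexPrefix list is already in scope: the names of the token cases after which a "/" can only mean division (identifiers, literals, closing brackets, and so on). A minimal sketch with placeholder case names; the real list depends on your token type:

// Placeholder case names: substitute those of your own token type. In the
// module this definition sits above parseDivision, since F# requires
// definition before use.
let invalidRegexPrefix =
    [ "IDENT"; "NUMBER"; "STRING"; "RPAREN"; "RBRACKET" ]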
Then, inside the lexer, I call setToken on the result of each rule. For example:
| '(' { setToken LPAREN }
setToken both records the last token and returns the token that has just been set; returning it lets each rule wrap its existing result, keeping the change minimally intrusive in the actual lexer code.
The actual rule for the "/" character is:
| "/" { setToken (parseDivision lexbuf token regex) }
One also needs to reset lastToken to EOF once parsing is complete, or the next run may start in an inconsistent state (since the last token is, in effect, a static variable).
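A sketch of a driver that guarantees the reset, with JavascriptLexer.token and JavascriptParser.program standing in for the generated lexer rule and parser entry point:

// Hypothetical entry point: wrap the parse in try/finally so lastToken is
// restored even when parsing throws.
let parse (source : string) =
    let lexbuf = Lexing.LexBuffer<char>.FromString source
    try
        JavascriptParser.program JavascriptLexer.token lexbuf
    finally
        lastToken <- EOF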