I am attempting to lex JavaScript regular expression literals. These start with a "/" and end with a "/" (sometimes followed by modifier flags). The issue is that the only way to determine whether a "/" starts a regular expression rather than being a division operator is to look at the tokens that precede it.

One can read a little more on this here.

As it is, I can't find any documentation on how to get the previous token. Hopefully this is possible and someone can tell me how.

Thanks.

+1  A: 

Hi,

As far as I know, there is no way to get the previous token (though I haven't tried, and it has been quite some time since I used FsLex). I guess you could keep a parameter holding the last processed token and then use it to decide what to do when you reach the "/" character.

Anyway, could you post some sample code that you currently have (e.g. just the part that deals with this problem)? It would be a lot easier to answer your question if we could see some sample code (and if I could try pasting it into my Visual Studio and see if I can figure something out!)

T.

Tomas Petricek
+1  A: 

To get around this issue I created a module that keeps track of the last token and checks it against a list of tokens that cannot precede a regex, to decide whether the "/" is a division operator or the start of a regex.

The code is below:

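// Most recent token returned by the lexer; EOF doubles as the "nothing read yet" state.
let mutable lastToken:token = EOF

// Record the token and hand it back, so a rule action can wrap its result in a single call.
let setToken token =
    lastToken <- token
    token

// Decide whether "/" is a division operator or the start of a regex literal.
// invalidRegexPrefix (defined elsewhere) holds the names of tokens that cannot precede a regex.
let parseDivision (lexbuf:Lexing.lexbuf) (tokenizer:Lexing.LexBuffer<'a> -> JavascriptParser.token) regexer =
    match lastToken.GetType().Name with
    | x when invalidRegexPrefix |> List.contains x -> DIVIDE
    | _ ->
        // Otherwise let the regex rule consume the rest of the literal.
        let result = regexer lexbuf.StartPos "" lexbuf
        REGEX(result)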
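
The decision itself can also be sketched with a plain pattern match instead of comparing type names, which avoids relying on how union cases happen to be represented at runtime. This is only an illustration under assumptions: the Token type, endsExpression and classifySlash below are hypothetical stand-ins, not the real JavascriptParser.token or anything generated by FsLex, and the set of "expression-ending" tokens is a rule of thumb rather than a complete answer (for example, a ")" that closes an if condition can legally be followed by a regex).

```fsharp
// Hypothetical token type standing in for JavascriptParser.token.
type Token =
    | IDENT of string
    | NUMBER of float
    | LPAREN
    | RPAREN
    | RBRACKET
    | ASSIGN
    | DIVIDE
    | REGEX of string
    | EOF

// Assumed rule of thumb: after a token that can end an expression,
// a "/" is treated as division; after anything else it starts a regex.
let endsExpression token =
    match token with
    | IDENT _ | NUMBER _ | RPAREN | RBRACKET -> true
    | _ -> false

// Classify a "/" given the previous token. readRegex stands in for the
// lexer rule that consumes the regex body; it is only called for a regex.
let classifySlash (lastToken: Token) (readRegex: unit -> string) : Token =
    if endsExpression lastToken then DIVIDE
    else REGEX (readRegex ())
```

With this shape, adding a new token to the grammar means touching one match expression rather than a separate list of names.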

And then inside the lexer I call setToken on the result of the rule. For example:

| '(' { setToken LPAREN }

setToken both records the last token and returns the token that was just recorded. Returning it means a rule action only needs to be wrapped in a single call, which keeps the change less intrusive on the actual lexer code.

The actual rule for the "/" character is:

| "/"   { setToken (parseDivision lexbuf token regex) }

One also needs to reset lastToken to EOF once parsing has completed; otherwise the next parse may start in an inconsistent state, since lastToken is module-level mutable state that persists between parses.
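
One way to make that reset hard to forget is to wrap each parse in a helper that restores the state in a finally block. The following is a minimal sketch, not the code I actually use: the small Token type is a hypothetical stand-in for the generated one, and withFreshLexerState is an invented name.

```fsharp
// Hypothetical stand-in for the generated token type.
type Token = LPAREN | DIVIDE | EOF

// Same module-level state and helper as in the answer above.
let mutable lastToken = EOF

let setToken token =
    lastToken <- token
    token

// Run one parse with a clean lastToken and reset it afterwards,
// even if the parser raises an exception part-way through.
let withFreshLexerState (parse: unit -> 'a) : 'a =
    lastToken <- EOF
    try
        parse ()
    finally
        lastToken <- EOF

// Usage: the lambda stands in for the real call into the generated parser.
let result =
    withFreshLexerState (fun () ->
        setToken LPAREN |> ignore
        42)
```

Note that this only tidies up the reset. Because the state is still shared, two parses running at the same time would still interfere with each other.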

justin