I'm learning F# because I'd like to write a lexer and parser. I have a tiny bit of experience with this sort of processing, but I really need to learn it properly, along with F# itself.

When learning the lexing/parsing functionality of F#, is studying lex and yacc sufficient?

Or are there differences that mean code written for lex/yacc will not work with fslex and fsyacc?

+2  A: 

Well, with lex and yacc, you put C/C++ code in the 'actions', whereas with fslex and fsyacc you put F# code there, but I presume you know this?

I think they are otherwise based on the same (established/ancient) tokenizing and parsing technologies, so the general structure/behavior of the grammar should be similar, if that's what you're after...
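
As a rough illustration of that difference, an fslex specification might look like the sketch below. It is only a sketch: the token names INT, PLUS and EOF and the module name Parser are hypothetical (they would come from your own fsyacc grammar), and the namespace shown is the one used by the FsLexYacc package, which has differed between releases.

```fsharp
{
// Header: plain F# copied into the generated lexer.
// Assumes the FsLexYacc runtime and a hypothetical fsyacc-generated
// module `Parser` that declares the tokens INT, PLUS and EOF.
open FSharp.Text.Lexing
open Parser
}

let digit = ['0'-'9']
let whitespace = [' ' '\t']

rule token = parse
| whitespace  { token lexbuf }
| digit+      { INT (int (LexBuffer<_>.LexemeString lexbuf)) }
| '+'         { PLUS }
| eof         { EOF }
```

The regular-expression side reads much like a lex file. The part in braces is where the two diverge: in lex the action for digit+ would be C, something like { yylval = atoi(yytext); return INT; }, whereas here it is an F# expression that evaluates to the token.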

Brian
I'm still learning both F# and lex/yacc. The nature of functional programming seems to make it harder to debug so I don't want to run into weird behaviour because I'm using the wrong syntax! :-)
Alex Angas
+4  A: 

I personally found these OcamlLex and OcamlYacc tutorials excellent resources for getting started: they are easy to follow, and you can translate almost everything in them to FsLex/FsYacc nearly verbatim.

Juliet
A: 

I have designed a language for generating reports, like this:

The person $1 has age $2 |=> $1 = Name, $2 = Age

Feeding this line to the parser with an object of type:

class person { public String Name = "Joe"; public int Age = 23; }

Will generate the line: "The person Joe has age 23" for me. Of course the language has some extra features:

  • padding:

    The item between << and >> will be 5 characters long: <> |=> $1 = InputString

  • repeating text horizontally:

    $1....$2.... |=> OrderLines[$1 = Quantity, $2 = Price]

  • repeating text vertically:

    $1....$2.... |=> ColumnGroups![$1 = Column1, $2 = Column2]

  • formatting:

    fillcoefficient $1 at Date $2 |=> $1 = fillcoefficient(000.00%), $2 = Date(ddmmyyyy)

It contains other features as well, but the point of the language is now illustrated.
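
To make the basic case concrete (a template, then |=>, then $n = Field bindings), here is a minimal F# sketch of the substitution. It is not the actual implementation: parseLine, render and lookup are hypothetical names, the data object is stood in for by a Map of strings, and none of the extra features (padding, repetition, formatting) are handled.

```fsharp
open System
open System.Text.RegularExpressions

// Hypothetical helper: splits "template |=> $1 = Name, $2 = Age" into the
// template text and a map from placeholder ("$1") to field name ("Name").
let parseLine (line: string) =
    match line.Split([| "|=>" |], StringSplitOptions.None) with
    | [| template; bindings |] ->
        let fields =
            bindings.Split(',')
            |> Array.map (fun b ->
                match b.Split('=') with
                | [| key; field |] -> key.Trim(), field.Trim()
                | _ -> failwithf "malformed binding: %s" b)
            |> Map.ofArray
        template.TrimEnd(), fields
    | _ -> failwithf "expected exactly one '|=>' in: %s" line

// Replaces every $n in the template with the value of its bound field.
// `lookup` is an assumption standing in for reading a member of the object.
let render (lookup: string -> string) (line: string) =
    let template, fields = parseLine line
    let substitute (m: Match) =
        match Map.tryFind m.Value fields with
        | Some field -> lookup field
        | None -> m.Value // leave unbound placeholders untouched
    Regex.Replace(template, @"\$\d+", MatchEvaluator(substitute))

// Usage with the person example from above.
let person = Map.ofList [ "Name", "Joe"; "Age", "23" ]
let output =
    render (fun field -> Map.find field person)
           "The person $1 has age $2 |=> $1 = Name, $2 = Age"
printfn "%s" output
```

Splitting on commas and equals signs is enough for this one case only. Bindings such as OrderLines[$1 = Quantity, $2 = Price] nest commas inside brackets, which is where a real lexer and parser become necessary.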

When trying to parse this, consider the string 000.00%. Is this a token? And if so, what kind? Considering:

000.00% $1 => $1 = Number(000.00%)

is a valid line.

How would I parse this language using fslex? Any ideas?

NIck
-1: This is not a discussion board, create a new question for this and don't hijack other, unrelated questions.
dbemerlin