ansaurus

Question

What is an appropriate data structure or algorithm for producing an immutable concrete syntax tree in a functionally pure manner?

Answer 1

A:

Eric Lippert's blog series on immutable binary trees may be helpful. Obviously, you need a tree which is not binary, but it will give you the general idea.

Craig Stuntz 2010-08-05 14:11:15

Answer 2

+9 A:

You want to parse something into an abstract syntax tree.

In the purely functional programming language Haskell, you can use parser combinators to express your grammar. Here an example that parses a tiny expression language:

EDIT Use monadic style to match Graham Hutton's book

    -- import a library of *parser combinators*
import Parsimony
import Parsimony.Char
import Parsimony.Error
(+++) = (<|>)

    -- abstract syntax tree
data Expr = I Int
          | Add Expr Expr
          | Mul Expr Expr
          deriving (Eq,Show)

    -- parse an expression
parseExpr :: String -> Either ParseError Expr
parseExpr = Parsimony.parse pExpr
    where

        -- grammar
    pExpr :: Parser String Expr
    pExpr = do
        a <- pNumber +++ parentheses pExpr  -- first argument
        do
            f <- pOp                        -- operation symbol
            b <- pExpr                      -- second argument
            return (f a b)
         +++ return a                       -- or just the first argument

    parentheses parser = do                 -- parse inside parentheses
        string "("
        x <- parser
        string ")"
        return x

    pNumber = do                            -- parse an integer
        digits <- many1 digit
        return . I . read $ digits

    pOp =                                   -- parse an operation symbol
         do string "+"
            return Add
        +++ 
         do string "*"
            return Mul

Here an example run:

*Main> parseExpr "(5*3)+1"
Right (Add (Mul (I 5) (I 3)) (I 1))

To learn more about parser combinators, see for example chapter 8 of Graham Hutton's book "Programming in Haskell" or chapter 16 of "Real World Haskell".

Many parser combinator library can be used with different token types, as you intend to do. Token streams are usually represented as lists of tokens [Token].

Heinrich Apfelmus 2010-08-05 16:58:25

I have to admit, this is rough for me to understand. In my mind I am imagining a state machine passing along continuations but it will definitely take some time for me to understand this.

ChaosPandion 2010-08-06 00:02:15

The key about here is that you don't need to implement a state machine or something anymore, it's already done for you in the library of parser combinators an its `Parser tokens a` type. You can just specify the grammar abstractly and let the library figure out what to do with it.

Heinrich Apfelmus 2010-08-06 08:31:45

@Heinrich - There are two ways to understand a topic (grammars/parser/compiler in this case). The first way is to simply be smart enough to understand it with little effort (This is how I worked before I decided to learn advanced computer science.). Once I found out I am not that smart, I decided that the only way I will be able to learn this is through the second method. The second method is simple dedication. Since then I have written 4 ECMAScript parsers, each getting progressively better as I learned new techniques. However they have all been imperative.

ChaosPandion 2010-08-06 16:10:42

@Heinrich - If I can complete a functionally pure parser from start to finish I will look at that as the capstone of my learning experience. After that I will begin a detailed study of the tools and libraries available to me. If I jump straight into libraries I feel like I may be cheating myself.

ChaosPandion 2010-08-06 16:12:15

Your outstanding dedication to learning is best complemented by a good book. :-) In your case, I strongly recommend getting Graham Hutton's book, perhaps from a library or electronically. It introduces general tree types in Haskell and also features a very clear explanation of parser combinators and their implementation. There are video lectures based on the book, too, check out its website. I'll change my code (I'm using an additional "complication") to match the style in the book.

Heinrich Apfelmus 2010-08-06 16:35:32

Man I was hoping for a few more answers but it looks like you are the winner. *(You did convince me to learn Haskell so that makes sense.)*Have you looked at the Leksah IDE? It is only version 0.8 but I really like it.

ChaosPandion 2010-08-17 15:20:20

Thanks. :-) By the way, the [email protected] and [email protected] mailing lists as well as the #haskell IRC channel are great resources to learn Haskell, don't hesitate to ask questions there, too. Concerning Lekash, I haven't used it due to compilation problems with Gtk+ on MacOS X. Most people, including me, use a standard text editor for Haskell; works like a charm for me.

Heinrich Apfelmus 2010-08-19 12:05:05

Answer 3

+2 A:

Definitely check out the monadic parser combinator approach; I've blogged about it in C# and in F#.

Brian 2010-08-05 17:04:08

ansaurus

tags:

views:

answers:

What is an appropriate data structure or algorithm for producing an immutable concrete syntax tree in a functionally pure manner?

related questions