I am writing a small, really simple Lisp parser in Ruby with the Treetop gem, just to experiment with it. However, it is not working the way I want it to, and the documentation is pretty poor, so it's hard to understand what I am doing wrong. Currently, the grammar can match both a symbol and a boolean, but not a number. When I switch the order in the atom rule, for example to bool / number / symbol, it still matches the first two, but not the last one. Is there a limitation in the Treetop gem that means you can only have two options in a rule? Also, something like '(3)' still does not parse.

My grammar is as follows:

grammar Lisp
 rule expression
   atom / list
 end

 rule atom
   symbol / bool / number
 end

 rule number
   [0-9]*
 end

 rule bool
   'T' / 'F'
 end

 rule symbol
   [a-zA-Z]*
 end

 rule list
   '(' expression* ')'
 end
end

I am testing it as they showed in the tutorial, with:

parser = LispParser.new
if parser.parse('T')
  puts "Success"
else
  puts "Fail"
end
+1  A: 

The way you defined the rules number and symbol, they always match (because * means "zero or more", and you can always find zero of something). This means that if you try to parse "42", the parser first successfully matches the rule symbol against the empty string at the beginning and then expects no further input, so the parse fails.

To fix this, simply replace * with + in both rules.
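As a rough illustration of why * succeeds on the empty string, here is a small sketch that uses Ruby's built-in Regexp as a stand-in for Treetop's character classes. It is an analogy only, not Treetop itself, and the helper name consumed_prefix is made up for this example.

```ruby
# Analogy only: Ruby's Regexp stands in for a Treetop character class.
# consumed_prefix is a hypothetical helper, not part of Treetop.
# It returns the text matched at the start of the input, or nil on failure.
def consumed_prefix(pattern, input)
  match = pattern.match(input)
  match && match[0]
end

# "Zero or more letters" succeeds on "42" without consuming anything.
puts consumed_prefix(/\A[a-zA-Z]*/, "42").inspect

# "One or more letters" fails on "42", so the next alternative gets a chance.
puts consumed_prefix(/\A[a-zA-Z]+/, "42").inspect

# "One or more digits" consumes the whole number.
puts consumed_prefix(/\A[0-9]+/, "42").inspect
```

The first call returns an empty string, which counts as a successful match. That is why the ordered choice in atom stops at symbol and never tries number.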

sepp2k
Thank you, that works perfectly now.
bennybdbc
@bennybdbc: Also, the way the `atom` rule is defined now, it will never match a boolean, because `T` and `F` also match the character class in the `symbol` rule, so they will always be parsed as a `symbol`.
Jörg W Mittag
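To make the ordering point concrete, here is a minimal sketch that emulates ordered choice in plain Ruby. It is a simplification: it matches whole tokens with anchored regexes, which a real PEG parser does not do, and the names first_match, symbol_first and bool_first are invented for this example.

```ruby
# Hypothetical emulation of ordered choice: try each alternative in order
# and return the name of the first one that matches the whole token.
def first_match(alternatives, token)
  alternatives.each do |name, pattern|
    return name if pattern.match?(token)
  end
  nil
end

# Mirrors "symbol / bool / number" (with + instead of *).
def symbol_first
  { symbol: /\A[a-zA-Z]+\z/, bool: /\A[TF]\z/, number: /\A[0-9]+\z/ }
end

# Mirrors "bool / symbol / number".
def bool_first
  { bool: /\A[TF]\z/, symbol: /\A[a-zA-Z]+\z/, number: /\A[0-9]+\z/ }
end

puts first_match(symbol_first, "T").inspect  # symbol wins, bool is unreachable
puts first_match(bool_first, "T").inspect    # bool is tried first
puts first_match(bool_first, "foo").inspect  # falls through to symbol
```

Because this sketch only looks at whole tokens, it hides one wrinkle: in an actual grammar with bool first, a name that merely starts with T or F would need extra care so that its first letter is not taken as a boolean.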
@Jorg - Yes, thanks, I noticed that earlier. I changed the code in my program; I just didn't think to edit the question.
bennybdbc