I want to parse a continuous stream of bytes (from a socket) with a state machine using Ragel.

However, all the examples I have found either parse a complete file in one pass (like the Gherkin lexer) or use Ragel's C target (like the Mongrel HTTP/1.1 parser).

I'm looking for advice or examples on how to instantiate a Ragel state machine and then feed bytes to it incrementally, keeping the existing state intact between calls.

The final interface I am looking for is something like:

parser = MyStreamParser.new(Grammar)
parser.on_token { |t| puts t.inspect }

# I can't parse lines separately because tokens can span multiple lines.
$stdin.each_line do |line|
  parser.add(line)
end

Any advice on how to do that in Ragel is greatly appreciated. I'd rather use that than code another state machine by hand.

Maybe Ragel is not the right tool? If not, what should I use instead?

A: 

It may not be exactly what you are looking for, but Dhaka is another decent parser generator to take a look at. I'm not sure that will help, but it has served me well in the past.

David Hollman
+1  A: 

At first glance, Ragel doesn't look very Ruby-like. Have you taken a look at Statemachine? It looks like you can feed the state machine events (characters, in your problem) one at a time.
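
To illustrate what "one character at a time" could look like, here is a minimal sketch in plain Ruby. It uses neither the Statemachine gem's API nor Ragel; the class name StreamTokenizer, its methods, and the toy grammar (whitespace-separated words plus double-quoted strings) are all made up for the example. The point is only that the current state and the partial token live in instance variables, so a token can start in one chunk and finish in a later one.

```ruby
# Hypothetical character-at-a-time tokenizer; all names are illustrative only.
class StreamTokenizer
  def initialize
    @state = :idle    # current state survives between calls to #add
    @buffer = +""     # text of the token currently being built
    @callbacks = []
  end

  # Register a block that is called with each completed token.
  def on_token(&block)
    @callbacks << block
    self
  end

  # Feed a chunk of any size; chunk boundaries have no meaning to the machine.
  def add(chunk)
    chunk.each_char { |c| step(c) }
    self
  end

  # Signal end of input so that a trailing word is flushed.
  def finish
    emit(:word) if @state == :word
    raise "unterminated string" if @state == :string
    self
  end

  private

  # One transition of the state machine for a single input character.
  def step(c)
    case @state
    when :idle
      if c == '"'
        @state = :string
      elsif c !~ /\s/
        @state = :word
        @buffer << c
      end
    when :word
      if c =~ /\s/
        emit(:word)
      elsif c == '"'
        emit(:word)
        @state = :string
      else
        @buffer << c
      end
    when :string
      if c == '"'
        emit(:string)
      else
        @buffer << c   # newlines are kept, so strings may span lines
      end
    end
  end

  # Hand the finished token to every callback and return to the idle state.
  def emit(type)
    token = [type, @buffer]
    @buffer = +""
    @state = :idle
    @callbacks.each { |cb| cb.call(token) }
  end
end

# Usage: the quoted string is split across three chunks and two lines.
tokens = []
parser = StreamTokenizer.new
parser.on_token { |t| tokens << t }
['say "hel', "lo\nwor", 'ld" done'].each { |chunk| parser.add(chunk) }
parser.finish
```

This only covers the "keep state between calls" part of the question. It has no lookahead, priorities, or semantic conditions, which is where a generator like Ragel would earn its keep.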

Wayne Conrad
Oh, yeah. Statemachine rocks. I'm replacing one of work's hand-made state engines with Statemachine right now.
Wayne Conrad
Statemachine is nice for what it does, but I believe it doesn't have the features I'd like. As far as I can tell, I would still need a lexer to convert my byte stream into tokens, and it can't do things like lookahead, priorities, or semantic conditions unless I build them myself.
levinalex
Fair enough. Sorry it wasn't helpful to you. But thanks for asking a cool question.
Wayne Conrad