I have as input blocks of text with commands and arguments, one per line, such as

XYZ ARG1,ARG2,ARG3,...,ARGN

I want to verify that the arguments to XYZ are well formed for that particular command and, if they are, execute the correct block of code. There are roughly 100 commands, some of which take a variable number of arguments, and some of which depend on each other (e.g. if command XYZ is called, then command ABC must be called as well).

Also commands exist such as:

COMMAND
XYZ ARG1
BEF ARG1 ARG2
ENDCOMMAND

It is important that the text is contained within COMMAND and ENDCOMMAND.

Typically for something like this I would use Lex and Yacc rather than regexes, but is there anything more modern? The code is written in C#. Is there anything in MSDN that does this, rather than old-school C Lex and Yacc?

+1  A: 

You have a bigger problem than "age", in that I'm not sure any of the big well-known C-ish compiler-compilers are going to work with C#. The same goes for Boost's newfangled parsing templates.

You are probably going to have to go with something esoteric like Grammatica or Spart (to pick my top two Google hits).

EDIT: After a bit more looking, it appears that ANTLR has support for C#. ANTLR is very well known, and much newer than LEX/YACC, so I'd suggest checking it out.

T.E.D.
Hmm, I think you are incorrect. I would just have my C parsing library, a C# wrapper library, and my C# application.
Does that not work? Thanks for the reply, but please tell me why this won't work. Thanks again.
Ah yes, you could indeed write the parser in C and wrap it with C#, assuming you can find a Visual Studio-compatible version of LEX/YACC. The prebuilt binaries from GNU generally use the GNU library formats and require the GNU linker (ld).
T.E.D.
A: 

There is nothing built into the .NET Framework for this, if that is what you mean.

At first glance your command structure looks relatively simple, so manual parsing would be well suited here, and it is almost always the fastest solution. It would also allow you to check the actual values of the command arguments for correctness, not just their syntactic validity.
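
A minimal sketch of what such manual parsing could look like, written in Python for brevity (the same table-driven structure carries over to C#). The command table, the argument limits and the XYZ-requires-ABC rule are assumptions made up for illustration, based only on the examples in the question:

```python
# Hypothetical command table: name -> (min_args, max_args); None means "no upper limit".
COMMAND_SPECS = {
    "XYZ": (1, None),
    "ABC": (0, 0),
    "BEF": (2, 2),
}

# Hypothetical relationship rule: if the key was called, the value must be called too.
REQUIRES = {"XYZ": "ABC"}


def parse_line(line):
    """Split 'NAME ARG1,ARG2,...' into (name, [args])."""
    name, _, rest = line.strip().partition(" ")
    args = [a.strip() for a in rest.split(",")] if rest.strip() else []
    return name, args


def validate(lines):
    """Return a list of error messages; an empty list means the input is well formed."""
    errors = []
    seen = set()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        name, args = parse_line(line)
        spec = COMMAND_SPECS.get(name)
        if spec is None:
            errors.append(f"line {number}: unknown command {name}")
            continue
        low, high = spec
        if len(args) < low or (high is not None and len(args) > high):
            errors.append(f"line {number}: {name} got {len(args)} argument(s)")
        seen.add(name)
    # Cross-command relationships are checked once the whole input has been read.
    for name, needed in REQUIRES.items():
        if name in seen and needed not in seen:
            errors.append(f"{name} requires {needed}")
    return errors
```

With ~100 commands, keeping the rules in a table like this means adding a command is one new entry rather than a new branch of code; per-argument value checks could be attached to each entry in the same way.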

codymanix
+4  A: 

If you are looking for an alternative to Lex/Yacc, check out ANTLR. It supports code generation in a variety of languages, including C#.

Ayman Hourieh
Ick. You posted this while I was posting the same thing. My general policy when such things happen is that you (the duper) are clearly a genius, and thus deserve an upvote. :-)
T.E.D.
Hehe, great minds think alike. ;) Thank you!
Ayman Hourieh
+3  A: 

ANTLR can handle both lexing and parsing, and it can generate C# (in addition to Java, C++ and Python). It's very mature and has lots of documentation and examples. It also generates much nicer error messages than YACC.

Laurence Gonsalves
Same comment here as with Ayman.
T.E.D.
A: 

For a simple parsing problem like this, you can write a recursive descent parser, assuming, of course, that your language is relatively fixed and isn't going to grow into a full programming language. If there's any danger of that, bite the bullet and use ANTLR or an equivalent.
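
As a rough sketch of the idea (in Python for brevity; the structure maps directly onto C# methods), here is a recursive descent parser with one method per grammar rule. The grammar itself is an assumption inferred from the question's examples: a program is a sequence of items, an item is either a plain command line or a COMMAND ... ENDCOMMAND block, and blocks do not nest. Arguments are split on commas or whitespace, since the question shows both.

```python
import re


class ParseError(Exception):
    pass


class Parser:
    def __init__(self, text):
        # Each non-blank line becomes one token list: [NAME, ARG, ...].
        self.lines = [re.split(r"[,\s]+", ln.strip())
                      for ln in text.splitlines() if ln.strip()]
        self.pos = 0

    def peek(self):
        return self.lines[self.pos] if self.pos < len(self.lines) else None

    def advance(self):
        line = self.lines[self.pos]
        self.pos += 1
        return line

    # program := item*
    def parse_program(self):
        items = []
        while self.peek() is not None:
            items.append(self.parse_item())
        return items

    # item := block | statement
    def parse_item(self):
        if self.peek()[0] == "COMMAND":
            return self.parse_block()
        return self.parse_statement()

    # block := "COMMAND" statement* "ENDCOMMAND"
    def parse_block(self):
        self.advance()  # consume COMMAND
        body = []
        while True:
            line = self.peek()
            if line is None:
                raise ParseError("COMMAND without matching ENDCOMMAND")
            if line[0] == "ENDCOMMAND":
                self.advance()
                return ("block", body)
            body.append(self.parse_statement())

    # statement := NAME arg*
    def parse_statement(self):
        name, *args = self.advance()
        if name in ("COMMAND", "ENDCOMMAND"):
            raise ParseError(f"unexpected {name}")
        return ("call", name, args)
```

Calling Parser(text).parse_program() on the block example from the question yields one "block" node containing two "call" nodes; argument validation and execution would then walk that tree.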

Paul Hankin
+1  A: 

Take a look at jay, a version of yacc retargeted to C# and Java. It is included in the Mono project.

http://code.google.com/p/jayc/

sumek