How to handle the inability to break on syntax errors in a recursive descent parser

I'm currently in a class on systems software development. We are writing a two-pass assembler for the assembly language of a fictional machine. We've implemented the tokenizer and all of the classes we need to represent the program abstractly; all that is left (besides implementing the code generator in a later phase) is to parse the tokens. Here is where I'm having a major issue. I'm choosing to implement this as a recursive descent parser, since that's the only technique I currently have experience with, but we are not allowed to stop assembly on syntax errors. For instance, if the user gives a load word instruction with invalid syntax, we are to replace it with a NOP. If the user gives a bad label, we are to simply ignore it. If the user places unknown characters in a line, we discard them.
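To make those three recovery rules concrete, here is a minimal Python sketch of one way they could be wired into a line-oriented recursive descent parser. Everything in it is an assumption made for illustration, not the actual course assembler: the `LW Rn, address` syntax, the `label:` form, the token set, and the names `SyntaxProblem`, `LineParser`, `tokenize`, `parse_instruction`, and `assemble` are all hypothetical. The idea it shows is that the small parse functions just raise on the first problem, and a single handler per source line turns that into a NOP.

```python
import re
from dataclasses import dataclass, field

class SyntaxProblem(Exception):
    """Raised by any parse function; caught once per source line."""

# Hypothetical token set: identifiers, decimal numbers, ':' and ','.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[:,]")

@dataclass
class Result:
    instructions: list = field(default_factory=list)
    labels: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)

def tokenize(line, lineno, errors):
    """Rule 3: unknown characters are reported and discarded."""
    tokens, pos = [], 0
    while pos < len(line):
        if line[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(line, pos)
        if m:
            tokens.append(m.group())
            pos = m.end()
        else:
            errors.append(f"line {lineno}: discarded {line[pos]!r}")
            pos += 1
    return tokens

class LineParser:
    """Cursor over the tokens of one line."""
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, pred, what):
        tok = self.peek()
        if tok is None or not pred(tok):
            raise SyntaxProblem(f"expected {what}, got {tok!r}")
        self.pos += 1
        return tok

def is_register(tok):
    return re.fullmatch(r"R\d", tok) is not None  # assumed register names R0..R9

def parse_instruction(p):
    """instruction := 'NOP' | 'LW' REG ',' NUM   (made-up grammar)"""
    mnemonic = p.expect(str.isidentifier, "mnemonic")
    if mnemonic == "NOP":
        instr = ("NOP",)
    elif mnemonic == "LW":
        reg = p.expect(is_register, "register")
        p.expect(lambda t: t == ",", "','")
        addr = p.expect(str.isdigit, "address")
        instr = ("LW", reg, int(addr))
    else:
        raise SyntaxProblem(f"unknown mnemonic {mnemonic!r}")
    if p.peek() is not None:
        raise SyntaxProblem(f"unexpected trailing token {p.peek()!r}")
    return instr

def assemble(source):
    result = Result()
    for lineno, line in enumerate(source.splitlines(), 1):
        tokens = tokenize(line, lineno, result.errors)
        p = LineParser(tokens)
        # Rule 2: a bad label is reported and ignored; the line is still parsed.
        if len(tokens) >= 2 and tokens[1] == ":":
            name = tokens[0]
            p.pos = 2
            if name.isidentifier() and name not in result.labels:
                result.labels[name] = len(result.instructions)
            else:
                result.errors.append(f"line {lineno}: ignored bad label {name!r}")
        if p.peek() is None:
            continue  # blank or label-only line
        # Rule 1: any syntax problem in the instruction becomes a NOP.
        try:
            result.instructions.append(parse_instruction(p))
        except SyntaxProblem as exc:
            result.errors.append(f"line {lineno}: {exc}; replaced with NOP")
            result.instructions.append(("NOP",))
    return result
```

Calling `assemble("start: LW R1, 100\nLW R1 100\n12: NOP\nLW R2, 7 $\n")` under these assumptions keeps going through all four lines: the second line (missing comma) is emitted as a NOP, the label `12` is dropped while its NOP is kept, and the stray `$` is discarded so the last `LW` still parses. The trade-off of this layout is that recovery is only as fine-grained as one line, which may or may not match what the assignment expects.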

On the one hand, it sounds easy - however, implementing this causes me to break (what I understand to be) one of the important rules of a recursive descent parser. Each of my functions pulls multiple tokens before calling another function, since I need to account for all of the possible fixable syntax errors. Given that I can't stop assembly, and I must have enough information about my current context to intelligently determine what the user was intending to do, I have to handle a lot within one function.

This turns the program from a true recursive descent parser into more of a semi-finite-state-machine. I feel like I'm doing this badly, but I'm not sure how else to implement this. Does anyone have any suggestions/ideas?

BTW - I'm not allowed to use tools like ANTLR, or any other parser generator.

Thanks.
