I have an ANTLR grammar and I would like to fuzz my parser.

+1  A: 

Are you looking for generation from a context-free grammar, i.e., generating strings that the grammar accepts? That could be a good way to check the grammar's correctness, but keep in mind that the set of accepted strings is most probably infinite. Any really bad bugs should already be apparent from the grammar specification itself and, hopefully, from checking that the grammar is LL.

I don't know of any tool in the ANTLR world, and a quick Google search for (E)BNF generation didn't reveal anything useful either.

It is, however, not very difficult to roll your own generator if performance and the like are not an issue. Prolog springs to mind, and there is plenty of literature available, but if you do not want to leave Java, I suspect homebrewing is the way to go. It's fun anyway.
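To give an idea of how small a homebrewed generator can be, here is a minimal sketch in Python. The toy `GRAMMAR` table, the `generate` function, and the `max_depth` cutoff are all made up for illustration; nothing here reads a real ANTLR file, so you would have to fill the table from your own grammar. Picking the shortest alternative once the depth budget is spent is only a heuristic that happens to terminate for this grammar, not a general guarantee.

```python
import random

# Hypothetical toy grammar: nonterminal -> list of alternatives,
# each alternative being a list of symbols. Any symbol that is not
# a key of the dict is treated as a terminal (a token).
GRAMMAR = {
    "expr": [["term", "+", "expr"], ["term"]],
    "term": [["factor", "*", "term"], ["factor"]],
    "factor": [["(", "expr", ")"], ["NUM"]],
}

def generate(symbol, rng, depth=0, max_depth=8):
    """Return a list of terminal tokens derived from `symbol`."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: emit as-is
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth budget: take the shortest alternative so the
        # recursion winds down (a heuristic that works for this grammar).
        alternative = min(alternatives, key=len)
    else:
        alternative = rng.choice(alternatives)
    tokens = []
    for child in alternative:
        tokens.extend(generate(child, rng, depth + 1, max_depth))
    return tokens

rng = random.Random(0)  # fixed seed so a failing input can be reproduced
sentence = generate("expr", rng)
print(" ".join(sentence))
```

Seeding the random generator is deliberate: when a generated sentence trips the parser, you want to be able to regenerate exactly that sentence.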

johanbev
A: 

Assume you generated sentences (strings of tokens) from your ANTLR grammar. Why do you think your ANTLR-based parser would object to them?

What you really have to do is to produce not-quite-legal strings. So, what you need is a generator that can produce erroneous strings.
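One cheap way to get not-quite-legal strings is to start from a legal token sequence and apply a single random edit to it. The following is a rough sketch in Python, not a finished tool: `mutate`, `VOCAB`, and the sample sentence are hypothetical names chosen for illustration. Note that a single edit will sometimes produce a string that is still legal (replacing `NUM` with `NUM`, say), so the parser's verdict on each mutant has to be checked rather than assumed.

```python
import random

def mutate(tokens, rng, vocabulary):
    """Apply one random edit (delete, insert, replace, or swap) to a copy of `tokens`."""
    result = list(tokens)  # never modify the caller's list
    op = rng.choice(["delete", "insert", "replace", "swap"])
    if op == "delete" and result:
        del result[rng.randrange(len(result))]
    elif op == "replace" and result:
        result[rng.randrange(len(result))] = rng.choice(vocabulary)
    elif op == "swap" and len(result) >= 2:
        i = rng.randrange(len(result) - 1)
        result[i], result[i + 1] = result[i + 1], result[i]
    else:
        # "insert", or a fallback when the sequence is too short for the chosen edit
        result.insert(rng.randrange(len(result) + 1), rng.choice(vocabulary))
    return result

# Hypothetical token vocabulary and a legal sentence to start from.
VOCAB = ["+", "*", "(", ")", "NUM"]
valid = ["(", "NUM", "+", "NUM", ")", "*", "NUM"]

rng = random.Random(42)
for _ in range(5):
    print(" ".join(mutate(valid, rng, VOCAB)))
```

Keeping the edits small means the parser gets far enough into the input to exercise its error handling, instead of rejecting everything at the first token.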

Given that ANTLR generates a set of procedures from your ANTLR grammar, I think it would be difficult to build a sentence generator from that generated code. What you need is an explicit model of the grammar, e.g., the ANTLR input itself.

An additional complication I see is generation of legal tokens from the regexes that make up the token definitions. Again, you'd need to process the ANTLR input to do this.
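To illustrate the token side, here is a small Python sketch that produces concrete text for a token from a simplified description of its definition. The `TOKEN_RULES` table and `sample_token` are hypothetical: the rules are written by hand as (character set, minimum count, maximum count) pieces rather than extracted from an ANTLR lexer, and the upper bounds are arbitrary caps standing in for unbounded `+` and `*` repetition.

```python
import random
import string

# Hypothetical hand-written stand-ins for lexer rules such as
#   NUM : [0-9]+ ;
#   ID  : [a-z] [a-z0-9_]* ;
# Each rule is a sequence of (character set, min count, max count) pieces.
TOKEN_RULES = {
    "NUM": [(string.digits, 1, 5)],
    "ID": [
        (string.ascii_lowercase, 1, 1),
        (string.ascii_lowercase + string.digits + "_", 0, 7),
    ],
}

def sample_token(name, rng):
    """Return random text that matches the simplified rule for token `name`."""
    pieces = []
    for charset, lo, hi in TOKEN_RULES[name]:
        count = rng.randint(lo, hi)  # inclusive on both ends
        pieces.append("".join(rng.choice(charset) for _ in range(count)))
    return "".join(pieces)

rng = random.Random(7)
print(sample_token("NUM", rng), sample_token("ID", rng))
```

A full solution would need to handle alternation, nesting, and negated sets in the token definitions, which is where working from ANTLR's own representation of the grammar would pay off.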

These both seem technically straightforward. The best engine to use as a foundation is likely the ANTLR front end, which obviously parses ANTLR specs and so must hold some representation of the ANTLR input.

Ira Baxter