ansaurus

Question

How to match optional open/close tags in JavaCC?

Answer 1

+2 A:

Your grammar is ambiguous. That is probably not your fault: producing an unambiguous grammar for the problem you are trying to solve is likely to be very difficult.

An LL(k) parser is probably not the best tool for this job.

However, the tokenizer may still be useful, and using a stack to find matched and unmatched pairs of tags may be a suitable alternative.
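
As a rough illustration of the tokenizer-plus-stack idea, here is a minimal sketch in plain Java. It uses java.util.regex as a stand-in tokenizer rather than JavaCC, and it assumes a hypothetical tag syntax ([b] opens, [/b] closes), since the original markup is not shown here. The class and method names (TagMatcher, findUnmatched) are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagMatcher {
    // Assumed tag syntax: [name] opens and [/name] closes; everything else is text.
    private static final Pattern TAG = Pattern.compile("\\[(/?)([a-z]+)\\]");

    /** Returns a note for every tag that has no matching partner. */
    public static List<String> findUnmatched(String input) {
        Deque<String> open = new ArrayDeque<>();   // currently open tag names
        List<String> problems = new ArrayList<>();
        Matcher m = TAG.matcher(input);
        while (m.find()) {
            String name = m.group(2);
            boolean closing = !m.group(1).isEmpty();
            if (!closing) {
                open.push(name);
            } else if (name.equals(open.peek())) {
                open.pop();                        // matched pair
            } else {
                // Close tag that does not match the innermost open tag.
                problems.add("unmatched close: " + name + " at " + m.start());
            }
        }
        // Anything still on the stack was never closed.
        while (!open.isEmpty()) {
            problems.add("unclosed open: " + open.pop());
        }
        return problems;
    }
}
```

Calling TagMatcher.findUnmatched("[/b]x[b]") would report both the stray close tag and the open tag that is never closed. What to do with those reports (drop the tag, emit it as text, auto-close it) is a separate decision the sketch leaves open.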

jamesh 2010-10-06 22:15:59

I reached the same conclusion, but I find JavaCC a bit of overkill for just a tokenizer. Do you know of a 100% Java tokenizer? (So I don’t need extra build tools?)

Kdeveloper 2010-10-07 12:34:39

Answer 2

+1 A:

Some time ago I learned that some seemingly trivial problems are easy to solve at the semantic or lexical level but prove very difficult or impossible at the syntactic level.

Note: I'm not too familiar with JavaCC, but I've worked with several compiler generators in the past (my favorite being SableCC).

You could probably just define your "content" as something like this:

(text()|boldstart()|boldend()|invalidTag)

Here boldstart() would just blindly output a start tag, and boldend() an end tag.

If, however, you want to filter all that and only produce correctly closed tags, then I'd suggest writing a small stateful automaton for it. Feed it the opening and closing tags, have it track whether (say) bold should start, stop or continue (possibly including the nesting depth), and depending on that state output a start tag, a stop tag, or no tag at all. This would be really easy to implement compared with using the syntactic or lexical tools you have in JavaCC.
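
A minimal sketch of such an automaton in plain Java might look like the following. It only handles bold, tracks nesting with a depth counter, and assumes the output markup is <b> and </b>. The class name BoldFilter and its methods are hypothetical, and how the tokenizer calls into it is left out.

```java
public class BoldFilter {
    private int depth = 0;                         // how many bold opens are pending
    private final StringBuilder out = new StringBuilder();

    /** Plain text is passed through unchanged. */
    public void text(String s) {
        out.append(s);
    }

    /** Emit a start tag only when bold actually begins. */
    public void open() {
        if (depth == 0) {
            out.append("<b>");
        }
        depth++;
    }

    /** Emit a stop tag only when the outermost bold ends; drop stray closes. */
    public void close() {
        if (depth == 0) {
            return;                                // nothing open: ignore this tag
        }
        depth--;
        if (depth == 0) {
            out.append("</b>");
        }
    }

    /** Close anything left open and return the filtered output. */
    public String finish() {
        if (depth > 0) {
            out.append("</b>");
            depth = 0;
        }
        return out.toString();
    }
}
```

With this design, a stray close tag before any open tag is silently dropped, nested opens collapse into a single bold span, and an unclosed span is closed at the end of input. Other policies (for example, echoing invalid tags as text) would only change the bodies of open() and close().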

inkredibl 2010-10-07 08:47:31

Thanks, do you have a link to examples of this? Or know of a Java tokenizer tool?

Kdeveloper 2010-10-07 12:37:08
