I'm looking for a decent lexical scanner generator for C#/.NET -- something that supports Unicode character categories, and generates somewhat readable & efficient code. Anyone know of one?
EDIT: I need support for Unicode categories, not just Unicode characters. There are currently 1421 characters in the Lu (Letter, Uppercase) category alone, and I need to match many different categories very specifically, so I would rather not hand-write the character sets for them.
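To make the requirement concrete, here is a minimal sketch of the kind of per-character category test a generated scanner would need to perform, instead of enumerating every code point by hand. The class and helper names (`CategoryDemo`, `IsLu`) are hypothetical and only for illustration; the sketch assumes input is examined one UTF-16 `char` at a time, so characters outside the Basic Multilingual Plane would need the string-and-index overload of `CharUnicodeInfo.GetUnicodeCategory` instead.

```csharp
using System;
using System.Globalization;

public static class CategoryDemo
{
    // Hypothetical helper: true when c belongs to the Lu (Letter, Uppercase) category.
    // Relies on the framework's category tables rather than a hand-written character set.
    public static bool IsLu(char c) =>
        char.GetUnicodeCategory(c) == UnicodeCategory.UppercaseLetter;

    public static void Main()
    {
        Console.WriteLine(IsLu('A'));       // Latin capital letter A
        Console.WriteLine(IsLu('\u0416'));  // Cyrillic capital letter Zhe
        Console.WriteLine(IsLu('a'));       // Latin small letter a (category Ll, not Lu)
    }
}
```

The same idea extends to any other category by comparing against a different `UnicodeCategory` value; what I am after is a generator that lets me write the category name in the lexer specification and emits something equivalent.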
Also, actual code is a must -- this rules out tools that generate a binary file that is then used with a driver (e.g. GOLD).
EDIT: ANTLR does not support Unicode categories yet. There is an open issue for it, though, so it might fit my needs someday.