ansaurus

Question

Recognizing extended characters using JavaCC

Answer 1

+1 A:

It turned out that what I wanted was for my grammar to accept all valid Unicode characters, not just ASCII ones. The ™ symbol is part of the Unicode specification and is not a character in the ASCII range. Changing my token for a valid character as outlined below solved my problem. (An escaped Unicode character is written in the U+00FF style: a backslash, then u or U, a plus sign, and four hex digits.)

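< CHARACTER:
    // any single character except ', space, backslash, LF and CR
    ( ~["'"," ","\\","\n","\r"] )
    // or a backslash followed by one of the escapes below
  | ( "\\"
      ( ["n","t","b","r","f","\\","'","\""]
        // Unicode form: u or U, a plus sign, then four hex digits
      | ["u","U"]["+"]["0"-"9","a"-"f","A"-"F"]["0"-"9","a"-"f","A"-"F"]["0"-"9","a"-"f","A"-"F"]["0"-"9","a"-"f","A"-"F"]
      )
    )
>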
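
To sanity-check which inputs a token like this accepts without regenerating the parser, here is a rough sketch in plain Java. It is not JavaCC output: the class name CharacterTokenSketch and the method isCharacter are made up for illustration, and the regular expression is my own approximation of the token definition, so treat it as an assumption rather than an exact equivalent.

```java
import java.util.regex.Pattern;

public class CharacterTokenSketch {
    // Hypothetical regex approximation of the CHARACTER token:
    //   1) any single character except ', space, backslash, LF, CR
    //   2) or a backslash followed by a simple escape letter, or by
    //      u/U, a plus sign and exactly four hex digits
    private static final Pattern CHARACTER = Pattern.compile(
        "[^' \\\\\\n\\r]"
        + "|\\\\(?:[ntbrf\\\\'\"]|[uU]\\+[0-9a-fA-F]{4})");

    // Returns true if the whole string would be one CHARACTER token.
    public static boolean isCharacter(String s) {
        return CHARACTER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isCharacter("\u2122"));    // the trademark symbol
        System.out.println(isCharacter("\\u+00FF"));  // escaped Unicode form
        System.out.println(isCharacter("\\n"));       // backslash + n escape
        System.out.println(isCharacter(" "));         // space is excluded
        System.out.println(isCharacter("\\u+00F"));   // only three hex digits
    }
}
```

Under these assumptions the first three calls should be accepted and the last two rejected, which matches the intent of the token: any non-excluded character, including non-ASCII ones such as ™, is a valid CHARACTER.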

RGordon1982 2009-04-20 17:06:19
