How can I use Lex/Yacc to recognize identifiers in Chinese characters? Thanks for help.
A:
I think you mean Lex (the lexer generator). Yacc is the parser generator.
According to http://stackoverflow.com/questions/1366068/, most CJK characters fall in the 3400-9FFF range (hexadecimal code points, i.e. U+3400 to U+9FFF).
According to http://dinosaur.compilertools.net/lex/index.html:
Arbitrary character. To match almost any character, the operator character . is the class of all characters except newline. Escaping into octal is possible although non-portable:
[\40-\176]
matches all printable characters in the ASCII character set, from octal 40 (blank) to octal 176 (tilde).
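So I would assume what you need is something like [\32000-\117777], which is 3400-9FFF written in octal. This assumes a lexer that handles characters wider than 8 bits; classic Lex and flex match single bytes, so with those you would have to match the encoded byte sequences instead.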
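To double-check the arithmetic, here is a small Python sketch. Python is only my choice for illustration, and the names octal_class, CJK_UTF8 and is_cjk_identifier are made up for this example. It converts the 3400-9FFF range to the octal form used above. It also builds the byte-level pattern you would probably need if the source file is UTF-8 and the lexer matches one byte at a time, which is an assumption about your setup.

```python
import re

# Range quoted in the answer: most CJK characters lie in U+3400..U+9FFF.
LOW, HIGH = 0x3400, 0x9FFF


def octal_class(low: int, high: int) -> str:
    """Write a code point range as a Lex-style octal character class."""
    return "[\\{:o}-\\{:o}]".format(low, high)


# Byte-level equivalent for a lexer that sees UTF-8 input one byte at a time.
# U+3400 encodes as E3 90 80 and U+9FFF as E9 BF BF, so the range splits into
# a partial E3 block and the complete E4..E9 blocks.
CJK_UTF8 = re.compile(
    rb"\xE3[\x90-\xBF][\x80-\xBF]"          # U+3400..U+3FFF
    rb"|[\xE4-\xE9][\x80-\xBF][\x80-\xBF]"  # U+4000..U+9FFF
)


def is_cjk_identifier(text: str) -> bool:
    """True if text is a non-empty run of characters in LOW..HIGH."""
    data = text.encode("utf-8")
    one_or_more = rb"(?:" + CJK_UTF8.pattern + rb")+"
    return re.fullmatch(one_or_more, data) is not None


if __name__ == "__main__":
    print(octal_class(LOW, HIGH))
    print(is_cjk_identifier("变量"), is_cjk_identifier("abc"))
```

If your Lex accepts \x escapes (flex does), the same byte pattern could be tried directly in a rule, roughly as \xE3[\x90-\xBF][\x80-\xBF]|[\xE4-\xE9][\x80-\xBF][\x80-\xBF] followed by +, but I have not tested that in an actual scanner.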
ionut bizau
2010-07-08 16:01:11