How can I use Lex/Yacc to recognize identifiers in Chinese characters? Thanks for help.

A: 

I think you mean Lex (the lexer generator). Yacc is the parser generator.

According to http://stackoverflow.com/questions/1366068/, most CJK characters fall in the U+3400-U+9FFF range.
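As a quick sanity check of that range, here is a minimal Python sketch that tests whether a character's code point lies between hex 3400 and 9FFF. The helper name `in_cjk_range` is made up for illustration and is not part of Lex.

```python
# Bounds of the 3400-9FFF range quoted above (hexadecimal code points).
CJK_FIRST = 0x3400
CJK_LAST = 0x9FFF


def in_cjk_range(ch: str) -> bool:
    """Return True if the single character ch lies in U+3400..U+9FFF."""
    return CJK_FIRST <= ord(ch) <= CJK_LAST


print(in_cjk_range("中"))  # code point U+4E2D
print(in_cjk_range("a"))   # code point U+0061
```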

According to http://dinosaur.compilertools.net/lex/index.html:

Arbitrary character. To match almost any character, the operator character . is the class of all characters except newline. Escaping into octal is possible although non-portable:

                             [\40-\176]

matches all printable characters in the ASCII character set, from octal 40 (blank) to octal 176 (tilde).

So I would assume what you need is something like [\32000-\117777], where 32000 and 117777 are hex 3400 and 9FFF written in octal.
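One caveat: classic lex and flex scan input as 8-bit bytes, so a character class with bounds this large may not be accepted. If the input is UTF-8 (an assumption, since the question does not say), every character in this range is encoded as three bytes, and the range can be expressed as a byte-level pattern instead. The sketch below checks the octal conversion and illustrates that byte pattern using Python's `re` on bytes. The names `CJK_UTF8` and `is_cjk_identifier` are made up, and the pattern would still need to be transcribed into Lex syntax and tested there.

```python
import re

# Octal forms of the range bounds used in the character class above.
print(oct(0x3400), oct(0x9FFF))

# UTF-8 encodings of the two bounds: both are three bytes long.
print("\u3400".encode("utf-8").hex(), "\u9fff".encode("utf-8").hex())

# Byte-level pattern for U+3400..U+9FFF, assuming UTF-8 input:
#   E3      90..BF  80..BF   covers U+3400..U+3FFF
#   E4..E9  80..BF  80..BF   covers U+4000..U+9FFF
CJK_UTF8 = re.compile(
    rb"(?:\xE3[\x90-\xBF][\x80-\xBF]|[\xE4-\xE9][\x80-\xBF][\x80-\xBF])+"
)


def is_cjk_identifier(text: str) -> bool:
    """Return True if text is non-empty and every character is in U+3400..U+9FFF."""
    return CJK_UTF8.fullmatch(text.encode("utf-8")) is not None


print(is_cjk_identifier("中文"))
print(is_cjk_identifier("abc"))
```

In a Lex rule, the same alternation of byte ranges would take the place of the single wide character class, with each byte written as an escape.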

ionut bizau