I am using the POSIX C regex library (regcomp/regexec) in my search application. The application supports several languages, including ones that use multi-byte characters. I'm running into a problem with the word boundary metacharacter (\b). For single-byte strings it works fine, e.g.:
"\bpaper\b" matches "paper"
However, if the regex and the query string contain multi-byte characters, it doesn't seem to work correctly, e.g.:
"\b紙張\b" doesn't match "紙張"
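A minimal sketch of this kind of check is below, in case it helps to reproduce the behaviour. It is not taken from the application: the helper name regex_matches, the REG_EXTENDED | REG_NOSUB flags, and the setlocale call are all assumptions made for illustration. Note that \b is a GNU extension rather than part of POSIX regex syntax, and the result for multi-byte input may depend on the locale the process is running in.

```c
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not from the original application).
 * Returns 1 if `pattern` matches somewhere in `text`, 0 if it does not,
 * and -1 if the pattern fails to compile. */
int regex_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    /* Assumption: adopt the locale from the environment (e.g. a UTF-8 one).
     * Without this call the program stays in the default "C" locale. */
    setlocale(LC_ALL, "");

    /* \b is a GNU extension; REG_NOSUB because only match/no-match is needed. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);

    return rc == 0 ? 1 : 0;
}
```

Calling regex_matches("\\bpaper\\b", "paper") corresponds to the single-byte case above, and regex_matches("\\b紙張\\b", "紙張") to the multi-byte one (assuming the source file and the input are both UTF-8 encoded).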
Am I missing something? Any help would be highly appreciated.
Requested Info:
- Programming Language: C
- Regex Library: GNU C Library (regex.h)
Thanks.