What do I use to perform a case-insensitive comparison on two UTF-8 encoded sub-strings? Essentially, I'm looking for a strnicmp
function for UTF-8.
views: 47
answers: 2
+1
A:
strcoll should be locale-aware and handle UTF-8 correctly, at least if UTF-8 is the default encoding of the locale. If it is not, I have no idea. As a workaround, you can convert the multibyte string to wide characters (mbrtowc) and then use wcscasecmp, which unfortunately is not part of ISO C: it started as a GNU extension and was later added to POSIX.1-2008, so it is not available everywhere. Maybe not so useful.
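A rough sketch of that workaround, decoding with mbrtowc and folding each character with towlower instead of relying on wcscasecmp. The names utf8_ncasecmp and next_char are made up for illustration. It assumes the program has already selected a UTF-8 locale (for example with setlocale(LC_CTYPE, "")), it counts characters rather than bytes, and towlower only does simple one-to-one mappings, so it is not full Unicode case folding.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: decode one character from s (at most rem bytes).
   Returns the number of bytes to advance; 0 means end of string. */
static size_t next_char(const char *s, size_t rem, mbstate_t *st, wchar_t *out)
{
    size_t len = mbrtowc(out, s, rem, st);
    if (len == (size_t)-1 || len == (size_t)-2) {
        /* Invalid or truncated sequence: fall back to the raw byte. */
        memset(st, 0, sizeof *st);
        *out = (wchar_t)(unsigned char)*s;
        return 1;
    }
    return len;
}

/* Hypothetical strnicmp-like function: compares at most max_chars
   characters of two NUL-terminated multibyte strings, ignoring case.
   Returns <0, 0 or >0. Assumes LC_CTYPE is a UTF-8 locale. */
int utf8_ncasecmp(const char *a, const char *b, size_t max_chars)
{
    mbstate_t sa, sb;
    /* Include the terminator so mbrtowc can report end of string. */
    size_t rem_a = strlen(a) + 1, rem_b = strlen(b) + 1;
    memset(&sa, 0, sizeof sa);
    memset(&sb, 0, sizeof sb);

    while (max_chars-- > 0) {
        wchar_t ca, cb;
        size_t la = next_char(a, rem_a, &sa, &ca);
        size_t lb = next_char(b, rem_b, &sb, &cb);
        wint_t fa = towlower((wint_t)ca);
        wint_t fb = towlower((wint_t)cb);

        if (fa != fb)
            return fa < fb ? -1 : 1;
        if (la == 0)
            return 0; /* both strings ended at the same point */
        a += la; rem_a -= la;
        b += lb; rem_b -= lb;
    }
    return 0;
}
```

With this, utf8_ncasecmp("Hello", "hELLO", 5) compares equal. Pairs such as "ß" and "SS" will not, because that needs one-to-many case folding that towlower cannot express.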
ShinTakezou
2010-06-01 21:36:46
Is strcoll case-insensitive? Is there a way to specify the maximum number of characters to compare?
Jen
2010-06-01 21:51:28
No, it should behave like strcmp: case-sensitive, sorry.
ShinTakezou
2010-06-02 12:22:55
+2
A:
Case conversion rules in various Unicode scripts are murderously difficult and require large case conversion tables. You cannot get this right yourself; you'll need a library. ICU is one of them.
Hans Passant
2010-06-01 22:06:34