What do I use to perform a case-insensitive comparison on two UTF-8 encoded sub-strings? Essentially, I'm looking for a strnicmp function for UTF-8.

+1  A: 

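strcoll is locale-aware and should handle UTF-8 correctly, at least when UTF-8 is the encoding of the current locale. If it is not, I don't have a solution. As a workaround, you can convert the multibyte strings to wide characters (mbstowcs, or mbrtowc one character at a time) and then use wcscasecmp, or wcsncasecmp to limit the number of characters compared. Unfortunately these are not part of standard C: they started as GNU extensions and are specified by POSIX.1-2008, so they may be missing on some platforms. Maybe not so useful.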
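A rough sketch of that workaround, in case it helps. The helper name mb_ncasecmp and its signature are made up for illustration; only mbstowcs and wcsncasecmp are real library calls. It assumes a POSIX system and that the program has already selected a UTF-8 locale, for example with setlocale(LC_ALL, "").

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* asks POSIX headers to declare wcsncasecmp */
#endif
#include <locale.h>  /* setlocale, for the caller */
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper, not a standard function.
 * Compares at most n characters (not bytes) of two multibyte strings,
 * ignoring case, using the encoding of the current LC_CTYPE locale.
 * On success stores a strncmp-style value in *result and returns 0.
 * Returns -1 if a string is not valid in the locale's encoding
 * or if memory cannot be allocated. */
int mb_ncasecmp(const char *a, const char *b, size_t n, int *result)
{
    wchar_t *wa, *wb;
    int rc = -1;

    if (n + 1 == 0)  /* n + 1 would wrap around to zero */
        return -1;

    /* One extra zero-filled element keeps both buffers terminated
     * even when mbstowcs stops after exactly n wide characters. */
    wa = calloc(n + 1, sizeof *wa);
    wb = calloc(n + 1, sizeof *wb);

    if (wa != NULL && wb != NULL &&
        mbstowcs(wa, a, n) != (size_t)-1 &&
        mbstowcs(wb, b, n) != (size_t)-1) {
        *result = wcsncasecmp(wa, wb, n);
        rc = 0;
    }

    free(wa);
    free(wb);
    return rc;
}
```

Call setlocale once at startup, then use mb_ncasecmp(s1, s2, n, &r) and check the return value before reading r. Note that this only folds case one character at a time, so mappings that change the length of the string (German ß versus SS, for instance) will not compare equal, and on platforms with a 16-bit wchar_t the count n is in UTF-16 units rather than characters.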

ShinTakezou
Is strcoll case-insensitive? Is there a way to specify the maximum number of characters to compare?
Jen
No, sorry. It should behave like strcmp, so it is case-sensitive.
ShinTakezou
+2  A: 

Case conversion rules across the various Unicode scripts are murderously difficult and require large case conversion tables. You cannot get this right yourself; you'll need a library. ICU is one of them.

Hans Passant