My Win32/MFC program builds up a list of names, sorting them alphabetically as it inserts them into the list. When it supported only ASCII strings, this worked with a simple char-by-char string comparison. But now that I want to accept UTF-8 strings, I need a more complex scheme since, for example, all forms of the letter "a" should be equivalent from an alphabetizing standpoint.

Is there a function somewhere that can do this, or will I have to craft my own comparison table to sort these strings?

A: 

The CompareStringEx function probably does what you need.

But note that this function (and the Windows API in general) does not use the UTF-8 encoding to represent Unicode strings. Instead, it uses the UTF-16 encoding (aka "wide character strings"). You might just be confusing the UTF-8 encoding with Unicode in general. But if you really are dealing with UTF-8 encoded strings, you can convert them to wide character strings with the MultiByteToWideChar function.

Wim Coenen
Thanks for the excellent response! I am definitely dealing with UTF-8 here, as I'm taking strings from a GEDCOM file (a genealogy standard) that are encoded in UTF-8 and am creating a SQLite database for use on an Android device. Both SQLite and Android use UTF-8 as a standard.
gordonwd