ansaurus

Question

Treat unicode character plus diacritic as a single character?

Answer 1

+3 A:

That's what string.Normalize() takes care of. You can use the Normalize(NormalizationForm) overload to control this explicitly.
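A minimal sketch of what that looks like, assuming the input arrives in decomposed form (the sample string is an illustration, not taken from the question). Normalize() with no argument uses NormalizationForm.FormC, which composes a base letter and a combining mark only where Unicode defines a precomposed character:

```csharp
using System;
using System.Text;

class NormalizeDemo
{
    static void Main()
    {
        // "e" followed by U+0301 COMBINING ACUTE ACCENT: two UTF-16 code units.
        string decomposed = "e\u0301";

        // Form C composes the pair into the single code point U+00E9.
        string composed = decomposed.Normalize(NormalizationForm.FormC);

        Console.WriteLine(decomposed.Length);    // 2
        Console.WriteLine(composed.Length);      // 1
        Console.WriteLine(composed == "\u00E9"); // True
    }
}
```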

Hans Passant 2010-08-23 14:55:38

This works for simple accent marks, but some of the more complicated IPA characters are not combined (because there is no single-character representation for them). For example, the ˤ modifier does not get combined. I will update my question to reflect this.

dvcolgan 2010-08-23 17:01:15

How could that be a problem? The words you're trying to compare won't have the combining glyph either.

Hans Passant 2010-08-23 17:16:24

The problem is that the combining glyphs are important information for the purposes of this program, and not having them changes the calculations. ɔ̃ is a completely different character than ɔ.

dvcolgan 2010-08-23 17:42:27
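For sequences such as ɔ̃ that have no precomposed form, one option not raised in the thread is to enumerate text elements with System.Globalization.StringInfo, which groups a base character with the combining marks that follow it. The sketch below is only an illustration: the helper name SplitTextElements and the sample string are assumptions. Note that ˤ (U+02E4) is a spacing modifier letter rather than a combining mark, so it would most likely come out as its own element and need extra handling.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

class TextElementDemo
{
    // Hypothetical helper: splits a string into text elements, i.e. each
    // base character together with any combining marks that follow it.
    static List<string> SplitTextElements(string s)
    {
        var elements = new List<string>();
        TextElementEnumerator e = StringInfo.GetTextElementEnumerator(s);
        while (e.MoveNext())
        {
            elements.Add(e.GetTextElement());
        }
        return elements;
    }

    static void Main()
    {
        // "ɔ" (U+0254) + U+0303 COMBINING TILDE, then "ɔ" on its own.
        string sample = "\u0254\u0303\u0254";

        List<string> elements = SplitTextElements(sample);

        Console.WriteLine(sample.Length);                 // 3 UTF-16 code units
        Console.WriteLine(elements.Count);                // 2 text elements
        Console.WriteLine(elements[0] == "\u0254\u0303"); // True
    }
}
```

Comparing or counting over the resulting list, rather than over individual char values, keeps ɔ̃ and ɔ distinct.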
