I read up someplace, that there are characters other than A-Z that have a lowercase equivalent, in Unicode. Which could these be, and why would any other character need an upper and lower case?
Any letter with an accent could potentially have different code point, or be a combination of more than one code point. For example, ÂËÕÝ are uppercase characters with lowercase equivalents.
The key is to implement the standards faithfully with respect to your users' locale settings, or get the same effect by using system libraries that handle the general case of toupper()/tolower() correctly.
There's a lot of alphabets other than the usual Latin-derived western European alphabet most of us are used to seeing here. To start with, you'd need uppercase and lowercase versions of accented letters and ligatures, like Àà, IJij, and so on. There's also the fullwidth versions of Latin characters used when setting documents in Asian languages (which I'm too lazy to look up). Further, there are the other alphabets in use nowadays, like the Cyrillic (Бб) and Greek (Δδ) alphabets.
There's also Turkey, which is just kind of difficult according to Jeff Atwood. Using the uppercasing/lowercasing functions provided by your environment are (usually) the way to go with user-input data.
The English language, and even that strange variant, American English :-) , is not the only language on the planet. There are some very strange looking ones (at least to those familiar with the Latin-based characters) but even Latin-based ones have minor variations.
Two of which I am acquainted with on more than a casual basis are Greek and German:
Αα Ββ Γγ Δδ Εε Ζζ Ηη Θθ Ιι Κκ Λλ Μμ
Νν Ξξ Οο Ππ Ρρ Σσς Ττ Υυ Φφ Χχ Ψψ Ωω
Aa Ää Bb Cc Dd Ee Ff Gg Hh Ii Jj Kk Ll Mm Nn
Oo Öö Pp Qq Rr Ss ß Tt Uu Üü Vv Ww Xx Yy Zz
That's why we're not allowed to use bits of code like:
char lower = upper - 'A' + 'a';
any more. Doing something like that in a company that takes i18n seriously is near grounds for dismissal. Using Unicode-aware toLower()/toUpper()
-type functions is the better way to go.