ansaurus

Question

Problem with ä,ö, å etc. characters in MySQL 5

Answer 1

+6 A:

MySQL's uses collations to compare character values.

Collations are the sets of rules used by database to define which characters are different and which are not when comparing.

Case sensitive collations distinguish between 'QUERY' and 'query', case insensitive do not.

Accent sensitive collations distinguish between 'résumé' and 'resume', accent insensitive do not.

In your column's default collation (most probably UTF8_GENERAL_CI), umlauted characters are indistinguishable from non-umlauted:

SELECT 'a' LIKE '%ä%'
---
  1

To distinguish between them, use binary collation (which treat all characters with different unicodes as different characters):

SELECT 'a' LIKE '%ä%' COLLATE UTF8_BIN
---
  0

Quassnoi 2009-08-03 12:33:45

Answer 2

+2 A:

Note: for many applications, collating a and ä as the same letter is considered a feature. My suggestion: be sure to double-check with your clients to determine which behavior is desired.

I might even follow up with a memo that says, "As we discussed on x date, the system will sort and find characters as follows..."

Larry K 2009-08-03 12:37:18

Yes, this is true. This is done by google, so "ż", "ź" and "z" are all treated as "z", and this makes searching polish texts possible. The point is some languages like polish similar words rarely have totally different meaning, that can be altered by scratching national characters (usually context make possible to understand).

smok1 2009-08-03 12:51:19

ansaurus

tags:

views:

answers:

Problem with ä,ö, å etc. characters in MySQL 5

related questions