Hey, I've been trying to solve this for a very long time now...

SELECT MATCH(name) AGAINST('абраксас')

(Russian) doesn't work, but

SELECT MATCH(name) AGAINST('abraxas')

(English) works perfectly.
I know it has something to do with the character set, but I tried all kinds of settings and none of them worked.
For now it's latin1.
LIKE works.
These are the charset-related values from SHOW VARIABLES:

character_set_client - latin1
character_set_connection - latin1
character_set_database - latin1
character_set_filesystem - binary
character_set_results - latin1
character_set_server - latin1
character_set_system - utf8
character_sets_dir - /usr/share/mysql/charsets/
collation_connection - latin1_swedish_ci
collation_database - latin1_swedish_ci
collation_server - latin1_swedish_ci

chunk of /etc/my.cnf:

default-character-set=latin1
skip-character-set-client-handshake

chunk of the dump:

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
DROP TABLE IF EXISTS `scenes_raw`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `scenes_raw` (
`scene_name` varchar(40) DEFAULT NULL,
...blabla...
) ENGINE=MyISAM AUTO_INCREMENT=901 DEFAULT CHARSET=utf8;

(I also ran the tests without skip-character-set-client-handshake.)

SHOW TABLE STATUS WHERE Name = 'scenes_raw'\G
Name: scenes_raw
Engine: MyISAM
Version: 10
Row_format: Dynamic
Index_length: 23552
Collation: utf8_general_ci
Checksum: NULL
Create_options:
A: 

Try calling SET NAMES 'utf8' right after you open a connection to the server.

Omry
Nope, it messes up the encoding of the result and the score is still 0 :(
Devenv
Are you sure your data (both in the DB and in the queries) is actually in utf8?
Omry
Well, the dump shows DEFAULT CHARSET=utf8 and LIKE works... so I guess it is :)
Devenv
+2  A: 

Try setting your character_set_client to utf8. I just tried it out:

> mysql -u root -p charset_test --default-character-set=latin1
mysql> select * from scenes_raw where MATCH(scene_name) AGAINST('абраксас');
Empty set (0.00 sec)

> mysql -u root -p charset_test --default-character-set=utf8
mysql> select * from scenes_raw where MATCH(scene_name) AGAINST('абраксас');
+------------------+
| scene_name       |
+------------------+
| абраксас         | 
+------------------+
1 row in set (0.00 sec)

Edit: For more details, see this explanation. The "character_set_client" value in your "SHOW VARIABLES" output is the character set the server assumes your MySQL client uses when it sends statements. Latin-1 has no Cyrillic characters, so with a latin1 client the Russian search string cannot reach the server intact; with utf8 it should work fine.
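To see why a Latin-1 client cannot carry the Russian search term, here is a minimal Python sketch (not MySQL itself). The helper name `can_encode` is made up for illustration, and Python's `latin-1` codec is only a stand-in for MySQL's `latin1`, which is close to it but not identical.

```python
# Minimal sketch: which charsets can represent the two search terms?
# `can_encode` is a hypothetical helper, not part of any MySQL API.

def can_encode(text, charset):
    """Return True if every character of `text` exists in `charset`."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

russian = "абраксас"
english = "abraxas"

print(can_encode(english, "latin-1"))  # the ASCII word fits in Latin-1
print(can_encode(russian, "latin-1"))  # Cyrillic has no Latin-1 code points
print(can_encode(russian, "utf-8"))    # UTF-8 covers Cyrillic
```

This only shows the limits of the character set; it says nothing about how MySQL tokenizes words for MATCH ... AGAINST.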

Sasha
OK, I had default-character-set=latin1 and skip-character-set-client-handshake in my.cnf, which is of course not OK. But now the text in the dump is gibberish when I use charset=utf8, and as far as I remember it's a lot of pain to convert everything to UTF-8 at once :( Bummer. Thanks for the help though; it's not the final solution for me, but it helped me find the source of the problem...
Devenv
What's weird is that the dump file looks like utf8, and all the collations etc. are utf8... but the text itself is latin1. Very confused :/
Devenv
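One possible explanation for a dump that is declared utf8 but looks like gibberish is double encoding: the application sent UTF-8 bytes over a connection declared as latin1, so each byte was treated as a separate character and encoded to UTF-8 again on storage. Whether that is what happened here is an assumption; the thread does not confirm it. A minimal Python sketch of the effect and of reversing it, again using Python's `latin-1` codec as a stand-in for MySQL's `latin1`:

```python
# Sketch of double encoding, assuming UTF-8 bytes were sent over a
# connection declared as latin1 and stored in a utf8 column.

word = "абраксас"

# What the application actually sent: 2 bytes per Cyrillic letter.
utf8_bytes = word.encode("utf-8")

# A latin1 connection reads every byte as one character.
misread = utf8_bytes.decode("latin-1")

# Storing those characters in a utf8 column encodes each one again.
double_encoded = misread.encode("utf-8")

# Reversing the mistake: undo the second encoding, then decode properly.
repaired = double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")

print(len(word), len(utf8_bytes), len(double_encoded))
print(repaired == word)
```

If this is the cause, the stored data would need the same kind of round trip before a utf8 client can read it. Try it on a copy of the table first.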