tags:

views:

29

answers:

2

Hi,

I've a database with collation of utf8 / utf8_bin. DB have arabic text with accent chars (kasar etc). I want to search text without accent chars.

What i did is, i wrote 'like' query ...

WHERE replace(field1,0x[CODE],'') like '%[arabic text]%' --- where [CODE] is the accent char.

This is working fine when i write static utf arabic text in place of field1. But it does not replace when it run from database field.

May be db field have latin1 encoding. How can i fix / check this issue.

Thanks.

+1  A: 

utf8_bin is a binary collation - it is extremely literal and strict in comparing characters.

utf8_general_ci is more lenient and normalizes Umlauts and accents to their "basic" version:

Ä => A

Ü => U etc.

I have never worked with Arabic before so I don't know whether it applies for those accents as well but I would expect so.

You should be able to temporarily use the lenient collation like this:

WHERE field1 like '%[arabic text]%' COLLATE utf8_general_ci;
Pekka
A: 

I finally resolved the issue. Posting it for future reference to help community.

After reading an article, i come to know there are 2 (client & server) encodings used with mysql. My server encoding (table charset/collation) was utf8, but client’s encoding was latin1 and the time of INSERT.

So data stored in DB is somewhat mixed, due to which REPLACE() was not working as expected.

FIX: I've posted it on my blog here ... mysql-issue-with-latin1-multi-byte

Azghanvi