tags:

views:

379

answers:

3

I have the following problem (using mysql 5.0.70): In one table I have a varchar field containing some kind of a number - "0303A342", "21534463", "35663CE3", etc. Collation is set to utf8_general_ci.

The problem shows up when a user of the system is trying to search for a record containing part of this number. SQL query looks like "...WHERE 'number' LIKE '%0303A%'" Now if the 'A' in the LIKE part is entered as a Latin A, the result contains only records with Latin A`s in them -- as it should. And when the A is Cyrillic, the results are again only those rows containing the Cyrillic A. There are many other letters like E, C, B, T and so on.

Now my question is, if there is a way to modify my sql query so it returns all rows matching the LIKE '%0303A%' part but for all kind of A`s in there? Or I should convert the user input before inserting/updating the database?

+1  A: 

You should convert the user input, there is no function LOOKS LIKE in MySQL :)

You can store the transliterated string along with the original one and use php::translit to do this:

id data    trans_data
1  Москва  MOSKVA
2  София   SOFIA
SELECT  *
FROM    table
WHERE   trans_data LIKE CONCAT('%', ?, '%')

mysqli->bind_param('s', strtoupper(transliterate('Москва')));
Quassnoi
yes, "LOOKS LIKE" is exactly what I'm looking for :)
matix
A: 

The fact that some letters look alike in several alphabets does not make them identical. In fact their looking alike might even depend on the font used to display them.

Your example data looks like hexadecimal values. Without knowing more about the context of your problem, might it be possible to have the user enter the data in decimal format (just using digits)? Another possibility might be a custom control to enter the search criteria. Does it have to be free text or would maybe a listbox with several options suffice?

If not, you might find functions similar to what Quassnoi describes (transliterate()) in the language or library you are using. With plain MySQL you are probably out of luck. And even if you find some function that does what you want, make sure it supports all the languages/alphabets your users are going to use - I for one would not know how many alphabets there are that have 1:1 mappings of their characters to the latin ones.

Daniel Schneller
The data is just string containing digits, letters and even some dashes maybe. Nothing like hex values or other structure..
matix
A: 

I can't use the php translit - the data I'm dealing with (the 'numbers' as I call them) are some kind of an identifiers, so it doesn't make sense to search for "IA" if somebody enters "Я". Because I know my users will use only Latin and Cyrillic characters, what I'm going to do is make a list of the common letters in the two alphabets and convert user input before reaching the db. Kind of sucks but seems the most appropriate solution.

matix