collation

UTF-8: General? Bin? Unicode?

I'm trying to figure out what collation I should be using for various types of data. 100% of the content I will be storing is user-submitted. My understanding is that I should be using UTF-8 General CI (Case-Insensitive) instead of UTF-8 Binary. However, I can't find a clear distinction between UTF-8 General CI and UTF-8 Unicode CI. ...

Appropriate character encoding / collation to store URLs?

My web application stores URL segments in a database. These URL segments are based on user-submitted content. What collation should I use for character strings that will appear in URLs? My assumption is ASCII General CI (?) based on this question: http://stackoverflow.com/questions/1547899/which-characters-make-a-url-invalid ...

Any python/django function to check whether a string only contains characters included in my database collation?

hi, As expected, I get an error when entering some characters not included in my database collation: (1267, "Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='") Is there any function I could use to make sure a string only contains characters existing in my database collation? th...
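One way to approach the check asked about here is to test whether the string can be encoded in the character set the column uses. The following is a minimal sketch, not a Django or MySQL API: `fits_charset` is a hypothetical helper, and the default codec `cp1252` is an assumption chosen as Python's closest match to MySQL's `latin1`.

```python
def fits_charset(text: str, encoding: str = "cp1252") -> bool:
    """Return True if every character in `text` can be encoded in `encoding`.

    `encoding` is a Python codec name; "cp1252" is assumed here as an
    approximation of MySQL's latin1 character set.
    """
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        # At least one character has no representation in the target charset.
        return False
    return True


print(fits_charset("café"))    # representable in cp1252
print(fits_charset("日本語"))  # not representable in cp1252
```

Note that this only tests the character set. The "Illegal mix of collations" error quoted above is raised when two operands of a comparison carry different collations, so a check like this may not remove it on its own.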

Execute sqlite3 "dot" commands from Python or register collation in command line utility

My sqlite3 database contains a "collate" column-constraint. I've placed it in the schema for the table, to prevent accidentally neglecting to use the necessary collation. However this means when running sqlite3 from the command line, and not from my Python code, the collation referenced in the schema is not present, and I'm unable to use...
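For context, this is roughly what registering a collation from Python looks like with the standard `sqlite3` module. It is a sketch under assumptions: the collation name `py_casefold`, the table `names`, and the helper `open_db` are all hypothetical and not taken from the question.

```python
import sqlite3


def casefold_compare(a: str, b: str) -> int:
    """Three-way, case-insensitive comparison: negative, zero, or positive."""
    x, y = a.casefold(), b.casefold()
    return (x > y) - (x < y)


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a connection and register the collation the schema refers to."""
    conn = sqlite3.connect(path)
    # Collations are registered per connection, so this must run on every
    # connection that touches a column declared with COLLATE py_casefold.
    conn.create_collation("py_casefold", casefold_compare)
    return conn


conn = open_db()
conn.execute("CREATE TABLE names (name TEXT COLLATE py_casefold)")
conn.executemany(
    "INSERT INTO names VALUES (?)",
    [("banana",), ("apple",), ("Cherry",)],
)
# ORDER BY uses the column's declared collation.
rows = [row[0] for row in conn.execute("SELECT name FROM names ORDER BY name")]
print(rows)
```

Because the registration lives on the connection, a separate process such as the sqlite3 command-line utility does not see it, which matches the problem described above.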

Accent-insensitive sorting in MySQL

I am trying to achieve accent and case-insensitive sorting in MySQL. Following the instructions in the manual, this is supposed to work with the utf8 character set and utf8_general_ci collation. When I follow the example in the manual (http://dev.mysql.com/doc/refman/5.1/en/charset-collation-implementations.html) under "Collations for ...

Change entire db's collation and solve illegal mix of collations

Hi, I'm having a problem when doing LIKE '' queries in MySQL. These are my variables: character_set_client utf8, character_set_connection utf8, character_set_database latin1, character_set_filesystem binary, character_set_results utf8, character_set_server latin1, character_set_system utf8, character_sets_dir C:\xampp\mysql\share\charsets\ 1...

Inconsistent MySQL COLLATE errors across databases

I have two physically-separate MySQL databases on which I have to run a single query. The query has a section of SQL that looks like this: and foo_table.bar_column like concat('%', rules.pattern, '%') COLLATE utf8_general_ci It runs fine on database A but on database B I get this error: ERROR 1253 (42000): COLLATION 'utf8_general_ci...

How do I perform an accent insensitive compare (e with è, é, ê and ë) in SQL Server?

I'm looking to compare two varchars in SQL. One would be something like Cafe and the other Café. Is there a way in SQL that will allow the two values to be compared? For instance: SELECT * FROM Venue WHERE Name Like '%cafe%'. So if there is a venue with the name Big Bobs Café Extraordinaire, would it be included in the result set? ...

MySQL Collation or PHP side to display accented letters properly

What is the best collation for a column that needs to store accented letters and return them without any encoding errors? Whenever I add an accented letter such as é or å, it shows up with an encoding problem on the PHP side, but on the MySQL side it's fine... How do I get the accented letters to display properly? ...

Hmm, why does finding by '2' or '2' return the same record?

Hi everyone, forgive my newbie question, but why does finding by '2' or '2' in MySQL return the same record? For example: say I have a record with a string field named 'slug', and the value is '2'. The following SQL statements return the same record: SELECT * From articles WHERE slug='2' SELECT * From articles WHERE slug='2' ...

SQL Collation & Datatype: Support Both Western and Arabic data in a field

I have a Delphi + SQL Server (2k or 2005 supported) app that is used by both western and Arabic users. For some fields (e.g. name) my app needs to be able to support both Arabic language and western language characters. Is it possible to set a single collation & datatype for a field to handle either English or Arabic data? NB: I do not...

MySQL treats ÅÄÖ as AAO?!

These two queries give me the exact same result: select * from topics where name='Harligt'; select * from topics where name='Härligt'; How is this possible? It seems like MySQL translates åäö to aao when it searches. Is there some way to turn this off? I use UTF-8 encoding everywhere as far as I know. The same problem occurs both from t...
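The behaviour described here is what an accent-insensitive, case-insensitive comparison produces. The sketch below imitates the effect in Python by stripping combining marks after Unicode decomposition. It is an illustration only, not MySQL's actual collation algorithm, and `fold_accents` is a hypothetical helper name.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return a comparison key that ignores accents and letter case."""
    # NFD splits a letter such as "ä" into "a" plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base characters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


print(fold_accents("Härligt") == fold_accents("Harligt"))
print(fold_accents("ÅÄÖ"))
```

A binary comparison, by contrast, looks at the encoded characters directly, so 'Harligt' and 'Härligt' would compare as different strings.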

SQLAlchemy custom sorting algorithms when using SQL indexes

Is it possible to write custom collation functions with indexes in SQLAlchemy? SQLite for example allows specifying the sorting function at a C level as sqlite3_create_collation(). An implementation of some of the Unicode collation algorithm has been provided by James Tauber here, which for example sorts all the "a"'s close together wh...

Mysql storing quotes as '

I have some PHP code which stores whatever is typed in a textbox in the database. If I type in bob's apples, it gets stored in the database as bob's apples. What can be the problem? The table storing this has the collation of latin1_swedish_ci. ...

Comparing ICU sort keys (collator_get_sort_key) in PHP

Is strcmp() appropriate for comparing ICU collator sort keys in PHP? The sort keys I'm asking about are from collator_get_sort_key() which are described in ICU documentation. ...

SQL Server -> 'SQL_Latin1_General_CP1_CI_AS' Collation -> Varchar Column -> Languages Supported

All, We are using SQL Server 2008 with Collation Setting as 'SQL_Latin1_General_CP1_CI_AS'. We are using Varchar column to store textual data. We know that we cannot store Double Byte data in Varchar column and hence cannot support languages like Japanese and Chinese without converting it to NVarchar. However, will it be safe to say t...

Which of the utf8 collations is the best?

I want a utf8 collation that supports all of the languages below: English, Persian, Arabic, French, Japanese, Chinese. I heard that UTF8_GENERAL_CI can support all these languages, is that right? ...

SQL Server Collation / ADO.NET DataTable.Locale with different languages

Hi all, we have a WinForms app which stores data in SQL Server (2000, we are working on porting it to 2008) through ADO.NET (1.1, working on porting to 4.0). Everything works fine if I read data previously written in a Western-European locale (e.g. "test", "test ù"), but now we have to be able to mix Western and non-Western alphabets as we...

phpMyAdmin shows numbers or blob for MySQL's utf8_bin collation columns?

Hi! I have a table with a varchar column. Its collation is set to utf8_bin. My software using this table and column works perfectly. But when I look at the content in phpMyAdmin, I only see some hex values or [Blob xB]. Can I make phpMyAdmin show the content correctly? Besides, when I set the collation to utf8_general_ci or utf8_unico...

MySQL accent-insensitive search for German in full-text searches

Let's take an example hotels table: CREATE TABLE `hotels` ( `HotelNo` varchar(4) character set latin1 NOT NULL default '0000', `Hotel` varchar(80) character set latin1 NOT NULL default '', `City` varchar(100) character set latin1 default NULL, `CityFR` varchar(100) character set latin1 default NULL, `Region` varchar(50) charact...