ansaurus

Question

SQL Server Collation / ADO.NET DataTable.Locale with different languages

Answer 1

+1 A:

Yes, the problem is most likely the collation. The Latin1_General collation does not include the rules needed to sort and compare non-Latin characters.

MSDN claims:

If you must store character data that reflects multiple languages, you can minimize collation compatibility issues by always using the Unicode nchar, nvarchar, and ntext data types instead of the char, varchar, text data types. Using the Unicode data types eliminates code page conversion issues.

Since you already do this, you should read further about Mixed Collation Environments here.
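The code page conversion issue that MSDN mentions can be illustrated outside SQL Server. The following is only a Python sketch, not SQL Server's actual storage code: it assumes cp1253 and cp1252 as stand-ins for a Greek and a Latin1 code page, and UTF-16 LE as a stand-in for how nvarchar data is stored. The variable names are made up for the example.

```python
# Sketch: the same bytes mean different characters under different code pages.
greek = "αβγ"

# A char/varchar value is just bytes plus an implied code page.
raw = greek.encode("cp1253")        # bytes as a Greek code page would hold them
misread = raw.decode("cp1252")      # the same bytes read as Latin1 text

# A Unicode (nchar/nvarchar-like) value carries no code page dependency.
unicode_bytes = greek.encode("utf-16-le")
roundtrip = unicode_bytes.decode("utf-16-le")

print(misread)             # áâã
print(roundtrip == greek)  # True
```

The single-byte value only survives if reader and writer agree on the code page; the Unicode value round-trips regardless.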

Additionally, changing a collation is not something that is easily done; see the MSDN documentation for SQL Server 2000:

When you set up SQL Server 2000, it is important to use the correct collation settings. You can change collation settings after running Setup, but you must rebuild the databases and reload the data. It is recommended that you develop a standard within your organization for these options. Many server-to-server activities can fail if the collation settings are not consistent across servers.

You can, however, specify a collation on a per-column basis:

CREATE TABLE TestTable (
   id int,
   GreekColCaseInsensitive nvarchar(10) COLLATE Greek_CI_AS,
   LatinColCaseSensitive nvarchar(10) COLLATE Latin1_General_CS_AS
)

Have a look at the different binary multilingual collations here. Depending on the charset you use, you should find one that fits your purpose.

If you are unable or unwilling to change the collation of a column, you can also specify the collation to be used in the query itself:

SELECT * FROM TestTable
WHERE GreekColCaseInsensitive = N'test - ۓےۑ'
COLLATE Latin1_General_CS_AS

As jfrobishow pointed out, the N in front of the string you want to compare against is essential. What it does:

It denotes that the subsequent string is in Unicode (the N actually stands for National language character set). Which means that you are passing an NCHAR, NVARCHAR or NTEXT value, as opposed to CHAR, VARCHAR or TEXT. See Article #2354 for a comparison of these data types.

You can find a quick rundown here.
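To make the effect of a missing N concrete, here is a rough Python simulation. It is not what SQL Server executes internally: it assumes the database's code page is cp1252 (typical for Latin1_General) and that unrepresentable characters become '?', whereas the real server may also substitute a similar-looking character in some cases. The helper name as_varchar_literal is hypothetical.

```python
def as_varchar_literal(text: str, code_page: str = "cp1252") -> str:
    """Simulate a literal written without N by forcing it through one code page."""
    # Characters the code page cannot represent are replaced with '?'.
    return text.encode(code_page, errors="replace").decode(code_page)


# Value held in the nvarchar column.
stored = "test - ۓےۑ"

# Roughly what the literal becomes without the N prefix.
without_n = as_varchar_literal(stored)

# With the N prefix the literal stays Unicode, so nothing is lost.
with_n = stored

print(without_n == stored)  # False
print(with_n == stored)     # True
```

Under these assumptions the comparison fails before the collation rules are ever applied, because the right-hand side no longer contains the original characters.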

ntziolis 2010-04-30 09:48:27

First of all, I'm feeling a little stupid. Then, THANK YOU VERY MUCH. After your answer (and jfrobishow's) I found out that I was misled by an old Data Layer framework, not Unicode-aware, that didn't put that simple N in front of the string; the human (me) did his part by overlooking the way the query was constructed, not least because the actual case was much more complex than the one shown here. This simple thing fixed everything: thank you again!

Turro 2010-05-06 10:26:19

Answer 2

+2 A:

Shouldn't you use N when comparing nvarchar with an extended character set?

SELECT * FROM TestTable WHERE GreekColCaseInsensitive = N'test - ۓےۑ'

jfrobishow 2010-04-30 13:45:18

You are absolutely correct. In my sample I was focusing on the option of specifying the collation for a specific query. I have now added it for completeness, though.

ntziolis 2010-04-30 16:10:05

THANK YOU VERY MUCH for your answer: in the end I chose ntziolis's answer because of its quickness and completeness. Still, I also upvoted yours, because you correctly pointed out what I was doing wrong. Well, if you come to Italy, I'll offer you a beer :)

Turro 2010-05-06 10:28:52

Glad you got it sorted out :)

jfrobishow 2010-05-06 10:44:52
