I have to support an outdated CMS that has some parts written in ASP and some in PHP, and uses SQL Server as the back end. Virtually everything in that system was written with two supported input languages in mind: Latvian and English. It therefore uses windows-1257 encoding in all web pages that use the CMS, as well as in all admin pages. In the database, the default collation is Latvian_CI_AS.

Now the owner of the system also wants to support Russian, and IMO the best way to go is to convert everything to UTF-8.

The big question is: how do I convert everything that is stored in the database to UTF-8? My background is MySQL and I'm by no means proficient in SQL Server, so I don't know how to change the collation for the whole database. Do I have to fetch all the data from the database, convert it to UTF-8 with iconv, and push it back into the database?

I understand I would have to change the encoding for all client web pages and scripts, but my main concern is the database.

A: 

Your question is not clear enough. Do you

1) want to make a separate copy of your database for your Russian customers?

2) want to support Russian in the same DB that already supports English and Latvian?

Here are my answers for both.

1) In SQL Server Management Studio, right-click the database and select Tasks -> Generate Scripts, set "Script all objects in selected database" to true, and click Finish. This scripts all database objects. Then open the script in any text editor and replace every occurrence of 'Latvian_CI_AS' with 'Cyrillic_General_CI_AS'.

NOTE: the database collation can be changed with ALTER DATABASE, but that only sets the default for new objects. It does not change the collation of columns in existing tables; those have to be changed column by column with ALTER TABLE.


2) If you want full Unicode support in your database (so that Latvian, English and Russian words can be stored in the same column), you need to convert all VARCHAR, CHAR and TEXT columns to NVARCHAR, NCHAR and NTEXT.

In this case I would also recommend generating the database script and replacing VARCHAR with NVARCHAR, CHAR with NCHAR, and TEXT with NTEXT.
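
A caveat on doing this by hand: a plain search-and-replace run in sequence will also hit the CHAR inside VARCHAR and any column that is already NVARCHAR. Here is a minimal sketch of a safer rewrite, assuming the generated script is available as a string. The function name to_unicode_types and the sample table are made up for illustration, and the result should still be reviewed, because a column or string literal that happens to be called Text or Char would be rewritten as well.

```python
import re

# Non-Unicode types and their Unicode counterparts, as listed above.
TYPE_MAP = {"VARCHAR": "NVARCHAR", "CHAR": "NCHAR", "TEXT": "NTEXT"}

# The word boundaries (\b) stop CHAR from matching inside VARCHAR and
# leave existing NVARCHAR / NCHAR / NTEXT declarations untouched.
_TYPE_RE = re.compile(r"\b(VARCHAR|CHAR|TEXT)\b", re.IGNORECASE)


def to_unicode_types(ddl: str) -> str:
    """Return the DDL script with each non-Unicode type replaced (hypothetical helper)."""
    return _TYPE_RE.sub(lambda m: TYPE_MAP[m.group(1).upper()], ddl)


if __name__ == "__main__":
    # Made-up sample in the bracketed style that generated scripts use.
    sample = (
        "CREATE TABLE [dbo].[Page]([Title] [varchar](100) NOT NULL, "
        "[Code] [char](2), [Body] [text], [Note] [nvarchar](50))"
    )
    print(to_unicode_types(sample))
```

All three replacements happen in a single pass, so the order of the replacements no longer matters.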

Then use the Tasks -> Import Data... wizard to transfer the data from the old database to the new one (if you need the data).

Another possibility, if you have only a few tables, is to open SQL Server Management Studio and change the column types to Unicode types manually.

One could say this is also possible with an alter table t alter column name nvarchar(size) script, but if any constraints or defaults are attached to the column you will get the error "ALTER TABLE ALTER COLUMN name failed because one or more objects access this column." You would then need to drop and re-create the constraints in your script, and this could be a nightmare...
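
To show what the drop/re-create dance looks like for a single column with a default, here is a rough sketch that only builds the three T-SQL statements in the order they have to run. The helper alter_column_with_default and the table, column and constraint names are hypothetical, the identifiers are pasted in without any quoting, and other dependent objects (indexes, check constraints, foreign keys) would need the same treatment.

```python
def alter_column_with_default(table, column, new_type, constraint,
                              default_sql, nullable=True):
    """Build the statements needed to retype a column that has a DEFAULT.

    Hypothetical helper: it only returns strings and assumes trusted,
    already valid identifiers.
    """
    null_sql = "NULL" if nullable else "NOT NULL"
    return [
        # 1. The default constraint blocks ALTER COLUMN, so drop it first.
        f"ALTER TABLE {table} DROP CONSTRAINT {constraint};",
        # 2. Change the column to the Unicode type.
        f"ALTER TABLE {table} ALTER COLUMN {column} {new_type} {null_sql};",
        # 3. Put the default back under the same name.
        f"ALTER TABLE {table} ADD CONSTRAINT {constraint} "
        f"DEFAULT {default_sql} FOR {column};",
    ]


if __name__ == "__main__":
    # Made-up names, for illustration only.
    for stmt in alter_column_with_default(
        "dbo.Page", "Title", "NVARCHAR(100)", "DF_Page_Title", "N''",
        nullable=False,
    ):
        print(stmt)
```

Multiply that by every column with a dependency and it becomes clear why re-creating the database from a script is usually less painful.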

As the CMS developer you will have to copy the database in any case, because once you convert everything in the DB to Unicode, I bet the ASP/PHP code will stop working as expected in some places. Having both the old and the new copy will let you debug and fix problems one by one.
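
One typical breakage on the page side is text that is still windows-1257 being treated as UTF-8, or the other way round. Below is a small sketch of the byte-level difference, using Python's cp1257 codec as a stand-in for iconv. The helper names are made up for illustration, and nothing here talks to SQL Server.

```python
def cp1257_to_utf8(raw: bytes) -> bytes:
    """Re-encode windows-1257 bytes as UTF-8 (what iconv would do)."""
    return raw.decode("cp1257").encode("utf-8")


def decodes_as_utf8(raw: bytes) -> bool:
    """True if the bytes happen to be valid UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def fits_in_cp1257(text: str) -> bool:
    """True if every character of the text exists in windows-1257."""
    try:
        text.encode("cp1257")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    legacy = "Rīga".encode("cp1257")                 # bytes as the old pages send them
    print(decodes_as_utf8(legacy))                   # old bytes are not valid UTF-8
    print(decodes_as_utf8(cp1257_to_utf8(legacy)))   # valid after re-encoding
    print(fits_in_cp1257("Рига"))                    # Cyrillic has no place in windows-1257
```

Any page, form handler or include file that is still saved or served as windows-1257 after the switch will show this kind of mismatch, which is why keeping the old copy around for comparison helps.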

Bogdan_Ch
A: 

Thanks for the reply. I'm sorry that my question was too vague. Basically, I'd like to convert the whole database to Unicode, because I'm not sure the customers will stop at Russian. They might require some German symbols or something like that, so I'd better convert the whole database to Unicode and be done with it.

Please update/edit your question with this new information instead of posting it as an answer.
random
So my answer in section 2) should work. You can re-create the database from the scripts and you will have a new database. If you have only a few tables, you could change the column types manually, table by table, in SQL Server Management Studio.
Bogdan_Ch
A: 
SergeyKazachenko