hi there

I just transferred some data from MySQL to SQL Server 2005. In a text field, some of my characters, such as apostrophes, now show up as ? (question mark). To me this indicates some sort of collation or character set error, right?

To be honest, I don't know which one I should be using.

The MySQL database's current collation is utf8_general_ci, and in SQL Server it is SQL_Latin1_General_CP1_CI_AS.

I have tried changing the collation of the MySQL table to latin1_swedish_ci, but this doesn't help.

Thanks for the input

+1  A: 

Have you tried changing the target (SQL Server) column data type to NVARCHAR?

The utf8_general_ci collation on the MySQL column indicates a Unicode data type. If the source is Unicode, the target should be Unicode too, for the easiest transition.

Collations themselves play a minor role here. They just affect comparison and sorting.
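
To illustrate why a question mark points at a character set conversion rather than at sorting rules, here is a minimal Python sketch. It assumes the lost apostrophes are "smart" apostrophes (U+2019), which the question does not confirm, and the helper name `to_code_page` is made up for this example. It only mimics what a lossy converter does and is not how SSIS or the ODBC driver is implemented.

```python
# Sketch: a lossy code page conversion replaces unmappable characters with "?".
# Assumption: the problem character is U+2019 (RIGHT SINGLE QUOTATION MARK).
text = "It\u2019s"


def to_code_page(value: str, codec: str) -> bytes:
    """Encode like a lossy converter: unmappable characters become b'?'."""
    return value.encode(codec, errors="replace")


ascii_bytes = to_code_page(text, "ascii")      # U+2019 does not exist in ASCII
latin1_bytes = to_code_page(text, "latin-1")   # nor in ISO-8859-1
cp1252_bytes = to_code_page(text, "cp1252")    # Windows-1252 maps it to 0x92

# A Unicode encoding (NVARCHAR stores UTF-16 code units) round-trips unchanged.
utf16_roundtrip = text.encode("utf-16-le").decode("utf-16-le")

print(ascii_bytes)              # b'It?s'
print(latin1_bytes)             # b'It?s'
print(cp1252_bytes)             # b'It\x92s'
print(utf16_roundtrip == text)  # True
```

Once the character has been replaced by a literal ?, no later change of column type or collation can bring it back, so the data has to be re-imported through a chain that stays Unicode end to end.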

Tomalak
Hi there, the target was nvarchar all along and it made no difference. Thanks for the answer though.
Miau
Can you shed some light on the relevant technical details of the process? What MySQL/SQL Server version, MySQL ODBC driver version, SSIS package/task settings etc.? The question marks clearly indicate a character set conversion at some point in the processing chain.
Tomalak
MySQL 5.0.27. SSIS package settings? Which ones could be relevant? SQL Server version 2005. MySQL ODBC 5.2.2. Cheers
Miau
Maybe you are processing the data on the way and losing your Unicode characters there? Is a query or a column transformation involved? Can you double-check that both columns are Unicode? BTW, such technical details should go right into the question so others can see them. Maybe I'm on the wrong track after all.
Tomalak
A: 

You might also need to check the SSIS type of the columns in your data flow. Remember, the data type and character set are set at the connection manager on the source (and that may involve a conversion from the original native character set). Also, any operations like derived columns or conversions will have a character set, which can be altered and will persist down that column's lineage in the data flow. At the end, when the data gets to the destination, there could be additional character set coercion/conversion.
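
One way to reason about this is to treat the data flow as a chain of stages, each with its own character set, and look for the first stage that cannot carry a sample value through unchanged. The Python sketch below models that idea only. The stage names and code pages are hypothetical, `first_lossy_stage` is a made-up helper, and nothing here calls SSIS or reads a real package.

```python
from typing import List, Optional, Tuple


def first_lossy_stage(sample: str, stages: List[Tuple[str, str]]) -> Optional[str]:
    """Return the name of the first stage whose codec alters `sample`, else None."""
    current = sample
    for name, codec in stages:
        # Encode with replacement, then decode again, like a lossy conversion.
        converted = current.encode(codec, errors="replace").decode(codec)
        if converted != current:
            return name
        current = converted
    return None


# Hypothetical pipeline: every name and code page below is an assumption.
pipeline = [
    ("MySQL column (utf8)", "utf-8"),
    ("source connection (assumed non-Unicode)", "latin-1"),
    ("derived column", "cp1252"),
    ("NVARCHAR destination", "utf-16-le"),
]

sample = "It\u2019s"  # assumed problem character: U+2019
print(first_lossy_stage(sample, pipeline))
# source connection (assumed non-Unicode)
```

In a real package the same check is done by hand: inspect the data type and code page of the column at each component, from the source connection manager down to the destination, and find the first one that is not Unicode.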

Cade Roux
The DefaultCodePage of the OLE DB Source/Destination components might also be worth a look.
Tomalak