Hi there,

I have a database in SQL Server with a column that needs to hold Unicode data (it contains users' addresses from all over the world, e.g. القاهرة for Cairo).

This column is an nvarchar column with the database default collation (Latin1_General_CI_AS), but I've noticed that data inserted into it via SQL statements containing non-English characters is displayed as ?????.

The cause seems to be that I wasn't using the N prefix, i.e. I was writing:

INSERT INTO table (address) VALUES ('القاهرة')

Instead of:

INSERT INTO table (address) VALUES (N'القاهرة')

I was under the impression that Unicode would automatically be converted for nvarchar columns and I didn't need this prefix, but this appears to be incorrect.

The problem is I still have some data in this column which appears as ????? in SQL Server Management Studio and I don't know what it is!

Is the data still there but in an incorrect character encoding preventing it from displaying but still salvageable (and if so how can I recover it?), or is it gone for good?

Thanks,

Tom

+1  A: 

SSMS sometimes won't display all characters. I just tried what you had and it worked for me. Copy and paste it into Word and it might display correctly.

Usually, if SSMS can't display a character, it shows boxes, not ?.

SQLMenace
Word displays exactly the same as SSMS: "??????". In other tables where I am inserting Unicode data correctly, I do see Arabic characters.
Loftx
Then you are probably out of luck
SQLMenace
I had to +1 @SQLMenace's comment, since unfortunately I think that is the answer to Loftx's question.
Sung Meister
A: 

Try writing a small client that retrieves this data to a file or web page. Check ALL your code for any other inserts or updates that might convert the data to varchar before storing it in tables.
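As a starting point for that code check, here is a rough sketch that lists stored procedure and function parameters declared with non-Unicode character types, since those are one place where a silent conversion to varchar can happen. It only covers server-side routines, not SQL built in client code, and the list of types to look for is an assumption about what matters in your database.

```sql
-- List routine parameters declared with non-Unicode character types.
-- INFORMATION_SCHEMA.PARAMETERS covers stored procedures and functions
-- in the current database only.
SELECT SPECIFIC_SCHEMA,
       SPECIFIC_NAME,
       PARAMETER_NAME,
       DATA_TYPE
FROM INFORMATION_SCHEMA.PARAMETERS
WHERE DATA_TYPE IN ('char', 'varchar', 'text')
ORDER BY SPECIFIC_SCHEMA, SPECIFIC_NAME, ORDINAL_POSITION;
```

Any routine that shows up here and writes to the address column is worth a closer look.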

Piotr Rodak
+1  A: 

To find out what SQL Server really stores, use

SELECT CONVERT(VARBINARY(MAX), 'some text')

I just tried this with umlauted characters and Arabic (copied from Wikipedia; I have no idea what it says), both as plain strings and as N'' Unicode strings.

The results are that Arabic non-Unicode strings really end up as question marks (0x3F) in the conversion to VARCHAR.
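A minimal sketch of that check, assuming the database default collation uses code page 1252 (as Latin1_General_CI_AS does). The table name dbo.Addresses is hypothetical; address is the column from the question.

```sql
-- Compare the bytes of the same literal with and without the N prefix.
SELECT CONVERT(VARBINARY(MAX), N'القاهرة') AS with_n_prefix,    -- UTF-16 bytes, two per Arabic letter
       CONVERT(VARBINARY(MAX), 'القاهرة')  AS without_n_prefix; -- 3F (a question mark) for each letter

-- Look at what is actually stored in rows that display question marks.
-- dbo.Addresses is a hypothetical table name.
SELECT address,
       CONVERT(VARBINARY(MAX), address) AS stored_bytes
FROM dbo.Addresses
WHERE address LIKE N'%?%';
```

In an nvarchar column a stored question mark appears as 3F00 in stored_bytes. If the bytes for a damaged row are nothing but repeated 3F00, the original characters were replaced before they were stored and cannot be recovered from that row. Note that the second query also matches addresses that legitimately contain a question mark.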

devio