I have a database with collation SQL_Latin1_General_CP1_CI_AS. In that database I have a varchar field, and one row contains the string "ó" (a single character, 243 in code page 1252). I have a simple ASP page that sets the code page to 65001, reads that row (using ADODB), and sends it out to the browser. Everything works fine if the "current language for non-Unicode programs" is set to English. If I change that to Russian and browse to the page, I see "o". If I set a breakpoint in the server-side ASP page, it appears that ADO is already returning "o" instead of "ó".

Why does the "current language for non-Unicode programs" matter? The database has the data and is configured for the proper code page. I thought that ADO and VBScript stored everything internally as Unicode. It appears that somewhere the string is being converted to the code page specified by "current language for non-Unicode programs", but even that doesn't make much sense: I would expect to see "?" instead of "o" (though I don't really understand what handles the conversion from one code page to another or what rules it uses).

I understand that changing the column to nvarchar may help, but that doesn't explain why this is happening.

(edit) I now understand why "ó" is being converted to "o": it is the Windows best-fit mapping for code page 1251. See http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit1251.txt
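
To make the difference between "?" and "o" concrete, here is a rough sketch in Python (not the ASP/ADO code path itself). It assumes the stored byte is decoded to Unicode and then re-encoded to code page 1251 somewhere along the way. Python's own cp1251 codec has no best-fit table, so `BEST_FIT_1251` and `best_fit_encode` are hypothetical names, and the table is a hand-written one-entry stand-in for the mapping in the bestfit1251.txt file linked above.

```python
# Byte 0xF3 as stored in the varchar column (code page 1252).
stored = b"\xf3"
text = stored.decode("cp1252")  # U+00F3, "ó"

# A strict conversion to code page 1251 fails: "ó" has no exact match there.
try:
    text.encode("cp1251")
    strict_result = "encoded"
except UnicodeEncodeError:
    strict_result = "unencodable"

# A replacement-character conversion gives the "?" I was expecting.
replaced = text.encode("cp1251", errors="replace")

# Hypothetical one-entry excerpt standing in for the Windows best-fit table
# (assumption: U+00F3 maps to 0x6F, as observed on the Russian-locale server).
BEST_FIT_1251 = {"\u00f3": b"o"}


def best_fit_encode(s):
    """Encode to cp1251, falling back to the best-fit table, then to '?'."""
    out = bytearray()
    for ch in s:
        try:
            out += ch.encode("cp1251")
        except UnicodeEncodeError:
            out += BEST_FIT_1251.get(ch, b"?")
    return bytes(out)


lossy = best_fit_encode(text)        # the accent is silently dropped
round_trip = lossy.decode("cp1251")  # what the page ends up sending

print(strict_result, replaced, lossy, round_trip)
```

The point of the sketch is only that a best-fit conversion substitutes a similar-looking character instead of "?", which matches what I see when the system locale is Russian.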

I'm still trying to figure out how to get the code page 1252 string out of SQL Server and into VBScript without loss.