I'm trying to internationalize the questions in our survey tool, but when I insert some translated strings, SQL Server seems to strip off some, but not all, diacritics...

Example: (Lithuanian)

Ar jūsų darbas reikalauja, kad jūs įgytumėte naujų žinių ir įgūdžių?

Becomes

Ar jusu darbas reikalauja, kad jus igytumete nauju žiniu ir igudžiu?

Notice that the 'z' has kept its diacritic, while the 'u', 'i' and 'e' have lost theirs. The table column that holds the text is nvarchar; however, the table collation is 'Danish_Norwegian_CI_AS'.

Any suggestions?

EDIT 2010.08.16 11:17:

OK. I might have narrowed something down. It seems that the stored procedure I use to extract the sentence from the db is the one performing the stripping. It selects from several sources, all of which are nvarchar, using a UNION to combine everything into one query. Somewhere in there the characters are stripped.

... Hold on... I think I might have fracked up something along the way...

A: 

The collation settings won't affect data stored in columns of a Unicode type such as nvarchar. I would change the codepage and encoding of your file to UTF-8, and ensure that your table is storing the text as Unicode (nvarchar), and you should be all set.
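One possible explanation for why only the 'ž' survived can be sketched in a few lines of Python. This is a rough check, not a diagnosis: it assumes that the text passes through a non-Unicode (varchar) value somewhere on its way in or out, and that 'Danish_Norwegian_CI_AS' uses Windows code page 1252 for such values. The helper name `survives_cp1252` is made up for this illustration.

```python
import unicodedata

# The sentence from the question, normalized so each accented letter
# is a single precomposed code point.
ORIGINAL = unicodedata.normalize(
    "NFC",
    "Ar jūsų darbas reikalauja, kad jūs įgytumėte naujų žinių ir įgūdžių?",
)


def survives_cp1252(ch: str) -> bool:
    """Return True if the character exists in Windows code page 1252.

    Assumption: this is the code page a Danish_Norwegian collation
    uses for non-Unicode (varchar) data.
    """
    try:
        ch.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


# Collect the distinct non-ASCII letters and split them by whether
# they can be represented in the single-byte code page.
non_ascii = sorted({ch for ch in ORIGINAL if ord(ch) > 127})
kept = [ch for ch in non_ascii if survives_cp1252(ch)]
lost = [ch for ch in non_ascii if not survives_cp1252(ch)]

print("representable in cp1252:", kept)
print("not representable:", lost)
```

If the split matches the letters that were stripped, it would suggest looking for an implicit conversion to varchar, for example a string literal, parameter, or intermediate column that is not declared as a Unicode type.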

NinjaCat
SQL Server Management Studio now returns the correct values, but both ASP 3.0 and ASP.NET 3.5 return the string without the "more exotic" diacritics. I have checked my settings and all the pages are saved in UTF-8. The response headers also report the correct content type...
Christian W
A couple of things to look at: 1) What is your content type header, and 2) are your pages saved in UTF-8 encoding? In Firefox, right-click and choose View Page Info to get this information...
NinjaCat
Content-Type: text/html; charset=utf-8. Page is saved as UTF-8 according to Visual Studio. Still no luck. Now I have even encoded the difficult ones. The source string now says: Ar jūsų darbas reikalauja, kad jūs įgytumėte naujų žinių ir įgūdžių? And I know this one is correct because if I create a normal HTML file with this content the sentence is displayed correctly...
Christian W
When you display the HTML encoded characters like you have in this comment, how does it appear?
NinjaCat
It appears correctly. However, see edit in question.
Christian W