I set the culture to Hungarian, and Chr() seems to be broken.

System.Threading.Thread.CurrentThread.CurrentCulture = New System.Globalization.CultureInfo("hu-US")
System.Threading.Thread.CurrentThread.CurrentUICulture = New System.Globalization.CultureInfo("hu-US")

Chr(254)

This returns "ţ" when it should be "þ".

However, Asc("ţ") returns 116.

Likewise, Asc(Chr(254)) returns 116.

Why would Asc() and Chr() disagree?

I checked, and the 'wide' functions do work correctly: AscW(ChrW(254)) = 254

A: 

It sounds like you need to set the code page for the current thread. The current culture affects Asc and Chr only indirectly, through the ANSI code page it selects.

Both the Chr docs and the Asc docs have this line:

The returned character depends on the code page for the current thread, which is contained in the ANSICodePage property of the TextInfo class. TextInfo.ANSICodePage can be obtained by specifying System.Globalization.CultureInfo.CurrentCulture.TextInfo.ANSICodePage.
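To see which code page is actually in effect, one option is to print the property named in the quote and decode byte 254 under two explicit code pages. This is only a sketch: it assumes the .NET Framework (where code pages 1250 and 1252 are available without registering an encoding provider), and the culture name "hu-HU" and the module name are stand-ins chosen for illustration, not taken from the question.

```vb
Imports System
Imports System.Globalization
Imports System.Text
Imports System.Threading

Module CodePageCheck
    Sub Main()
        ' Assumption: "hu-HU" stands in for the culture the legacy code runs under.
        Thread.CurrentThread.CurrentCulture = New CultureInfo("hu-HU")

        ' The property the Chr documentation refers to.
        Dim ansi As Integer = CultureInfo.CurrentCulture.TextInfo.ANSICodePage
        Console.WriteLine("ANSI code page: " & ansi)

        ' Decode the single byte 254 under two explicit code pages and
        ' print the resulting Unicode code point in hex.
        Dim raw() As Byte = {254}
        For Each cp As Integer In New Integer() {1250, 1252}
            Dim s As String = Encoding.GetEncoding(cp).GetString(raw)
            Console.WriteLine("cp" & cp & ": U+" & AscW(s(0)).ToString("X4"))
        Next
    End Sub
End Module
```

In code page 1250 (Central European) byte 254 is "ţ" (U+0163), while in code page 1252 (Western European) it is "þ" (U+00FE), which matches the two characters in the question. There is no separate setting for the thread's code page in this model; it follows from the culture assigned to the thread.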

Mark Rushakoff
Interestingly, the assertion vanishes on the Asc() page when you move to the 3.5 Framework. The mind boggles.
David Schmitt
I am working with legacy code that I cannot modify, so it sounds like changing the current code page is the direction I need to go. I know how to set CultureInfo and RegionInfo, but not how to set the current thread's code page. Can anyone show me how to do this, please?
Jim
+2  A: 

Chr(254) interprets the argument in a system-dependent way, by looking at the System.Globalization.CultureInfo.CurrentCulture.TextInfo.ANSICodePage property. See the MSDN article about Chr. You can check whether that value is what you expect. "hu-US" (the Hungarian locale as used in the US) might do something strange there.

As a side note, the current documentation for Asc() makes no promise about which code page is used (the statement was there up to the 3.0 version).

Generally I would stick to the Unicode variants (the ones ending in W) if at all possible, or use the Encoding class to specify the conversions explicitly.
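As a rough illustration of both suggestions, the sketch below round-trips 254 through ChrW/AscW and then through an Encoding object with a named code page. It assumes the .NET Framework and picks code page 1252 purely as an example; the module and variable names are hypothetical.

```vb
Imports System
Imports System.Text

Module ExplicitConversion
    Sub Main()
        ' ChrW/AscW work on Unicode code points and do not consult the thread culture.
        Dim thorn As Char = ChrW(254)
        Console.WriteLine(AscW(thorn))   ' prints 254

        ' Name the code page explicitly instead of relying on the thread culture.
        ' Assumption: 1252 is the code page the data was written in.
        Dim western As Encoding = Encoding.GetEncoding(1252)
        Dim decoded As String = western.GetString(New Byte() {254})
        Dim back() As Byte = western.GetBytes(decoded)
        Console.WriteLine(back(0))       ' prints 254
    End Sub
End Module
```

Because the code page is stated in the code, the result no longer changes when the thread's culture does.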

David Schmitt
+1  A: 

My best guess is that Windows tries to represent Chr(254)="ţ" as a combined letter, where the first letter is Chr(116)="t" and the second ("¸" or something like that) cannot be returned because Chr() only returns one letter.

Unicode text should not be handled character-by-character.

Lars D
Sounds probable. We'll really only know by looking at the code page for 'hu-US'.
David Schmitt
This blog post explains exactly what you are saying: "Best Fit in WideCharToMultiByte and System.Text.Encoding Should be Avoided" http://blogs.msdn.com/shawnste/archive/2006/01/19/515047.aspx
Jim
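One way to check whether a best-fit substitution is involved is to encode "ţ" twice: once with the default fallback, and once with a fallback that throws on characters the code page cannot represent. This is a sketch under assumptions: it targets the .NET Framework, uses code page 1252 as the example target, and the module and variable names are made up for illustration.

```vb
Imports System
Imports System.Text

Module BestFitCheck
    Sub Main()
        Dim tCedilla As String = ChrW(&H163)   ' "ţ", U+0163

        ' Default fallback: an unrepresentable character may be replaced by a
        ' similar-looking one (best fit) or by "?", depending on the platform.
        Dim lenient As Encoding = Encoding.GetEncoding(1252)
        Console.WriteLine(lenient.GetBytes(tCedilla)(0))

        ' Exception fallback: refuse to substitute and report the problem instead.
        Dim strictEnc As Encoding = Encoding.GetEncoding(1252, _
            EncoderFallback.ExceptionFallback, _
            DecoderFallback.ExceptionFallback)
        Try
            strictEnc.GetBytes(tCedilla)
        Catch ex As EncoderFallbackException
            Console.WriteLine("U+0163 has no byte in code page 1252")
        End Try
    End Sub
End Module
```

If the first line prints 116, the byte for "t", that would be consistent with the Asc("ţ") result reported in the question.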