I have created an MFC application from scratch, being careful from the start to use Unicode-aware types such as CStringW, LPCWSTR, etc. to store and process data. UNICODE is also defined in the project.

Since I speak only one language, I tried the following test to check that a Unicode string was processed and stored correctly by the application.

In one of the edit boxes I entered ALT + 2061 and ALT + 2066 to display symbols not available on my keyboard. The only thing displayed in the edit box is a square. I tried the same thing in Notepad and the symbols were displayed correctly. Is this just a font issue? If so, what font should I be using?

UPDATE:

I copied several symbols from the unibook and pasted them into the Edit boxes. Apart from a small few symbols they were processed and saved correctly so I am happy with that.

Happy Christmas.

A: 

Yes. This is a font issue.

I would not worry about this, because the fonts installed on your customer's machine will be different from yours. If they need kanji, they will have a font for it. You can, though, force a specific font; then you will see the characters correctly on your own Windows installation.

I also recommend against doing Unicode with the W (wide-char) types. See http://stackoverflow.com/questions/1049947/should-utf-16-be-considered-harmful

Pavel Radzivilovsky
Thanks. Is there an easy way to test that the app does indeed process Unicode data correctly? I would prefer to get it right before an end user discovers a problem, but maybe I am being overcautious? Thanks also for the link re Unicode. The data in this app is converted to UTF-8 before it is stored.
Canacourse
Wrong - in general, if Notepad displays the character correctly, it's not a font issue, because Windows will try to substitute a suitable font in order to display the character.
oefe
The answers to the question linked largely agree that each application should use whatever encoding works best for its circumstance, not to avoid UTF-16. UTF-8, which is also a variable-length encoding, is pretty poor for storing lots of non-Western text or symbols.
brianary
Brianary (Great Name): Some of the tools used to process the data exported from this application (to a file) only worked if the file was UTF8 encoded. Do you mean poor from a storage point of view?
Canacourse
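On the storage question raised in the comments above, here is a rough sketch of how many bytes a single code point takes in each encoding. The function names `utf8_bytes` and `utf16_bytes` are made up for illustration, and the sketch assumes its input is a valid Unicode scalar value (it does not reject surrogates or values above U+10FFFF).

```cpp
#include <cstddef>

// Bytes needed to encode one code point in UTF-8 (assumes a valid scalar value).
constexpr std::size_t utf8_bytes(char32_t cp) {
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// Bytes needed to encode one code point in UTF-16:
// one 16-bit unit inside the BMP, a surrogate pair (two units) above it.
constexpr std::size_t utf16_bytes(char32_t cp) {
    return cp < 0x10000 ? 2 : 4;
}
```

For example, `utf8_bytes(U'A')` is 1 while `utf16_bytes(U'A')` is 2, but for U+266A (the music note discussed in this thread) UTF-8 needs 3 bytes and UTF-16 needs 2. Which encoding ends up smaller therefore depends on the mix of characters in the data.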
+1  A: 

U+2061 is "FUNCTION APPLICATION", which is a special nonprinting "operator" character, U+2066 is not defined yet (as of Unicode 5.2). Thus, what you see in your application is correct; probably you entered different codes in Notepad?

oefe
Just tried again. ALT + 2061 gives a music notation symbol; ALT + 2066 gives a double-headed vertical arrow.
Canacourse
I see the same behavior. Music and double headed vertical arrow.
Nate Bross
Those keystrokes don't map directly to the corresponding Unicode code points. ALT+2061 = U+266A, ALT+2066 = U+2195.
Mark Ransom
+2  A: 

Make sure that the Alt-key input method you are describing works in other programs. It doesn't seem to work on my WinXP system.

You may just want to download Unibook from unicode.org and copy the chars from that just to be sure.

brianary
+2  A: 

It is not directly a font issue; it's an issue with your ALT+number method.

Using the Alt key and the number pad, it's possible to enter any ASCII code, even those that don't have a key on the keyboard. It is not possible to enter Unicode characters in this manner: every time you enter a 4-digit code, it just gets wrapped around to an 8-bit ASCII code:

2061 -> 100000001101 -> 00001101 -> ASCII character 13, a musical note

The reason it shows up as a square is because character 13 is a control character, so most modern fonts have no visual representation.
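The wrap-around above can be sketched in a few lines. This is an illustration only: `alt_code_low_byte` and `cp437_glyph` are hypothetical names, and the two-entry lookup assumes the low byte is interpreted in OEM code page 437, which is consistent with the music note and double-headed arrow reported elsewhere in this thread.

```cpp
#include <cstdint>

// Keep only the low 8 bits of the number typed after ALT (the wrap-around).
constexpr std::uint8_t alt_code_low_byte(unsigned typed) {
    return static_cast<std::uint8_t>(typed % 256);
}

// Hypothetical partial lookup, assuming code page 437 glyphs.
// Covers only the two values from this thread; returns 0 for anything else.
constexpr char32_t cp437_glyph(std::uint8_t b) {
    return b == 13 ? U'\u266A'   // eighth note
         : b == 18 ? U'\u2195'   // up-down arrow
         : U'\0';
}
```

With these, `alt_code_low_byte(2061)` is 13 and `alt_code_low_byte(2066)` is 18, and the lookup maps them to U+266A and U+2195 respectively.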

If you want to enter Unicode characters using the number pad, you'll need to use the extended mode Microsoft added: press and hold Alt, then press and hold the plus (+) key on the number pad. Now type the number of the Unicode character you want and then release both keys.

David
I got the ALT + method here: http://www.fileformat.info/tip/microsoft/enter_unicode.htm I assumed it was correct.
Canacourse
Just reread your answer. I was not using ALT + Number. I was using the method you suggested.
Canacourse
As in your link, a registry setting may be needed to enable the plus key unicode entry mode.
David
A: 

Go to Spy++ and look at the window class. Is it Edit, or is it some variant of RICHEDIT? A RichEdit window requires you to specify the font, which you can do with CRichEditCtrl::SetDefaultCharFormat.

Mark Ransom
A: 

I suggest you call the function IsWindowUnicode() on the edit control somewhere in your app. If it returns TRUE, then you know for sure that your app and your controls are Unicode-aware (and then it most likely is a font issue). If it returns FALSE, however, you have to look through your code and your project settings to find out why your app/window is not Unicode-aware.

Stefan