I've made a program in MSVC++ which outputs memory contents (in ASCII). The characters I see in the Windows console seem to match what I see in various ASCII tables (smiley, diamond, club, right arrow, etc.). This program needs to compile under Linux (which it does), but the output there looks completely different: a few symbols are the same, and the rest are entirely different. Is there any way to change how the terminal displays these character codes?

EDIT: The program executes correctly; it's just the characters that are displayed differently.

+3  A: 

ASCII defines character codes from 0x00 through 0x7f. Everything else (0x80-0xff) is not part of the ASCII standard, and which characters are displayed for those codes depends on the operating system. However, the characters you mention (smiley, diamond, club, etc.) are representations of the ASCII control characters (0x00-0x1f), which don't normally have a visual representation at all. Windows lets you print such characters and see the glyphs it has defined for them, but your Linux terminal is probably interpreting the control characters as formatting control codes (which is what they are) instead of printing glyphs for them.
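
If the goal is just to make the dump look the same everywhere, a portable option is to never print raw control bytes in the first place. A minimal hex-dump sketch in standard C++ (the function name and layout here are purely illustrative, not taken from the poster's program):

    #include <cstdio>
    #include <cstddef>
    #include <cctype>

    // Dump a buffer as hex plus printable ASCII; bytes outside the
    // printable ASCII range are shown as '.' so the output is identical
    // on Windows and Linux.
    void dump(const unsigned char* buf, std::size_t len) {
        for (std::size_t i = 0; i < len; i += 16) {
            std::printf("%08zx  ", i);                          // offset
            for (std::size_t j = 0; j < 16; ++j) {              // hex column
                if (i + j < len) std::printf("%02x ", buf[i + j]);
                else             std::printf("   ");
            }
            std::printf(" ");
            for (std::size_t j = 0; j < 16 && i + j < len; ++j) // ASCII column
                std::putchar(std::isprint(buf[i + j]) ? buf[i + j] : '.');
            std::printf("\n");
        }
    }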

Greg Hewgill
@Adam: that question relates to UTF-8 encoding, which is a whole other topic and nicely covered by Joel's article on the topic: http://www.joelonsoftware.com/articles/Unicode.html
Greg Hewgill
@Greg: thanks, you're absolutely right. I'll delete my first comment lest others become as confused as I was.
Adam Bernier
Thanks for clearing that up!
Mikey D
@Greg, that is *so* wrong. ASCII defines 0x00 thru 0x7f, including the control characters below 0x20. Not downvoting (yet :-), I'll give you a chance to fix it (or argue your case sufficiently).
paxdiablo
@Pax: Thanks, you're quite right, my terminology was inaccurate. I've corrected the answer.
Greg Hewgill
Yeah, I wasn't actually sure whether you meant ASCII defines what graphemes are printed only for the printable range (which would be an arguable case I would have accepted). +1.
paxdiablo
+1  A: 

What you are seeing is the "extended" character set that IBM included when PCs were first unleashed upon the world (code page 437). Yes, we are going back to the age of mighty dinosaurs, so bear with me. These characters live above 0x7f, and how their symbols are rendered on screen can even be influenced by the font chosen. Most Linux distros now use UTF-8 (or something close), so the installed fonts may have completely different symbols, or even missing glyphs. When you compare "ASCII" representations of the same data (a misnomer, since this isn't really true ASCII), they may or may not match exactly: both display fonts must render the same glyphs for you to see similar output. Try getting both your Windows and Linux installs to use the same font if possible, and then see if there is a change.
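
If you want the old DOS glyphs back on a UTF-8 terminal instead, one approach (a sketch only, assuming glibc's iconv, which knows the DOS code page under the name "CP437") is to transcode the bytes before printing:

    #include <iconv.h>
    #include <cstdio>
    #include <cstddef>

    // Transcode a CP437 byte buffer to UTF-8 and print it.
    // Assumes glibc's iconv; other platforms may spell the name "IBM437".
    void print_cp437(const char* bytes, std::size_t len) {
        iconv_t cd = iconv_open("UTF-8", "CP437");
        if (cd == (iconv_t)-1) { std::perror("iconv_open"); return; }

        char out[4096];          // each CP437 byte becomes at most 3 UTF-8 bytes;
                                 // a fixed buffer is fine for a sketch
        char* in_p  = const_cast<char*>(bytes);   // POSIX iconv takes char**
        char* out_p = out;
        std::size_t in_left = len, out_left = sizeof(out);

        if (iconv(cd, &in_p, &in_left, &out_p, &out_left) != (std::size_t)-1)
            std::fwrite(out, 1, sizeof(out) - out_left, stdout);

        iconv_close(cd);
    }

With a font that covers the box-drawing and symbol ranges, this reproduces the Windows console rendering fairly closely.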

Avery Payne
A: 

If your browser supports Unicode (and you have the correct fonts installed), you will see the glyphs below. You can copy and paste them into an editor with Unicode support (e.g. Notepad) and save as UTF-16BE; if you then open the file in a hex editor, you will see the Unicode code point of each visible glyph. For example, the CP437 smiley at code 0x01 corresponds to the Unicode glyph U+263A, which in C/C++/Java you can write as \u263A. It is not the control character itself, only its visual representation (a small compilable example follows the glyph list below).

http://en.wikipedia.org/wiki/Code_page_437

☺☻♥♦♣♠•◘○◙♂♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~⌂ÇüéâäàåçêëèïîìÄÅÉæÆôöòûùÿÖÜ¢£¥₧ƒáíóúñѪº¿⌐¬½¼¡«»░▒▓│┤╡╢╖╕╣║╗╝╜╛┐└┴┬├─┼╞╟╚╔╩╦╠═╬╧╨╤╥╙╘╒╓╫╪┘┌█▄▌▐▀αßΓπΣσµτΦΘΩδ∞φε∩≡±≥≤⌠⌡÷≈°∙·√ⁿ²■
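
As a concrete illustration of the \u escape mentioned above, here is a minimal sketch (assuming a UTF-8 terminal and a compiler whose execution character set is UTF-8, the default for GCC and Clang on Linux):

    #include <cstdio>

    int main() {
        // U+263A, the Unicode equivalent of the CP437 smiley at byte 0x01.
        std::printf("\u263A\n");        // universal character name
        std::printf("\xE2\x98\xBA\n");  // the same code point as raw UTF-8 bytes
        return 0;
    }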

cgarcia109