Hi, I read Wikipedia, but I do not understand whether extended ASCII is still just ASCII and whether it is available on any computer that would run my console application. Also, if I understand it correctly, I can write an ASCII char only by using its Unicode code in VB or C#. Thank you.

A: 

As Wikipedia says, ASCII is only 0-127. "Extended ASCII" is a misnomer that should be avoided; it is used loosely to mean "some other character set based on ASCII which only uses single bytes" (meaning not multibyte like UTF-8). Sometimes the term means the 128-255 codepoints of that specific character set, but again, it's vague and you shouldn't count on it meaning anything specific.

The use of the term is sometimes criticized, because it can be mistakenly interpreted that the ASCII standard has been updated to include more than 128 characters or that the term unambiguously identifies a single encoding, both of which are untrue.

Source: http://en.wikipedia.org/wiki/Extended_ASCII

Roger Pate
+3  A: 

ASCII only covers the characters with value 0-127, and those are the same on all computers. (Well, almost, although this is mostly a matter of glyphs rather than semantics.)

Extended ASCII is a term for various single-byte code pages that assign various characters to the range 128-255. There is no single "extended ASCII" set of characters.

In C# and VB.NET, all strings are Unicode, so by default, there's no need to worry about this - whether or not a character can be displayed in a console app is a matter of the fonts being used, not the limitation of any specific single-byte code page.
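
To make the 0-127 boundary concrete, here is a minimal sketch that checks whether a .NET string stays inside the ASCII range. The class and method names (`AsciiCheck`, `IsPureAscii`, `IsPrintableAscii`) are made up for illustration and are not part of any library.

```csharp
using System;
using System.Linq;

static class AsciiCheck
{
    // Hypothetical helper: true when every UTF-16 code unit is in the ASCII range 0-127.
    public static bool IsPureAscii(string text) => text.All(c => c <= 127);

    // Hypothetical helper: true when every character is printable ASCII (32-126),
    // i.e. no control characters such as U+0002.
    public static bool IsPrintableAscii(string text) => text.All(c => c >= 32 && c <= 126);

    static void Main()
    {
        Console.WriteLine(IsPureAscii("Hello"));        // True
        Console.WriteLine(IsPureAscii("caf\u00E9"));    // False: U+00E9 is outside ASCII
        Console.WriteLine(IsPrintableAscii("a\u0002")); // False: U+0002 is a control character
    }
}
```

A string that passes `IsPrintableAscii` should be safe to show in practically any console; anything else depends on the console's font and encoding.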

Michael Madsen
You don't write software that runs on EBCDIC systems?! :P
Roger Pate
Congrats on 10k!
Roger Pate
@Roger: No, no I don't. And I don't think the OP will do that either :) (Also, thanks.)
Michael Madsen
Thanks, also if I use only the first 127, I can be sure they will be displayed well, right?
Mojmi
@Mojmir: Assuming you don't use any non-printable characters, and we ignore the issue about the glyph used for a backslash on a Japanese or Korean system, then yes.
Michael Madsen
Come on everyone, let's get him down below 10k again so he doesn't get too cocky. :)
bzlm
@bzlm: He'd need posts worth downvoting for that. When I saw he was just shy of 10k I looked through his answers, and didn't see any meriting that (but I was looking for ones worth upvoting instead ;).
Roger Pate
@Michael Madsen, and what about non-printable chars? I just do not know what is bad about using them, if I make a simple app and want to display some of the standard ASCII non-printable chars.
Mojmi
@Mojmir It's hard to understand what your question is. Are you asking whether what you see in your console output will work for any user anywhere regardless of which ASCII character you use?
bzlm
Well, they're *non-printable*, therefore, you can't really count on anything sensible happening if you try to print them. If you're thinking of the glyphs you could usually show for those in the old DOS days, there are equivalent Unicode characters for those, but depending on the console font, you may not be able to display all of them - you'll have to try for yourself. See [Wikipedia's page on code page 850](http://en.wikipedia.org/wiki/Code_page_850) for an example of this mapping.
Michael Madsen
@bzlm, yes, basically. If I use only the first 127 chars - uc0002 etc.
Mojmi
@Michael Madsen Thanks. So, if I am able to display, say, this uc0002 (smile) in the console app, then (as C# .NET uses Unicode) everyone will. I was only confused about whether this char is ASCII or not.
Mojmi
@Mojmir: No, Unicode character 0002 is not a smile. Look at the hexadecimal number below it; *that's* the Unicode value of that character.
Michael Madsen
@Michael Madsen Well, so what is the reason \u0002 prints that smile? I do not get it :(
Mojmi
@Mojmir: Because that's how those really old code pages were defined, so the glyphs are in that particular font for legacy reasons - it's not something you should depend on, because it's officially a [control character with no formally defined glyph](http://www.fileformat.info/info/unicode/char/0002/index.htm). Use the proper smile glyph from Unicode if you really need it.
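
A small sketch of the difference, assuming the glyph in question is the one the old DOS code pages showed at position 0x02, which Unicode encodes as U+263B (BLACK SMILING FACE). The class name `SmileDemo` and its constants are illustrative only, and whether the glyph actually renders still depends on the console font.

```csharp
using System;
using System.Text;

static class SmileDemo
{
    public const char ControlStx = '\u0002'; // START OF TEXT: a control character, no defined glyph
    public const char BlackSmile = '\u263B'; // BLACK SMILING FACE: the real Unicode smile

    static void Main()
    {
        // Ask the console to emit UTF-8 so non-ASCII characters are not replaced with '?'.
        Console.OutputEncoding = Encoding.UTF8;

        Console.WriteLine(char.IsControl(ControlStx)); // True
        Console.WriteLine(char.IsControl(BlackSmile)); // False
        Console.WriteLine(BlackSmile);                 // rendering depends on the console font
    }
}
```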
Michael Madsen
Ok, thanks. That was what I needed to know.
Mojmi
A: 

As others have said, true ASCII is always the lower 7 bits of each byte. Before the advent (and ubiquity) of Unicode standards, various extensions to the ASCII character set that utilized the eighth bit were released. The most common in the Windows world is Windows code page 1252.

If you're looking to use this encoding in .NET, you can get it like this:

Encoding windows1252 = Encoding.GetEncoding("windows-1252");
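
As a rough illustration of what that encoding does with the eighth bit, the sketch below compares the bytes produced for "é" (U+00E9) under Windows-1252, UTF-8 and plain ASCII. It assumes .NET Core 3.0 or later (or .NET 5+), where legacy code pages must be registered through `CodePagesEncodingProvider` first; the class and method names are hypothetical.

```csharp
using System;
using System.Text;

static class CodePageDemo
{
    // Hypothetical helper: encode text as Windows-1252 bytes.
    public static byte[] ToWindows1252(string text)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // are not available until this provider is registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding windows1252 = Encoding.GetEncoding("windows-1252");
        return windows1252.GetBytes(text);
    }

    static void Main()
    {
        string text = "\u00E9"; // é

        Console.WriteLine(BitConverter.ToString(ToWindows1252(text)));             // E9
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(text)));    // C3-A9
        Console.WriteLine(BitConverter.ToString(Encoding.ASCII.GetBytes(text)));   // 3F, i.e. '?'
    }
}
```

The single byte 0xE9 only means "é" to a reader that also assumes Windows-1252, which is why the same byte can show up as a different character under another code page.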
Adam Robinson