In ASCII, how is 65 translated to the character 'A'? As far as I know, 65 can be represented in binary but 'A' cannot. So how does this conversion happen?

+1  A: 

Everything in a computer is stored as a number. It's how software interprets those numbers that's important.

ASCII is a standard that maps the number 65 to the letter 'A'. They could have chosen 66 or 14 to represent 'A', but they didn't. It's almost arbitrary.

So if you have the number 65 sitting in computer memory somewhere, a piece of code that treats that piece of memory as ASCII will map the 65 to 'A'. Another piece of code that treats that memory as an entirely different format may translate it to something else entirely.
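To make that concrete, here is a minimal Python sketch in which the same four bytes are read once as ASCII text and once as a single integer. The byte values and the helper names `as_ascii` and `as_uint32_le` are made up for illustration; they are not part of any standard.

```python
import struct

# Four bytes sitting "in memory". Nothing about them says what they mean.
raw = bytes([65, 66, 67, 68])


def as_ascii(data: bytes) -> str:
    """Treat each byte as an ASCII code and map it to a character."""
    return data.decode("ascii")


def as_uint32_le(data: bytes) -> int:
    """Treat the same four bytes as one little-endian unsigned 32-bit integer."""
    return struct.unpack("<I", data)[0]


print(as_ascii(raw))           # ABCD
print(hex(as_uint32_le(raw)))  # 0x44434241
```

Neither reading is the "true" one. The bytes are identical in both calls; only the interpreting code differs.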

HTH,
Kent

Kent Boogaart
+6  A: 

'A' IS 65. It's just that your display device knows that it should display the value 65 as an A when it renders that value as a character.

nos
I think the question is confusing, because it's not entirely clear what is being asked, but *my bet* is that this answer nails it.
Joachim Sauer
What is a string? A miserable little pile of numbers! But enough talk, let's run through that pile, look up the characters for each number and write 'em!
Kawa
+6  A: 

It is just a 'definition'. ASCII defines the relationships between integer values and characters. For implementation, there is a table (you can't see it) that does this translation.

EDIT: Computers only deal in 0s and 1s. A stream of characters is just a stream of bits: 0110010101... There is a contract between human and computer: 8 bits represent one character (okay, there are also Unicode, UTF-8, etc.). And 'A' is 65, and so on.
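As a rough illustration of that contract, the Python sketch below cuts a string of bits into 8-bit groups and looks each group up as an ASCII code. The function name `decode_bits` and the sample bit strings are hypothetical, chosen only for this example.

```python
def decode_bits(bits: str) -> str:
    """Split a string of '0'/'1' into 8-bit groups and map each group to an ASCII character."""
    if len(bits) % 8 != 0:
        raise ValueError("expected a multiple of 8 bits")
    # Each 8-bit group is just a number written in base 2.
    codes = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    return bytes(codes).decode("ascii")


print(decode_bits("01000001"))          # A
print(decode_bits("0100100001101001"))  # Hi
```

ASCII itself only defines codes 0 to 127, so a group with its top bit set would make the `decode("ascii")` call raise an error here.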

In C/C++ and many other languages, strings are handled just like integer arrays. Only when you need to display a string are the numbers 'translated' into characters. This translation is done by either hardware or software:

  • If you write a function that draws characters, you're responsible for drawing 'A' when the input is 65.
  • In the past, say in DOS, the computer itself drew 'A' for the number 65. That relationship was usually stored in memory. (Back then, when there were no graphics, only text, this table could be tweaked to extend the character set. I remember Norton DOS utilities such as NDD/NCD changing this table to draw some special characters that were not in regular ASCII.)
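The "strings are integer arrays" point can be sketched in Python as well, assuming plain ASCII text. The helpers `to_codes` and `to_text` are names invented for this example.

```python
def to_codes(text: str) -> list:
    """Return the ASCII code of every character, i.e. the numbers that are actually stored."""
    return list(text.encode("ascii"))


def to_text(codes: list) -> str:
    """Translate numbers back into characters, as a display routine would."""
    return "".join(chr(code) for code in codes)


codes = to_codes("ASCII")
print(codes)           # [65, 83, 67, 73, 73]
print(to_text(codes))  # ASCII

# Because the contents are numbers, arithmetic works on them: 65 + 1 is 66, which is 'B'.
print(to_text([codes[0] + 1]))  # B
```

The translation back to characters only happens in `to_text`. Everything before that point is ordinary integer handling.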

You can see this sort of contract or definition everywhere. Take assembly code: your program will eventually be translated into machine code, which is also just a bunch of 0s and 1s. But it is extremely hard to understand when only 0s and 1s are shown. So there is a rule: say, 101010 means "add" and 1100 means "mov". That's why we can write "add eax, 1" and have it ultimately encoded into 0s and 1s.

minjang
Is that definition table implemented in hardware or software? (e.g., in the C language)
tsubasa
It's called "character encoding". Wiki: http://en.wikipedia.org/wiki/Character_encoding
BalusC
In practice, in a font that has the mathematical curves defining the outline of the 'A' glyph in slot number 65.
Michael Petrotta
For an easy to use online ASCII table, see http://www.asciitable.com/
Sebastiaan Megens
the table can be either like i described (pixels on + off) or like michael described (curves) depending on what kind of font it is. it can also be in hardware or software, depending on the particular computer. these days the trend is towards software and curves rather than bitmaps (pixels) (where by "software" i mean that the data can be changed and reside in "normal" memory, rather than in some dedicated display device).
andrew cooke
@Sebastian, or if you use UNIX try man ascii.
Fred
+3  A: 

The ASCII table is just an agreed-upon mapping between values and characters.

When the computer is instructed to write a character represented by a number to the screen, it just finds the number's corresponding image. The image doesn't mean anything to the computer; to the user it could look like an 'A' or a snowman.

Niels Castle
A: 

It's based on a lookup table invented back in the '60s.

Mark Redman
+10  A: 

Everything in a computer is binary. So a string in C is a sequence of binary values. Obviously that is not much use to humans, so various standards developed, where people decided what numerical values would represent certain letters. In ASCII the value 65 represents the letter A. So the value stored is 65, but everyone knows (because they have read the ASCII spec) that value corresponds to the letter A.

For example, if I am writing the code to display text on the screen, and I receive the value 65, I know to set certain pixels and delete other pixels, so that pixels are arranged like:

  @
 @ @
@@@@@
@   @
@   @

At no point does my code "really know" that is an "A". It just knows that 65 is displayed as that pattern. Because, as you say, you cannot store letters directly, only binary numbers.
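A minimal Python sketch of such a display routine, assuming a tiny 5x5 bitmap font that only contains the pattern drawn above. The names `FONT`, `WIDTH` and `render` are hypothetical, and a real font table would cover every character.

```python
WIDTH = 5  # pixels per row in this toy font

# Maps a character code to its rows of pixels; a 1 bit means "pixel on".
FONT = {
    65: [0b00100, 0b01010, 0b11111, 0b10001, 0b10001],
}


def render(code: int) -> str:
    """Look up the pixel rows for a code and draw them with '@' for lit pixels."""
    lines = []
    for row in FONT[code]:
        # Test bits from the leftmost column (highest bit) to the rightmost.
        line = "".join(
            "@" if row & (1 << (WIDTH - 1 - col)) else " "
            for col in range(WIDTH)
        )
        lines.append(line.rstrip())
    return "\n".join(lines)


print(render(65))
```

Nothing in `render` refers to the letter A. Swapping the five numbers stored under 65 would make the same code draw a different shape.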

andrew cooke
Good explanation! Maybe you should bridge the binary-to-decimal gap as well, since you claim "everything is binary" and then claim that the computer handles 65 ;-)
Joachim Sauer
Good formulation, I think that addresses the OP's question much more accurately than “It's based on a mapping that says so.” I started writing something to that effect, but you spared me the trouble :-)
Arthur Reutenauer
@Joachim: I think it's pretty clear that “65” here stands for the numerical value, not its decimal representation.
Arthur Reutenauer
Sure, in fact the code for implementing what I describe above will probably have a table that associates 01000001 (65) with 00000100, 00001010, 00011111, 00010001, 00010001 which describes the pixel pattern...
andrew cooke
+1: I like the way you explained it.
BalusC
+1  A: 

So how could this conversion happen?

This conversion is simply called character encoding. The computer only understands bytes, and humans (on average =) ) only understand characters. Roughly speaking, the computer has a mapping from bytes to the characters they stand for, so that it can present the data in a human-friendly manner. It's all software based (thus not hardware based). The operating system is usually what takes care of this.

ASCII is one of the oldest character encodings. Nowadays we should all be on UTF-8 to avoid Mojibake.
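Here is a small Python sketch of how Mojibake comes about: text is written with one encoding and read back with another. The helper name `misread` is made up, and Latin-1 is just one example of a mismatched encoding.

```python
def misread(text: str, written_as: str = "utf-8", read_as: str = "latin-1") -> str:
    """Encode text with one character encoding, then decode the bytes with another."""
    return text.encode(written_as).decode(read_as)


# Plain ASCII characters survive, because both encodings agree on codes 0 to 127.
print(misread("A") == "A")  # True

original = "\u00e9"  # the letter e with an acute accent
stored = original.encode("utf-8")
print(list(stored))  # [195, 169]

# Read as Latin-1, those two bytes become two separate characters instead of one.
garbled = misread(original)
print(len(original), len(garbled))  # 1 2
print(garbled == "\u00c3\u00a9")    # True
```

Reading the bytes back with the encoding they were written in returns the original text, which is why agreeing on one encoding avoids the problem.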

BalusC