tags:
views: 141
answers: 7
+3  Q: 

ASCII and printf

I have a little (big, dumb?) question about ints and chars in C. I remember from my studies that "chars are little integers and vice versa," and that's fine with me. If I need to use small numbers, the best way is to use a char type.

But in a code like this:

#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
  int i = atoi(argv[1]);
  printf("%d -> %c\n", i, i);
  return 0;
}

I can pass any number I want as an argument. With 0-127 I get the expected results (the standard ASCII table), but even with bigger or negative numbers it seems to work...

Here are some examples:

-181 -> K
-182 -> J
300 -> ,
301 -> -

Why? It seems to me that it's wrapping around the ASCII table, but I don't understand how.

+1  A: 

My guess is that %c takes the first byte of the value provided and formats that as a character. On a little-endian system such as a PC running Windows, that byte would represent the least-significant byte of any value passed in, so consecutive numbers would always be shown as different characters.

BlueMonkMN
It's not that: due to varargs promotion, endianness doesn't come into it. (Though in C, chars are promoted to ints rather easily anyway.)
Roger Pate
+2  A: 

The %c conversion specifier interprets the corresponding value as a character, not as an integer. However, when you lie to printf and pass an int where you told it to expect a char, its internal handling of the value (a char is normally passed as an int anyway with varargs) happens to yield the values you see.

Roger Pate
So, it's just a result of the internal manipulation of the printf function? It's its "best effort" at converting an integer into a char, and if the integer is between 0 and 127 it produces a correct ASCII value, but for every other value its behaviour is unpredictable. Am I right?
Segolas
@Segolas: C99 §7.19.6.1p8 says it converts the int to an unsigned char and prints that. I've had enough standardese for today (and I've not even started work this morning), but I suspect that's identical to what you'd get from `(unsigned char)i`. Note that [ASCII](http://en.wikipedia.org/wiki/ASCII) is only defined for 0-127 (and is itself an implementation detail).
Roger Pate
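A minimal sketch, assuming a typical hosted C implementation, that compares printf's "%c" output with an explicit cast to unsigned char (300 is just an arbitrary out-of-range value):

#include <stdio.h>
int main(void) {
  int i = 300;
  unsigned char c = (unsigned char)i;  /* explicit conversion: 300 -> 44 */
  /* Both lines print ',' because %c converts its int argument
     to unsigned char before printing (C99 7.19.6.1p8). */
  printf("%c\n", i);
  printf("%c\n", c);
  return 0;
}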
A: 

What atoi does is convert the string to a numerical value, so that "1234" becomes 1234 and not just a sequence of the ordinal values of its characters.

Example:

char *x = "1234";  // x[0] = 49, x[1] = 50, x[2] = 51, x[3] = 52 (see the ASCII table)
int y = atoi(x); // y = 1234
int z = (int)x[0];  // z = 49 which is not what one would want
dark_charlie
atoi is incidental; the question is about how printf works.
Roger Pate
A: 

Edit: Please disregard this "answer".

Because you are on a little-endian machine :) Seriously, this is undefined behavior. Try changing the code to printf("%d -> %c, %c\n",i,i,'4'); and see what happens then...

usta
Has nothing to do with a LE machine.
Roger Pate
@Roger Pate This is just a consequence of how varargs are commonly implemented. Honestly I don't see your reasoning, could you elaborate please?
usta
@usta: C requires specific promotions for varargs, called "default argument promotions," rather than it being an implementation property. See C99 §6.5.2.2p6, and, in this specific case, it might be clearer for you to read §7.19.6.1 and how p8 specifically says "int argument."
Roger Pate
@Roger Pate I'm sorry, you are right Roger, with vararg promotion this indeed has nothing to do with LE or BE. Please disregard my answer, it makes little sense to me now.
usta
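A small sketch of the default argument promotions mentioned above: a char passed to a variadic function such as printf is promoted to int, which is why "%d" can print it directly.

#include <stdio.h>
int main(void) {
  char c = 'A';
  /* c is promoted to int when passed to the variadic printf,
     so "%d" receives an int holding the value 65. */
  printf("%d\n", c);  /* prints 65 */
  printf("%c\n", c);  /* prints A */
  return 0;
}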
A: 

You told it the number is a char, so it's going to try every way it can to treat it as one, despite being far too big. Looking at what you got, since J and K are in that order, I'd say it's using the integer % 128 to make sure it fits in the legal range.

AaronM
A: 

When we use %c in a printf statement, it can access only the first byte of the integer. Hence anything outside the range 0-255 is treated as n % 256.

For example, an input of 321 yields the output A.

Elcid
+5  A: 

When you pass an int corresponding to the "%c" conversion specifier, the int is converted to an unsigned char and then written.

The values you pass are being converted to different values when they are outside the range of an unsigned char (0 to UCHAR_MAX). The system you are working on probably has UCHAR_MAX == 255.

When converting an int to an unsigned char:

  • If the value is larger than UCHAR_MAX, (UCHAR_MAX+1) is subtracted from the value as many times as needed to bring it into the range 0 to UCHAR_MAX.
  • Likewise, if the value is less than zero, (UCHAR_MAX+1) is added to the value as many times as needed to bring it into the range 0 to UCHAR_MAX.

Therefore:

(unsigned char)-181 == (-181 + (255+1)) == 75 == 'K'
(unsigned char)-182 == (-182 + (255+1)) == 74 == 'J'
(unsigned char)300  == (300 - (255+1))  == 44 == ','
(unsigned char)301  == (301 - (255+1))  == 45 == '-'
D Krueger
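As a rough check of the wrap-around rule described above, here is a sketch (assuming UCHAR_MAX == 255, as on the asker's system) that applies the add/subtract-(UCHAR_MAX+1) rule by hand and compares it with the cast:

#include <stdio.h>
#include <limits.h>

/* Bring v into the range 0..UCHAR_MAX by repeatedly adding or
   subtracting UCHAR_MAX + 1, as described in the answer above. */
static int wrap(int v) {
  while (v < 0)
    v += UCHAR_MAX + 1;
  while (v > UCHAR_MAX)
    v -= UCHAR_MAX + 1;
  return v;
}

int main(void) {
  int values[] = { -181, -182, 300, 301 };
  for (int k = 0; k < 4; k++) {
    int v = values[k];
    /* wrap(v) and (unsigned char)v agree; %c then prints K, J, ',', '-' */
    printf("%d -> %d ('%c') vs cast %d\n",
           v, wrap(v), wrap(v), (unsigned char)v);
  }
  return 0;
}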
Yes, I've tried to print UCHAR_MAX and it is just 255, as you said. Thanks for your answer!
Segolas
Is the effect the same as simply taking the lowest-order byte? Example: -1 = 0xFFFFFFFF, lowest-order byte = FF = 255, -1 + 256 = 255. I suppose that's only true if UCHAR_MAX == 255. But still, I seriously doubt that the implementation loops and adds or subtracts until it finds a value in range. It must use "%" at least... something better than a loop.
BlueMonkMN
With two's complement and eight-bit chars, it's the same as taking the low-order byte, which I'm pretty sure is what most, if not all, compilers do.
D Krueger
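A last sketch of the point in these comments, assuming two's complement and 8-bit chars: converting to unsigned char then gives the same result as masking off the low-order byte, so no loop (or even a %) is needed.

#include <stdio.h>
int main(void) {
  int i = -181;
  /* With two's complement and 8-bit chars, the conversion to
     unsigned char is the same as keeping the low-order byte. */
  printf("%d %d\n", (unsigned char)i, i & 0xFF);  /* both print 75 */
  return 0;
}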