



+3  Q: 

ASCII and printf

I have a little (big, dumb?) question about ints and chars in C. I remember from my studies that "chars are little integers and vice versa," and that's okay with me. If I need to use small numbers, the best way is to use a char type.

But in a code like this:

#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
  int i= atoi(argv[1]);
  printf("%d -> %c\n",i,i);
  return 0;
}

I can pass any number I want as the argument. With 0-127 I get the expected results (the standard ASCII table), but even with bigger or negative numbers it seems to work...

Here are some examples:

-181 -> K
-182 -> J
300 -> ,
301 -> -

Why? It seems to me that it's cycling around the ASCII table, but I don't understand how.

+1  A: 

My guess is that %c takes the first byte of the value provided and formats that as a character. On a little-endian system such as a PC running Windows, that byte would represent the least-significant byte of any value passed in, so consecutive numbers would always be shown as different characters.

It's not that complicated due to varargs promotion. (Though in C, chars are promoted to ints rather easily anyway.)
Roger Pate
+2  A: 

The %c format parameter interprets the corresponding value as a character, not as an integer. However, when you lie to printf and pass an int in what you tell it is a char, its internal manipulation of the value (to get a char back, as a char is normally passed as an int anyway, with varargs) happens to yield the values you see.

Roger Pate
So, it's just a result of the internal manipulation of the printf function? It's its "best effort" at converting an integer into a char, and if the integer is in the range [0-127] it produces a correct ASCII value, but for every other value its behaviour is unpredictable. Am I right?
@Segolas: C99 § says it converts the int to an unsigned char and prints that. I've had enough standardese for today (and I've not even started work this morning), but I suspect that's identical to what you'd get from `(unsigned char)i`. Note that [ASCII](http://en.wikipedia.org/wiki/ASCII) is only defined for 0-127 (and is itself an implementation detail).
Roger Pate
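To check the suspicion in the comment above, here is a minimal sketch that compares the byte produced by "%c" with an explicit `(unsigned char)` cast. The helper names `byte_written_by_percent_c` and `cast_matches_percent_c` are made up for this illustration, and `snprintf` is used in place of `printf` only so the result can be inspected.

```c
#include <stdio.h>

/* Hypothetical helper: formats `value` with "%c" and returns the byte
   that the printf-style conversion produced. */
unsigned char byte_written_by_percent_c(int value)
{
    char buf[2] = {0, 0};
    /* snprintf performs the same "%c" conversion as printf,
       but writes into a buffer instead of stdout. */
    snprintf(buf, sizeof buf, "%c", value);
    return (unsigned char)buf[0];
}

/* Returns 1 if "%c" and an explicit cast agree for `value`, else 0. */
int cast_matches_percent_c(int value)
{
    return byte_written_by_percent_c(value) == (unsigned char)value;
}
```

On a system where UCHAR_MAX is 255, `byte_written_by_percent_c(-181)` should give 75 and `byte_written_by_percent_c(300)` should give 44, matching the K and , from the question.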

What atoi does is convert the string to a numerical value, so that "1234" becomes 1234 and not just a sequence of the character codes of the string.


char *x = "1234";  // x[0] = 49, x[1] = 50, x[2] = 51, x[3] = 52 (see the ASCII table)
int y = atoi(x); // y = 1234
int z = (int)x[0];  // z = 49 which is not what one would want
atoi is incidental, the question is about how printf works.
Roger Pate

Edit: Please disregard this "answer".

Because you are on a little-endian machine :) Seriously, this is undefined behavior. Try changing the code to printf("%d -> %c, %c\n",i,i,'4'); and see what happens then...

Has nothing to do with a LE machine.
Roger Pate
@Roger Pate This is just a consequence of how varargs are commonly implemented. Honestly I don't see your reasoning, could you elaborate please?
@usta: C requires specific promotions for varargs, called "default argument promotions," rather than it being an implementation property. See C99 §, and, in this specific case, it might be clearer for you to read § and how p8 specifically says "int argument."
Roger Pate
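The default argument promotions mentioned above can be seen with a small variadic function. This is only a sketch: `sum_promoted` is a hypothetical helper, not anything from printf itself. It reads every variadic argument as an int, which is what a char or short argument has become by the time the callee sees it.

```c
#include <stdarg.h>

/* Hypothetical variadic helper: sums `count` variadic arguments,
   reading each one as an int. */
int sum_promoted(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        /* A char or short argument has already been promoted to int,
           so int is the type to ask for; va_arg(ap, char) would be wrong. */
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}
```

For example, with `char c = 'A';` the call `sum_promoted(2, c, 1)` returns `'A' + 1`, even though `c` was declared as a char.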
@Roger Pate I'm sorry, you are right Roger, with vararg promotion this indeed has nothing to do with LE or BE. Please disregard my answer, it makes little sense to me now.

You told it the number is a char, so it's going to try every way it can to treat it as one, despite being far too big. Looking at what you got, since J and K are in that order, I'd say it's using the integer % 128 to make sure it fits in the legal range.


When we use %c in a printf statement, it can access only the first byte of the integer. Hence anything greater than 255 is treated as n % 256.

For example, input 321 yields output A.

+5  A: 

When you pass an int corresponding to the "%c" conversion specifier, the int is converted to an unsigned char and then written.

The values you pass are being converted to different values when they are outside the range of an unsigned char (0 to UCHAR_MAX). The system you are working on probably has UCHAR_MAX == 255.

When converting an int to an unsigned char:

  • If the value is larger than UCHAR_MAX, (UCHAR_MAX+1) is subtracted from the value as many times as needed to bring it into the range 0 to UCHAR_MAX.
  • Likewise, if the value is less than zero, (UCHAR_MAX+1) is added to the value as many times as needed to bring it into the range 0 to UCHAR_MAX.


(unsigned char)-181 == (-181 + (255+1)) == 75 == 'K'
(unsigned char)-182 == (-182 + (255+1)) == 74 == 'J'
(unsigned char)300  == (300 - (255+1))  == 44 == ','
(unsigned char)301  == (301 - (255+1))  == 45 == '-'
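The rule above can be written out literally as a small function. This is a sketch for illustration, assuming nothing beyond `UCHAR_MAX` from `<limits.h>`; the name `wrap_to_uchar_range` is invented here, and the loops only mirror the wording of the rule, not how a compiler performs the conversion.

```c
#include <limits.h>

/* Hypothetical helper that follows the rule step by step: add or
   subtract (UCHAR_MAX + 1) until the value lies in 0..UCHAR_MAX. */
int wrap_to_uchar_range(int value)
{
    const int modulus = UCHAR_MAX + 1;

    while (value < 0)
        value += modulus;      /* negative: add until non-negative */
    while (value > UCHAR_MAX)
        value -= modulus;      /* too large: subtract until in range */
    return value;
}
```

For any int `n`, `wrap_to_uchar_range(n)` should equal `(unsigned char)n`. With UCHAR_MAX == 255, that gives 75 for -181 and 44 for 300.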
D Krueger
Yes, I tried printing UCHAR_MAX and it is 255, just as you said. Thanks for your answer!
Is the effect the same as simply taking the lowest-order byte? Example: -1 = 0xFFFFFFFF, lowest-order byte = FF = 255, -1 + 256 = 255. I suppose that's only true if UCHAR_MAX == 255. But still, I seriously doubt that the implementation in the infrastructure loops and adds or subtracts until it finds a value in range. It must use "%" at least... something better than a loop.
With two's complement and an eight-bit char, it's the same as taking the low-order byte, which I'm pretty sure most, if not all, compilers do.
D Krueger
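The two loop-free alternatives raised in these comments can be sketched as follows. Both function names are hypothetical, and nothing here claims to be what any particular printf implementation does; it only shows that a remainder or a bit mask gives the same result as the cast.

```c
#include <limits.h>

/* Wraps with a single remainder operation instead of a loop. */
int wrap_with_remainder(int value)
{
    const int modulus = UCHAR_MAX + 1;
    int r = value % modulus;   /* since C99, r takes the sign of value */

    return r < 0 ? r + modulus : r;
}

/* Keeps only the low-order bits after converting to unsigned int.
   The int-to-unsigned conversion is itself modular, so the mask
   leaves the value modulo (UCHAR_MAX + 1). */
unsigned low_order_byte(int value)
{
    return (unsigned)value & UCHAR_MAX;
}
```

With UCHAR_MAX == 255, both `wrap_with_remainder(-1)` and `low_order_byte(-1)` give 255, and both give 65 ('A' in ASCII) for 321.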