+17  Q: 

unsigned char

In C/C++, what is an unsigned char used for? How is this different from a regular char?

+23  A: 

This is implementation dependent, as the C standard does NOT define the signed-ness of "char". Depending on the platform, char may be signed or unsigned, so you need to explicitly ask for "signed char" or "unsigned char" if your implementation depends on it. Just use "char" if you intend to represent characters from strings, as this will match what your platform puts in the string.

The difference between signed char and unsigned char is as you'd expect. On most platforms, signed char will be an 8-bit two's complement number ranging from -128 to 127, and unsigned char will be an 8-bit unsigned integer (0 to 255). Note the standard does NOT require that char types have 8 bits, only that sizeof(char) return 1. You can get at the number of bits in a char with CHAR_BIT in limits.h. There are few if any platforms today where this will be something other than 8, though.
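Both points can be checked on a given platform. The following is only a minimal sketch; the function names are made up for illustration, while CHAR_MIN and CHAR_BIT come from limits.h:

```c
#include <limits.h>
#include <stdbool.h>

/* True when plain char is signed on this implementation.
   CHAR_MIN is 0 when plain char is unsigned and negative otherwise. */
bool plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Number of bits in a char, i.e. in one byte as C defines it. */
int bits_per_char(void)
{
    return CHAR_BIT;
}
```

On typical desktop compilers bits_per_char() gives 8, while plain_char_is_signed() varies with the target and compiler flags.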

There is a nice summary of this issue here:

http://www.arm.linux.org.uk/docs/faqs/signedchar.php

As others have mentioned since I posted this, you're better off using int8_t and uint8_t if you really want to represent small integers.

tgamblin
A: 

signed char has range -128 to 127; unsigned char has range 0 to 255.

char will be equivalent to either signed char or unsigned char, depending on the compiler, but is a distinct type.

If you're using C-style strings, just use char. If you need to use chars for arithmetic (pretty rare), specify signed or unsigned explicitly for portability.

James Hopkin
+3  A: 

If you want to use a character as a small integer, the safest way to do it is with the int8_t and uint8_t types.
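As a rough illustration of small-integer arithmetic with these types, here is a sketch of a clamping addition. The function name is hypothetical, and it assumes the implementation provides uint8_t at all (the exact-width types in stdint.h are optional):

```c
#include <stdint.h>

/* Adds two 8-bit unsigned values, clamping at UINT8_MAX instead of wrapping. */
uint8_t add_saturating_u8(uint8_t a, uint8_t b)
{
    /* The sum is computed in a wider type, so it cannot wrap here. */
    unsigned sum = (unsigned)a + (unsigned)b;
    return sum > UINT8_MAX ? (uint8_t)UINT8_MAX : (uint8_t)sum;
}
```

The type name makes the intent (an 8-bit number, not a character) visible at the call site.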

ajbl
A: 

Some googling found this, where people had a discussion about this.

An unsigned char is basically a single byte. So, you would use this if you need one byte of data (for example, maybe you want to use it to set flags on and off to be passed to a function, as is often done in the Windows API).
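A sketch of the flags idea, with invented flag names and helpers (these are not taken from the Windows API or any other real API):

```c
/* Hypothetical option flags, one bit each, packed into one unsigned char. */
enum {
    OPT_VERBOSE = 1 << 0,
    OPT_FORCE   = 1 << 1,
    OPT_DRY_RUN = 1 << 2
};

typedef unsigned char flags_t;

/* Returns f with the given bit switched on. */
flags_t flag_set(flags_t f, flags_t bit)
{
    return (flags_t)(f | bit);
}

/* Returns f with the given bit switched off. */
flags_t flag_clear(flags_t f, flags_t bit)
{
    return (flags_t)(f & ~bit);
}

/* Nonzero when the given bit is on in f. */
int flag_test(flags_t f, flags_t bit)
{
    return (f & bit) != 0;
}
```

A caller would build a value such as flag_set(flag_set(0, OPT_FORCE), OPT_VERBOSE) and pass that single byte to the function that consumes the options.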

dbrien
A: 

An unsigned char is an unsigned byte value (0 to 255). You may be thinking of "char" in terms of being a "character", but it is really a numerical value. The regular "char" is signed on many platforms (the standard leaves this to the implementation), so you have 128 non-negative values, and these values map to characters using ASCII encoding. But in either case, what you are storing in memory is a byte value.

Zac
+1  A: 

In terms of direct values a regular char is used when the values are known to be between CHAR_MIN and CHAR_MAX while an unsigned char provides double the range on the positive end. For example, if CHAR_BIT is 8, the range of regular char is only guaranteed to be [0, 127] (because it can be signed or unsigned) while unsigned char will be [0, 255] and signed char will be [-127, 127].

In terms of what it's used for, the standards allow objects of POD (plain old data) to be directly converted to an array of unsigned char. This allows you to examine the representation and bit patterns of the object. The same guarantee of safe type punning doesn't exist for char or signed char.
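For instance, an object's bytes can be walked through a pointer to unsigned char. This is only a sketch with a made-up helper name:

```c
#include <stddef.h>

/* Returns 1 if every byte of the object's representation is zero, else 0. */
int all_bytes_zero(const void *obj, size_t size)
{
    /* Any object may be inspected as an array of unsigned char. */
    const unsigned char *p = obj;
    for (size_t i = 0; i < size; i++) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

It would be called as all_bytes_zero(&x, sizeof x) for some object x.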

Julienne Walker
+1  A: 

If you like using various types of specific length and signedness, you're probably better off with uint8_t, int8_t, uint16_t, etc simply because they do exactly what they say.

Dark Shikari
A: 

An unsigned char uses the bit that is reserved for the sign of a regular char as another number. This changes the range to [0 - 255] as opposed to [-128 - 127].

Generally unsigned chars are used when you don't want a sign. This makes a difference when doing things like shifting bits (right-shifting a negative value typically extends the sign) and in other cases where you treat a char as a byte rather than as a number.
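A small sketch of the shifting point, using a hypothetical helper. With unsigned char the promoted value is never negative, so the result of the right shift is fully defined; the same shift on a negative signed char is implementation-defined:

```c
#include <limits.h>

/* Extracts the most significant bit of a byte as 0 or 1. */
int top_bit(unsigned char byte)
{
    /* byte is promoted to a non-negative int, so zeros are shifted in. */
    return byte >> (CHAR_BIT - 1);
}
```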

A: 

unsigned char is the heart of all bit trickery. In almost all compilers on all platforms, an unsigned char is simply a byte: an unsigned integer of (usually) 8 bits that can be treated as a small integer or a pack of bits.

In addition, as someone else has said, the standard doesn't define the sign of a char, so you have 3 distinct "char" types: char, signed char, unsigned char.

ugasoft
+2  A: 

As for example usages of unsigned char:

unsigned char is often used in computer graphics, which very often (though not always) assigns a single byte to each colour component. It is common to see an RGB (or RGBA) colour represented as 24 (or 32) bits, each component an unsigned char. Since unsigned char values fall in the range [0,255], the values are typically interpreted as

  • 0 meaning a total lack of a given colour component
  • 255 meaning 100% of a given colour pigment

So you would end up with RGB red as (255,0,0) -> (100% red, 0% green, 0% blue).

Why not use a signed char? Arithmetic and bit shifting become problematic. As explained already, a signed char's range is essentially shifted by -128. A very simple and naive (mostly unused) method for converting RGB to grayscale is to average all three colour components, but this runs into problems when the values of the colour components are negative. Red (255, 0, 0) averages to (85, 85, 85) when using unsigned char arithmetic. However, if the values were signed chars (127,-128,-128), we would end up with (-99, -99, -99), which would be (29, 29, 29) in our unsigned char space, which is incorrect.
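The unsigned case of that naive averaging can be sketched as follows. The struct and function names are invented for illustration, and 8-bit chars are assumed:

```c
/* Hypothetical RGB pixel with one unsigned char per colour component. */
struct rgb {
    unsigned char r, g, b;
};

/* Naive grayscale: the plain average of the three components. */
unsigned char gray_average(struct rgb px)
{
    /* The operands are promoted to int, so the sum (at most 3 * 255)
       cannot overflow before the division. */
    return (unsigned char)((px.r + px.g + px.b) / 3);
}
```

For pure red (255, 0, 0) this returns 85, matching the figure in the text.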

Zachary Garrett
A: 
bk1e
+23  A: 

In C++, there are three distinct character types:

  • char
  • signed char
  • unsigned char
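The same three-way split exists in C, where it can be observed with a C11 _Generic selection. This is a sketch only; the macro name is made up:

```c
/* Hypothetical macro: yields 0 for char, 1 for signed char,
   2 for unsigned char, and -1 for any other type. */
#define CHAR_KIND(x) _Generic((x), \
    char: 0, \
    signed char: 1, \
    unsigned char: 2, \
    default: -1)
```

The selection only compiles because the three types are distinct: listing two compatible types in one _Generic would be rejected. Note that in C (unlike C++) a character literal such as 'a' has type int, so CHAR_KIND('a') gives -1.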

If you are using character types for text, use the unqualified char:

  • it is the type of character literals like 'a' or '0'.
  • it is the type that makes up C strings like "abcde"

It also works out as a number value, but it is implementation-defined whether that value is treated as signed or unsigned. Beware character comparisons through inequalities - although if you limit yourself to ASCII (0-127) you're just about safe.

If you are using character types as numbers, use:

  • signed char, which gives you at least the -128 to 127 range.
  • unsigned char, which gives you at least the 0 to 255 range.

"At least", because the C++ standard only gives the minimum range of values that each numeric type is required to cover. Your compiler could very well have a 32-bit character type... and sizeof would still report its size as 1 - meaning that you could have sizeof (char) == sizeof (long) == 1.

Fruny
Very nice summary.
Michael Burr
To be clear, could you have 32-bit chars, and 32-bit integers, and have sizeof(int) != sizeof(char)? I know the standard says sizeof(char) == 1, but is the relative sizeof(int) based on actual difference in size or the difference in range?
Joseph Garvin
Joseph, sizeof gives you the size of the object representation of the type. Saying "32-bit int" by itself doesn't tell much; most probably you mean the object representation (that's the physical size, including all padding bits).
Johannes Schaub - litb
if that's the case, then sizeof(int) != sizeof(char) can't be true, because char/unsigned/signed char use all bits of their object representation to represent their values (called the value representation)
Johannes Schaub - litb
+4  A: 

Because I feel it's really called for, I just want to state some rules of C and C++ (they are the same in this regard). First, all bits of unsigned char participate in determining the value of any unsigned char object. Second, unsigned char is explicitly stated to be unsigned.

Now, I had a discussion with someone about what happens when you convert the value -1 of type int to unsigned char. He rejected the idea that the resulting unsigned char has all its bits set to 1, because he was worried about sign representation. But he doesn't have to be. It follows immediately from this rule that the conversion does what is intended:

If the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. (6.3.1.3p2 in a C99 draft)

That's a mathematical description. C++ describes it in terms of modulo arithmetic, which yields the same rule. Anyway, what is not guaranteed is that all bits in the integer -1 are one before the conversion. So, what do we have so we can claim that the resulting unsigned char has all its CHAR_BIT bits turned to 1?

  1. All bits participate in determining its value - that is, no padding bits occur in the object.
  2. Adding UCHAR_MAX+1 to -1 just once yields a value in range, namely UCHAR_MAX.

That's enough, actually! So whenever you want to have an unsigned char having all its bits one, you do

unsigned char c = (unsigned char)-1;

It also follows that a conversion is not just truncating higher order bits. The fortunate event for two's complement is that it is just a truncation there, but the same isn't necessarily true for other sign representations.
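The conversion rule quoted above can be tried out with a short sketch; the helper name is hypothetical:

```c
#include <limits.h>

/* Converts an int to unsigned char. The result is the value reduced
   modulo UCHAR_MAX + 1, whatever sign representation int uses. */
unsigned char to_uchar(int value)
{
    return (unsigned char)value;
}
```

to_uchar(-1) equals UCHAR_MAX on every conforming implementation, and to_uchar(UCHAR_MAX + 1) wraps around to 0.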

Johannes Schaub - litb
+1, incredibly clear, thank you... so that's how you can be guaranteed in a statement like unsigned char x = 1 - 25 that the result will be 232 (with 8-bit chars)... you made my day!
sheepsimulator