ansaurus

Question

Can "signed char" and "unsigned char" always be cast to each other without loss of data?

Answer 1

A:

My first guess would be "maybe." Have you tried testing this with various inputs?

AlvinfromDiaspar 2010-07-19 07:48:48

Just testing won't help – that can only tell you whether your test machine/compiler supports it. You can't count on it unless it's specified behavior. Otherwise, you'll have endless portability headaches later.

Thom Smith 2010-07-19 07:59:13

So I guess the correct answer to this question is "it depends"?

AlvinfromDiaspar 2010-07-19 08:10:09

The button when you posted was *"Post Your Answer"*, not *"Post Your Guess"*, wasn't it?

Georg Fritzsche 2010-07-19 08:55:47

Answer 2

A:

AFAIK, this cast will never alter the byte, just change how it is interpreted.

alxx 2010-07-19 07:52:21

Answer 3

+7 A:

No, there's no such guarantee. The conversion from signed char to unsigned char is well-defined, as all signed-to-unsigned integral conversions in C++ (and C) are. However, the result of that conversion can easily turn out to be outside the bounds of the original signed type (will happen in your example with -10).

The result of the reverse conversion - unsigned char to signed char - in that case is implementation-defined, as all overflowing unsigned-to-signed integral conversions in C++ (and C) are. This means that the result cannot be predicted from the language rules alone.

Normally, you should expect the implementation to "define" it so that the original signed char value is restored. But the language makes no guarantees about that.

AndreyT 2010-07-19 07:58:02

When a negative integer is cast into signed format, it just turns into its complement code (some big number), doesn't it? This may be called data loss, but you can cast it back into a negative value.

alxx 2010-07-19 08:01:36

@alxx: Er... Did you mean "cast into *unsigned* format"? Casts from signed to unsigned are required to produce the "modulo" value. That's how the language specification requires it. Whether it happens "by itself" (as on 2's complement machines) or because the compiler takes steps to ensure it is a different story. Casting back to signed... Yes, you "can" do it, but again, the language makes no guarantees about the result.

AndreyT 2010-07-19 08:07:25

Casting signed char -1 to unsigned char yields 255 on my compiler, so you could say that there is data incompatibility. I am hesitant to call it data loss, seeing as you are not really losing anything, just meaning.

0A0D 2010-07-19 20:27:10

@0A0D: that conversion is well-defined, and doesn't lose data. The other conversion (unsigned to signed) is implementation-defined, so we can't know whether or not some implementations might lose data, even if yours doesn't.

Mike Seymour 2010-07-19 20:40:18

@Mike: I agree. Whether there is data loss is anyone's guess. I'd rather use the wording "data incompatibility", because "loss" to me means going from 2 bytes to 1 byte, for example.

0A0D 2010-07-19 21:54:15

@0A0D: What happens "on your computer" could easily be specific to your computer. At least the language standard says that it formally is. Which is the whole point.

AndreyT 2010-07-19 21:58:29

@AndreyT: It's compiler-specific (OK, implementation-specific), which the standard clearly says.

0A0D 2010-07-19 23:28:55

Answer 4

A:

I guess the meaning of your question is what is key. When you say loss, you mean that you are losing bytes or something like that. You are not losing anything as such, since both types have the same size; they just have different ranges.

signed char and unsigned char are not guaranteed to be equal. When most people think unsigned char, they are thinking from 0 to 255.

On most implementations (I have to caveat because there is a difference), signed char and unsigned char are 1 byte or 8 bits. signed char is typically from -128 to +127 whereas unsigned char is from 0 to +255.

As far as conversions go, it is left up to different implementations to come up with an answer. On the whole, I wouldn't recommend converting between the two. To me, it makes sense that it should give you the POSITIVE equivalent if the value is negative and remain the same if it is positive. For instance, in Borland C++ Builder 5, given a signed char test = -1 that you cast to unsigned char, the result will be 255. If the value is positive, on the other hand, it comes through unchanged.

But as far as comparisons, while the values may appear the same, they probably won't be evaluated as equal. This is a major trip up when programmers sometimes compare signed and unsigned values and wonder why the data all looks the same, but the condition will not work properly. A good compiler should warn you about this.

I'm of the opinion that there should be an implicit conversion between the signed and unsigned so that if you cast from one to the other, the compiler will take care of the conversion for you. It is up to the compiler's implementation on whether you lose the original meaning. Unfortunately there is no guarantee that it will always work.

Finally, according to the standard, plain char takes on the same values as either signed char or unsigned char. Which of the two it matches is implementation-defined:

3.9.1 Fundamental types [basic.fundamental]

1 Objects declared as characters (char) shall be large enough to store any member of the implementation's basic character set. If a character from this set is stored in a character object, the integral value of that character object is equal to the value of the single character literal form of that character. It is implementation-defined whether a char object can hold negative values. Characters can be explicitly declared unsigned or signed. Plain char, signed char, and unsigned char are three distinct types. A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements (basic.types); that is, they have the same object representation. For character types, all bits of the object representation participate in the value representation. For unsigned character types, all possible bit patterns of the value representation represent numbers. These requirements do not hold for other types. In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.

0A0D 2010-07-19 20:02:42
