Is it safe to convert, say, from an unsigned char * to a signed char * (or just a char *)?

+1  A: 

The conversion should be safe, as all you're doing is converting from one character type to another, and both have the same size. Just be aware of what sort of data your code expects when you dereference the pointer, as the numeric ranges of the two types are different. (For example, a value above 127 stored as an unsigned char will typically read back as a negative number once the pointer is converted to a signed char* and dereferenced.)
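
As a rough illustration of the point above, here is a minimal sketch in C. The function names are hypothetical, and the negative result assumes a two's-complement machine with 8-bit chars, which is an assumption rather than something the language guarantees.

```c
/* Hypothetical helpers: read the first byte of a buffer through each pointer type. */

/* Reads the byte as it was stored, in the range 0..255. */
int first_as_unsigned(const unsigned char *bytes) {
    return bytes[0];
}

/* Converts the pointer and reads the same byte as a signed char.
   Only the interpretation changes; the stored bits are untouched. */
int first_as_signed(const unsigned char *bytes) {
    const signed char *s = (const signed char *)bytes;
    return s[0];
}
```

With a buffer whose first byte is 200, the first function returns 200, while the second returns -56 on such a machine. For values up to 127 both functions agree.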

futureelite7
yes, the range of unsigned char* and signed char* is different. I want to ask, what is the reason behind the conversion?
Shivan Raptor
Should just ask the guy asking the qn? Technically should be safe if the char* just points to data.
futureelite7
A: 

It depends on how you are going to use the pointer. You are just converting the pointer type.

O. Askari
A: 

You can safely convert an unsigned char* to a char * as the function you are calling will be expecting the behavior from a char pointer, but, if your char value goes over 127 then you will get a result that will not be what you expected, so just make certain that what you have in your unsigned array is valid for a signed array.

James Black
A: 

I've seen it go wrong in a few ways, converting to a signed char from an unsigned char.

One, if you're using it as an index to an array, that index could go negative.

Secondly, if inputted to a switch statement, it may result in a negative input which often is something the switch isn't expecting.

Third, it has different behavior on an arithmetic right shift:

int x = ...;
char c = 128;           /* wraps to -128 where plain char is signed */
unsigned char u = 128;

c >> x;

has a different result than

u >> x;

Because the former is sign-extended and the latter isn't.

Fourth, a signed character overflows at a different point than an unsigned character.

So a common overflow check,

(c + x > c)

could return a different result than

(u + x > u)
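
The shift and array-index cases above can be sketched as follows. The function names are hypothetical, and right-shifting a negative value is implementation-defined in C; the sign-extending result described here is what common compilers such as GCC and Clang produce, not a guarantee.

```c
/* Hypothetical helpers contrasting signed and unsigned character behavior. */

/* The operand is promoted to int first, so a negative signed char
   stays negative and is usually shifted arithmetically (sign-extended). */
int shift_signed(signed char c, int x) {
    return c >> x;
}

/* An unsigned char promotes to a non-negative int, so zeros are shifted in. */
int shift_unsigned(unsigned char u, int x) {
    return u >> x;
}

/* Converting to unsigned char before indexing keeps a table index in 0..255,
   even when plain char is signed and holds a negative value. */
unsigned int table_index(char c) {
    return (unsigned char)c;
}
```

For the bit pattern 0x80, shifting right by one gives 64 in the unsigned case and typically -64 in the signed case.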
Drew Hoskins
There's a difference between converting char to unsigned char and converting char* to unsigned char*. This question seems to be about the latter.
sellibitze
Yes, but the question reduces to one about unsigned char vs char since the pointer doesn't really add any issues.
Drew Hoskins
A: 

Safe if you are dealing with only ASCII data.
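
One way to act on this is to verify that a buffer really contains only 7-bit ASCII before handing it to code that expects char*. This is a minimal sketch; the function name all_ascii is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if every byte is in the 7-bit ASCII range (0..127),
   so its value is the same whether read as signed or unsigned. */
bool all_ascii(const unsigned char *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if (data[i] > 127) {
            return false;
        }
    }
    return true;
}
```

An empty buffer is treated as ASCII here, which is a design choice rather than a requirement.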

Tanuj
+1  A: 

Casting changes the type, but does not affect the bit representation. Casting from unsigned char to signed char does not change the stored bits at all, but it changes how those bits are interpreted.

Here is an example:

#include <stdio.h>
int main(int argc, char** argv) {

  /* example 1: value above 127 */
  unsigned char a_unsigned_char = 192;
  signed char a_signed_char = a_unsigned_char;
  printf("%d, %d\n", a_unsigned_char, a_signed_char); //192, -64

  /* example 2: value below 128 */
  unsigned char b_unsigned_char = 32;
  signed char b_signed_char = b_unsigned_char;
  printf("%d, %d\n", b_unsigned_char, b_signed_char); //32, 32

  return 0;
}

In the first example, you have an unsigned char with value 192, or 11000000 in binary. After the cast to signed char, the bits are still 11000000, but that happens to be the 2s-complement representation of -64. Signed values are typically stored in 2s-complement representation.

In the second example, our unsigned initial value (32) is less than 128, so it seems unaffected by the cast. The binary representation is 00100000, which is still 32 in 2s-complement representation.

To "safely" cast from unsigned char to signed char, ensure the value is less than 128.
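
A range check along those lines might look like the sketch below. The function name and the out-parameter style are assumptions made for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores the converted value in *out and returns true only if `in`
   fits in a signed char; otherwise leaves *out untouched. */
bool to_signed_char_checked(unsigned char in, signed char *out) {
    if (in > SCHAR_MAX) {
        return false;   /* would not keep its value as a signed char */
    }
    *out = (signed char)in;
    return true;
}
```

Using SCHAR_MAX rather than a literal 127 keeps the check correct on the rare platforms where a char is wider than 8 bits.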

Joel
+2  A: 

The access is well-defined: you are allowed to access an object through a pointer to the signed or unsigned type corresponding to the dynamic type of the object (3.10/15).

Additionally, signed char is guaranteed not to have any trap values and as such you can safely read through the signed char pointer no matter what the value of the original unsigned char object was.

You can, of course, expect that the values you read through one pointer will be different from the values you read through the other one.

Edit: regarding sellibitze's comment, this is what 3.9.1/1 says.

A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements (3.9); that is, they have the same object representation. For character types, all bits of the object representation participate in the value representation. For unsigned character types, all possible bit patterns of the value representation represent numbers.

So indeed it seems that signed char may have trap values. Nice catch!
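
Since the guarantee that every bit pattern is a valid number applies to unsigned char, one conservative way to inspect the stored byte of a signed char is to copy it into an unsigned char. This is only a sketch: raw_byte is a hypothetical name, and the value shown for negative inputs assumes two's complement.

```c
#include <string.h>

/* Copies the object representation of a signed char into an unsigned char,
   for which every bit pattern is a valid number. */
unsigned char raw_byte(signed char value) {
    unsigned char raw;
    memcpy(&raw, &value, sizeof raw);
    return raw;
}
```

On a two's-complement machine with 8-bit chars, raw_byte(-64) yields 192, and non-negative inputs come back unchanged.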

avakar
+1 I was about to write something similar.
sellibitze
about trap representations: I think this guarantee is only for unsigned char. At least this is what Jack Klein wrote here: http://home.att.net/~jackklein/c/inttypes.html . Can you point out the section in the standard where this is mentioned?
sellibitze
The Standard allows reading any POD object (including unsigned char) using a `char` lvalue too. More - it guarantees that when you write those `char` values back into the same `POD` object, you receive the original value. I have a hard time imagining how this could work with `char` having trap representations. Any ideas?
Johannes Schaub - litb
I don't think any implementation has signed characters with trap representations. But I don't see where this is ruled out in the standard. std::memcpy takes void* and an implementation might use unsigned char* internally to copy the bytes. The problem I see with signed char is that they may have two representations for the same value, zero (one's complement and sign+magnitude). Even if every bit pattern represented a valid signed char value, you'd have the ambiguity with zero. Does an assignment x=y; retain the bit representation or just the value it represents? I don't know...
sellibitze
On an implementation where `char` has the same representation as `unsigned char`, `signed char` may have trap values. That's how I see it.
avakar
Ah, I think I had a wrong assumption: The Standard doesn't seem to say `char` may have the same representation as `signed char` - so places where stuff is unconditionally allowed for `char` don't necessarily imply anything about the representation of `signed char`. It is only said that `char` might be able to store the same values as `signed char`, which of course doesn't preclude trap representations for `signed char`. Deleted that comment.
Johannes Schaub - litb
A: 

I'm astonished it hasn't been mentioned yet: Boost numeric cast should do the trick - but only for the data of course.

Pointers are always pointers. By casting them to a different type, you only change the way the compiler interprets the data pointed to.

Tobias Langner