views:

239

answers:

7

I understand typecasting...but only in retrospect. My process for figuring out what requires typecasting in expressions is usually retroactive: I can't predict when a cast will be required because I don't know how the compiler steps through the expression. A somewhat trite example:

int8_t x = -50;
uint16_t y = 50;
int32_t z = x * y;

On my 8-bit processor (Freescale HCS08), this sets z to 63036 (2^16 - 50^2). I can see how that would be one possible answer (out of maybe four others), but I would not have guessed it would be the one.

A better way to ask might be: when types interact with operators (+, -, *, /), what happens?

+3  A: 

The compiler is supposed to upcast to the largest type in the expression and then place the result into the size of the destination location. If you were to look at the assembler output of the above, you could see exactly how the types are being read in native format from memory. Upcasting from a smaller to a larger size is safe and won't generate warnings. It's when you go from a larger type into a smaller type that precision may be lost, and the compiler is supposed to warn or error.

There are cases where you want the information to be lost, though. Say you are working with a sin/cos lookup table that is 256 entries long. It's very convenient and common (at least in embedded land) to use a u8 value to index the table so that the index wraps naturally to the table size while preserving the circular nature of sin/cos. A typecast back into a u8 is then required, but it is exactly what you want.
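
A sketch of that pattern (the table name and contents here are made up, and the real table would be filled with actual sine values):

#include <stdint.h>

/* Hypothetical 256-entry sine table; zeros are just placeholders. */
static const int16_t sine_table[256] = { 0 };

int16_t sine_lookup(uint16_t phase)
{
    /* The cast down to uint8_t deliberately discards the high bits, so
       the index wraps modulo 256 and stays inside the table, matching
       the circular nature of sin/cos. */
    uint8_t index = (uint8_t)phase;
    return sine_table[index];
}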

Michael Dorgan
isn't 50 times -50 a negative number?
Gabriel
Maybe; it depends on whether the place it is stored is signed or not. In this case the expression is assigned into a signed variable, so it will be fine, though some compilers will warn about signed/unsigned math. The bits are the same either way, signed or unsigned.
Michael Dorgan
This is the kind of simplification I referred to in my comment. What do you think `unsigned short s = 1; int i = s * -1;` is? And what about `unsigned int s = 1; long long i = s * -1L;` (both on a 32-bit system)? Neither is covered by "the compiler is supposed to upcast to the largest type in the expression."
Johannes Schaub - litb
long long I've found is hacked badly in most compilers I work with. Still, I would expect both answers to be -1. Why would they be anything else?
Michael Dorgan
@Michael: for the second one, if int and long are the same size (as they would be on a typical 32bit system or an LLP64 system), then -1L will be converted to unsigned, and the result is UINT_MAX, not -1. If long and long long are the same size (as they would be on an LP64 system), then the result is -1.
Steve Jessop
+2  A: 

You will need a typecast when you are downcasting.

Upcasting is automatic and safe, which is why the compiler never issues a warning or error. But when you are downcasting, you are placing a value that has higher precision than the type of the variable you are storing it in, which is why the compiler wants you to be sure: you need to downcast explicitly.
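
A minimal sketch of the difference (variable names are made up; the exact diagnostics depend on the compiler and warning level):

#include <stdint.h>

void cast_demo(void)
{
    int8_t  small = -5;
    int32_t big = small;            /* upcast: implicit, safe, no warning   */

    int32_t wide = 70000;
    int16_t narrow = (int16_t)wide; /* downcast: the value cannot fit, so
                                       the explicit cast tells the compiler
                                       the loss of precision is intentional */
    (void)big;
    (void)narrow;
}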

Faisal Feroz
+1  A: 

When the compiler does implicit casting, it follows a standard set of arithmetic conversions. These are documented in the C standard in section 6.3. If you happen to own the K&R book, there is a good summary in appendix section A6.5.

bta
"arithmetic conversions"--the term I was looking for it appears. C99 6.3.1.9 http://web.archive.org/web/20050207010641/http://dev.unicals.com/papers/c99-draft.html#6.3.1.8 seems to zero in on what goes on.
Nick T
A: 

To explain what happens in your example, you've got a signed 8-bit type multiplied by an unsigned 16-bit type, and so the smaller signed type is promoted to the larger unsigned type. Once this value is created, it's assigned to the 32-bit type.

If you're just working with signed or unsigned integer types, it's pretty simple. The system can always convert a smaller integer type to a larger without loss of precision, so it will convert the smaller value to the larger type in an operation. In mixed floating-point and integer calculations, it will convert the integer to the floating-point type, perhaps losing some precision.

It appears you're being confused by mixing signed and unsigned types. The system will convert to the larger type. If that larger type is signed, and can hold all the values of the unsigned type in the operation, then the operation is done as signed, otherwise as unsigned. In general, the system prefers to interpret mixed mode as unsigned.
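
A short illustration of that preference, assuming a typical platform where int is 32 bits:

#include <stdio.h>

int main(void)
{
    int s = -1;
    unsigned int u = 1;

    /* s is converted to unsigned (becoming UINT_MAX), the comparison is
       done unsigned, and this prints 1 rather than 0. */
    printf("%d\n", s > u);

    /* Mixing integer and floating point converts the integer instead:
       this prints -0.500000. */
    printf("%f\n", s / 2.0);

    return 0;
}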

This can be the cause of confusion (it confused you, for example), and is why I'm not entirely fond of unsigned types and arithmetic in C. I'd advise sticking to signed types when practical, and not trying to control the type size as closely as you're doing.

David Thornley
I'd advise the other way around: never use signed types unless you know what you're doing and really need to. Unsigned has much cleaner and more predictable behavior.
R..
@R..: I disagree with you. The only real difference is in case of overflow, and while that's a difference in the Standard it doesn't usually show up in modern general-purpose computers. Signed arithmetic has a behavior that most programmers find more intuitive.
David Thornley
Actually, gcc makes use of the fact that signed overflow is undefined when optimizing, so it **does** make a difference in modern general-purpose computers.
R..
+1  A: 

If you want a complete answer, look at other people's suggestions. Read the C standard regarding implicit type conversion. And write test cases for your code...

It is interesting that you say this, because this code:

#include "stdio.h"
#include "stdint.h"

int main(int argc, char* argv[])
{
  int8_t x = -50;
  uint16_t y = 50;
  int32_t z = x * y;
  printf("%i\n", z);
  return 0;
}

is giving me the answer -2500.

See: http://codepad.org/JbSR3x4s

This happens for me both on Codepad.org and in Visual Studio 2010.

Merlyn Morgan-Graham
This is on an embedded processor (HCS08). What it shows me over the serial terminal (`63036`), and in the memory with the debugger (`00 00 F6 3C`) agree, so I don't know if my compiler is "broken" (with respect to the standards) as Paul suggests, or just different.
Nick T
@Nick: your compiler is not "broken" unless you want to go so far as considering all compilers with 16-bit int "broken". :-)
R..
+2  A: 

The folks here who say that values are always converted to the larger type are wrong. We cannot say anything if we don't know your platform (I see you have provided some information now). Some examples:

int = 32bits, uint16_t = unsigned short, int8_t = signed char

This results in the value -2500, because both operands are converted to int, the operation is carried out signed, and the signed result is written to an int32_t.

int = 16bits, uint16_t = unsigned int, int8_t = signed char

This results in the value 63036, because the int8_t operand is first converted to unsigned int, giving 65536 - 50 = 65486. That is then multiplied by 50, giving 3274300, and 3274300 % 65536 (unsigned arithmetic is modulo arithmetic) is 63036. That result is then written to the int32_t.

Notice that the minimum int bit-size is 16 bits. So on your 8-bit platform, this second scenario is what likely happens.
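
For anyone without a 16-bit-int target handy, that second scenario can be emulated on an ordinary 32-bit-int host by making the conversions explicit (a sketch for illustration only; on the HCS08 these conversions happen implicitly):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int8_t   x = -50;
    uint16_t y = 50;

    /* Force 16-bit unsigned arithmetic: -50 becomes 65486, then
       65486 * 50 = 3274300, and 3274300 % 65536 = 63036. */
    uint16_t product = (uint16_t)((uint16_t)x * (uint16_t)y);
    int32_t  z = product;

    printf("%ld\n", (long)z);   /* prints 63036 */
    return 0;
}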


I'm not going to try to explain the rules here; it doesn't make sense to me to repeat in great detail what is written in the Standard / Draft (which is freely available) and which is usually easy to understand.

Johannes Schaub - litb
+1  A: 

What happens to you here is integer promotion. Basically, before the computation takes place, all types of a rank smaller than int are promoted to signed or unsigned int; here to unsigned, since one of your types is an unsigned type. The computation is then performed with that width and signedness, and the result is finally assigned.

On your architecture, unsigned is probably 16 bits wide, which corresponds to the value that you see. Then, for the assignment, the computed value fits in the target type, which is even wider, so the value remains the same.
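
If the signed result is what was actually intended, widening one operand by hand before the multiplication forces the whole expression to 32 bits (a sketch; not verified on the HCS08 toolchain):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int8_t   x = -50;
    uint16_t y = 50;

    /* Casting one operand to int32_t first makes the multiplication
       happen at 32-bit signed width, so z is -2500 even where int is
       only 16 bits wide. */
    int32_t z = (int32_t)x * y;

    printf("%ld\n", (long)z);   /* prints -2500 */
    return 0;
}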

Jens Gustedt