Let's say I have a 4-byte integer and I want to cast it to a 2-byte short integer. Am I right that on both little-endian and big-endian systems the short integer will consist of the 2 least significant bytes of the 4-byte integer?

Second question:
What will be the result of the following code on a little-endian and on a big-endian processor?

int i = some_number;  
short s = *(short*)&i;

IMHO on a big-endian processor the 2 most significant bytes would be copied, and on a little-endian processor the 2 least significant bytes would be copied.

+1  A: 
  1. Yes. When you convert values, you don't have to worry about endianness.

  2. Yes. When you convert pointers, you do.
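
To make the distinction concrete, here is a minimal sketch. The function names are made up for illustration, and it uses unsigned fixed-width types so the value conversion is fully defined. It reads the bytes with memcpy rather than a pointer cast, which should show the same memory-order effect without running into aliasing problems.

```c
#include <stdint.h>
#include <string.h>

/* Value conversion: keeps the low 16 bits on every platform. */
uint16_t low_half_by_value(uint32_t v)
{
    return (uint16_t)v;
}

/* Memory reinterpretation: copies the two bytes stored at the lowest
   address of v, so the result depends on the host byte order. */
uint16_t first_two_bytes(uint32_t v)
{
    uint16_t out;
    memcpy(&out, &v, sizeof out);
    return out;
}

/* Returns 1 on a little-endian host, 0 otherwise
   (mixed-endian hosts are ignored in this sketch). */
int is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

For 0x11223344, low_half_by_value gives 0x3344 everywhere, while first_two_bytes gives 0x3344 on a little-endian host and 0x1122 on a big-endian one.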

Seva Alekseyev
And essentially the only time you ever need to convert pointers in such a way that it matters, is when serializing data (i.e. file or network I/O). But in that case, you really do have to worry, since the reader may use a different endianness.
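
For the serialization case, one common approach is to build the byte sequence with shifts, so the layout on disk or on the wire does not depend on the host. A small sketch, with hypothetical helper names and big-endian ("network") order chosen arbitrarily:

```c
#include <stdint.h>

/* Stores v in buf[0..3], most significant byte first. */
void put_u32_be(unsigned char buf[4], uint32_t v)
{
    buf[0] = (unsigned char)((v >> 24) & 0xFFu);
    buf[1] = (unsigned char)((v >> 16) & 0xFFu);
    buf[2] = (unsigned char)((v >> 8) & 0xFFu);
    buf[3] = (unsigned char)(v & 0xFFu);
}

/* Rebuilds the value from buf[0..3]; works the same on any host. */
uint32_t get_u32_be(const unsigned char buf[4])
{
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

Because the shifts operate on values rather than on memory, the reader and the writer can have different endianness and still agree.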
Josef Grahn
+1  A: 

First of all, you may already know this, but let me mention that int is not guaranteed to be 4 bytes, nor short 2 bytes, on all platforms.

If in your first question you mean something like this:

int i = ...;
short s = (short)i;

then yes, s will contain the lower byte(s) of i.

I think the answer to your second question is also yes; at the byte level the endianness of the system does come into play.

Péter Török
+3  A: 

Am I right that in both cases the short integer will consist of the 2 least significant bytes of this 4-byte integer?

Yes, by definition.

The difference between bigE and littleE is whether the least significant byte is at the lowest address or not. On a little-endian processor, the lowest addresses hold the least significant bits; x86 does it this way.

These give the same result on little E.

short s = (short)i;
short s = *(short*)&i;

On a big-endian processor, the highest addresses hold the least significant bits; the 68000 and PowerPC do it this way (actually PowerPC can be both, but PPC machines from Apple use bigE).

These give the same result on big E.

short s = (short)i;
short s = ((short*)&i)[1]; // (assuming i is 4 byte int)

So, as you can see, little endian allows you to get at the least significant bits of an operand without knowing how big it is. Little E has advantages for preserving backward compatibility.
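
The index difference can also be checked at run time. This is only a sketch with made-up function names, using memcpy and unsigned fixed-width types in place of the pointer cast:

```c
#include <stdint.h>
#include <string.h>

/* Splits v into two 16-bit halves in memory order:
   halves[0] is the half stored at the lower address. */
void split_halves(uint32_t v, uint16_t halves[2])
{
    memcpy(halves, &v, sizeof(uint32_t));
}

/* Index (0 or 1) of the half holding the least significant 16 bits:
   0 on a little-endian host, 1 on a big-endian host. */
int low_half_index(void)
{
    uint16_t halves[2];
    split_halves(0x00000001u, halves);
    return halves[0] == 1 ? 0 : 1;
}
```

On little E the low half is always at index 0, whatever the size of the wider type; on big E the index depends on how wide that type is.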

So what's the advantage of big endian? It creates hex dumps that are easier to read.

Really, the engineers at Motorola thought that easing the burden of reading hex dumps was more important than backward compatibility. The engineers at Intel believed the opposite.

John Knoeller
I'm not sure the backward compatibility argument is very strong, since programs that incorrectly assume that some variables have a specific size will more than likely break for other reasons, even on a little endian machine (e.g. when dealing with arrays). I think it's got more to do with making certain low level optimization and other tricks easier, inside the compiler or hardware.
Josef Grahn
@Josef: Yes, I'm sure that there were lots of little reasons why little E was preferred by hardware designers, and compatibility may not have been that important to them. But it turns out that it DID matter a lot when moving from the 8086 to the 286/386. Code written for the 8088 still runs on modern x86/x64 processors.
John Knoeller
Yes, the x86 architectures are backwards compatible at the binary level, by design (for good and bad), by always retaining the entire instruction set of previous generations. Admittedly, this borders my area of expertise, but does little endian help much in this area, other than perhaps simplifying the circuitry logic slightly for the mov instructions? You still have different opcodes for different data sizes (byte, word, dword etc.).
Josef Grahn
Big E and Little E decisions have little to do with backwards compatibility. In all processors, keeping the same endianness as the previous version helps promote code reuse. The choice of Big E vs. Little E on the original processors has more to do with human readability (Big E) vs. simplifying circuitry (Little E). Intel chose Little E, Motorola chose Big E. Many current processors can be switched on the fly.
Thomas Matthews
It's really two sides of the same coin. It's much more difficult _and_ requires more circuitry to support a 16-bit instruction set on a 32-bit processor using bigE rather than littleE.
John Knoeller
I still don't see how backwards compatibility gets into the picture. Big endian processors support "smaller than word" memory operations as well, with or without the legacy. The x86 backwards compatibility has more to do with its variable instruction length.
Josef Grahn
@John Knoeller: `"It's much more difficult and requires more circuitry if you want to support a 16 bit instruction set on a 32 bit processor using bigE rather than littleE"` *Can you say why?*
Lazer
@Lazer: I said why. On BigE, you can't load the low 16 bits of a value unless you know the address _and the size_. On LittleE you only need to know the address. BigE seems more 'natural' to us because we assign more significance to the leftmost digits of numbers. They grow to the left while other data grows to the right. BigE caters to that little bit of irrationality, and there is a price to be paid in complexity. LittleE instead caters to the needs of the hardware at the expense of humans. It wouldn't exist if it weren't better at that than BigE.
John Knoeller
A: 

You should be aware that your second example

int i = some_number;  
short s = *(short*)&i;

is not valid C code as it violates strict aliasing rules. It is likely to fail under some optimization levels and/or compilers.

Use unions for that:

union {
   int   i;
   short s;
} my_union;

my_union.i = some_number;
printf("%d\n",my_union.s);

Also, as others noted, you can't assume that your ints will be 4 bytes. Better use int32_t and int16_t when you need specific sizes.
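
A sketch of the same union with fixed-width types, assuming a C99-or-later compiler (C++ treats this kind of union read differently); the type and function names are made up for illustration:

```c
#include <stdint.h>

/* Both members start at the same address, so s overlays the two
   bytes of i that sit at the lowest address. */
union int_pun {
    int32_t i;
    int16_t s;
};

/* Returns the 16-bit value stored in the first two bytes of v:
   the low half on little endian, the high half on big endian. */
int16_t first_short_of(int32_t v)
{
    union int_pun u;
    u.i = v;
    return u.s;
}
```

With 0x11223344 this should give 0x3344 on a little-endian machine and 0x1122 on a big-endian one.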

asr
How does it violate strict aliasing?
Roger Pate
Writing to one union member and reading from another is UB.
dalle
This is not an aliasing violation. It just subverts the type system; it never actually creates a pointer.
John Knoeller
A: 

If you really want to convert an int to a short, then just do that:

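#include <limits.h>  /* SHRT_MIN, SHRT_MAX */

/* Clamp n to the range of short before converting. */
short int_to_short(int n) {
  if (n < SHRT_MIN) return SHRT_MIN;
  if (n > SHRT_MAX) return SHRT_MAX;
  return (short)n;
}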

You don't even have to worry about endianness; the language handles that for you. If you are sure n is within the range of a short, then you can skip the check, too.

Roger Pate
I'm not worried, I was just curious what the result would be.
Tomek Tarczynski