views:

2013

answers:

4

Given this field:

char lookup_ext[8192] = {0}; // Gets filled later

And this statement:

unsigned short *slt = (unsigned short*) lookup_ext;

What happens behind the scenes?

lookup_ext[1669] returns 67 = 0100 0011 (C), lookup_ext[1670] returns 78 = 0100 1110 (N) and lookup_ext[1671] returns 68 = 0100 0100 (D); yet slt[1670] returns 18273 = 0100 0111 0110 0001.

I'm trying to port this to C#, so besides an easy way out of this, I'm also wondering what really happens here. Been a while since I used C++ regularly.

Thanks!

+2  A: 

If I understand correctly, the cast makes the 8192-element char array be treated as an array of unsigned shorts with half as many elements, i.e. 4096 (assuming a 2-byte short).

So I don't understand what you are comparing in your question. slt[1670] should correspond to lookup_ext[1670*2] and lookup_ext[1670*2+1].
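As a rough illustration of that correspondence, the sketch below rebuilds the asker's value by hand. The helper names are hypothetical, and it assumes a 2-byte unsigned short on a little-endian machine, which the question does not actually state. Under those assumptions, 18273 (0x4761) would come from lookup_ext[3340] == 0x61 and lookup_ext[3341] == 0x47.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: index of the first char covered by slt[index],
// ASSUMING sizeof(unsigned short) == 2.
int first_byte_index(int index) {
    assert(index >= 0);
    return index * 2;
}

// Hypothetical helper: combine two bytes the way a little-endian machine
// would when it reads them as a single 16-bit value (low byte first).
std::uint16_t combine_little_endian(unsigned char low, unsigned char high) {
    return static_cast<std::uint16_t>(low | (high << 8));
}
```

With these, first_byte_index(1670) gives 3340, and combine_little_endian(0x61, 0x47) gives 18273.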

PolyThinker
Right on. That solved everything! Thanks.
hb
+5  A: 

The statement that you show doesn't cast a char to an unsigned short, it casts a pointer to a char to a pointer to an unsigned short. This means that the usual arithmetic conversions of the pointed-to-data are not going to happen and that the underlying char data will just be interpreted as unsigned shorts when accessed through the slt variable.

Note that sizeof(unsigned short) is unlikely to be one, so that slt[1670] won't necessarily correspond to lookup_ext[1670]. It is more likely - if, say, sizeof(unsigned short) is two - to correspond to lookup_ext[3340] and lookup_ext[3341].
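To make the sizeof dependence explicit, here is a minimal sketch that computes which bytes of the buffer an element would cover. ByteRange and aliased_bytes are names made up for this illustration, not anything from the original code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type: the bytes of the underlying buffer that one element covers.
struct ByteRange {
    std::size_t first;  // offset of the first byte
    std::size_t count;  // number of bytes covered
};

// Hypothetical helper: byte range covered by element `index` when a buffer of
// `buffer_size` bytes is viewed as an array of T. Uses sizeof(T) rather than
// hard-coding 2, since the size of unsigned short is implementation-defined.
template <typename T>
ByteRange aliased_bytes(std::size_t index, std::size_t buffer_size) {
    assert((index + 1) * sizeof(T) <= buffer_size);  // element must fit in the buffer
    return ByteRange{index * sizeof(T), sizeof(T)};
}
```

On a platform where sizeof(unsigned short) is two, aliased_bytes<unsigned short>(1670, 8192) yields offset 3340 and count 2, while aliased_bytes<char>(1670, 8192) yields offset 1670 and count 1.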

Do you know why the original code is using this aliasing? If it's not necessary, it might be worth trying to make the C++ code cleaner and verifying that the behaviour is unchanged before porting it.

Charles Bailey
Unsigned shorts are 16 bits on a 32-bit machine, while chars are 8 bits
Calyth
Usually, but they don't have to be. chars and unsigned shorts could all be 32-bits, for example.
Charles Bailey
A: 

When you do "unsigned short *slt = (unsigned short*) lookup_ext;", the no. of bytes equivalent to the size of (unsigned short) are picked up from the location given by lookup_ext, and stored at the location pointed to by slt. Since unsigned short would be 2 bytes, first two bytes from lookup_ext would be stored in the location given by slt.

Harty
This isn't true. See 8.5.1 [dcl.init.aggr]. If an `initializer-list` is shorter than the length of the array then the rest of the array will be value-initialized so will be zero, not junk.
Charles Bailey
I was looking at it from a pure C perspective :-)
Harty
It's the same deal with C. Only in C the initializer can't be empty, while in C++ you can do = {}
Johannes Schaub - litb
+1  A: 

Well, this statement

char lookup_ext[8192] = {0}; // Gets filled later

Creates an array either locally or non-locally, depending on where the definition occurs. Initializing it like that, with an aggregate initializer, will initialize all its elements to zero (the first explicitly, the remaining ones implicitly). I therefore wonder why your program outputs non-zero values, unless the fill happens before the read, in which case it makes sense.

unsigned short *slt = (unsigned short*) lookup_ext;

That will interpret the bytes making up the array as unsigned short objects when you read from that pointer's target. Strictly speaking, the above is undefined behavior, because you can't be sure the array is suitably aligned, and you would read through a pointer whose target type differs from the type of the original object (char <-> unsigned short). In C++, the only portable way to read the value out of some other POD (plain old data: broadly speaking, the structs and simple types, such as short, that are possible in C too) is to use library functions such as memcpy or memmove.

So if you read *slt above, you would interpret the first sizeof(*slt) bytes of the array and try to read them as an unsigned short (that's called type punning).
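A minimal sketch of the memcpy route, for comparison with the cast. The function name load_ushort and its parameters are invented for this example. It copies the bytes into a real unsigned short, so there is no alignment or aliasing problem:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: read the index-th unsigned short out of a char buffer
// by copying its bytes, instead of dereferencing a casted pointer.
unsigned short load_ushort(const char* bytes, std::size_t byte_count,
                           std::size_t index) {
    // The requested element must lie entirely inside the buffer.
    assert((index + 1) * sizeof(unsigned short) <= byte_count);

    unsigned short value;
    std::memcpy(&value, bytes + index * sizeof(unsigned short), sizeof value);
    return value;
}
```

A call such as load_ushort(lookup_ext, sizeof lookup_ext, 1670) reads the same bytes that slt[1670] would. The result still depends on the machine's byte order, just as it does with the cast.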

Johannes Schaub - litb
It gets filled later; I didn't paste that part of the code because I didn't think it was really relevant to the problem itself. Sorry.
hb