#include <stdio.h>

int main(void){
  unsigned a[3][4] = {
    {2,23,6,7},
    {8,5,1,4},
    {12,15,3,9}
 };
 printf("%u",*((int*)(((char*)a)+4)));
 return 0;
}

The output on my machine is the value at a[0][1], i.e. 23. Could somebody explain how this works?

Edit: Rolling back to the old yucky code, exactly as it was presented to me :P

+10  A: 

So you have your array in memory as so:

2, 23, 6, 7, 8...

What this does is cast the array to a char*, which lets you access individual bytes, and it points here:

2, 23, 6, 7, 8...
^

It then adds four bytes, moving it over to the next value (more on this later).

2, 23, 6, 7, 8...
   ^

Then it turns it into an int* and dereferences it, getting the value 23.


There are technically three things wrong with this code.

The first is that it assumes that an unsigned is 4 bytes in size. (Hence the + 4). But this isn't necessarily true! Better would have been + sizeof(unsigned), ensuring correctness no matter what size unsigned happens to be.

The second problem is the cast to int*: the original array was unsigned, but the value is being read as an int. There exist values in the unsigned range that int cannot represent (because half of an int's range is negative). So if one of the values in the array were not representable as an int (meaning the value was greater than INT_MAX), you'd get the wrong value. Better would be to cast to unsigned*, to maintain the correct type.

The last thing is the format specifier. The specifier for signed integers is %d, but the code uses %u, which is for unsigned integers. In effect, even though casting to int* was wrong, printf is going to interpret that value as an unsigned again, restoring its integrity. By fixing problem two, problem three fixes itself.

There is a hidden fourth problem: The code sucks. This may be for learning purposes, but yuck.

GMan
Got it ! Thanks :)
nthrgeek
@nthgreek: Pointer arithmetic: http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/BitOp/pointer.html
Felix Kling
@GMan : `There is a hidden fourth problem: The code sucks. This may be for learning purposes, but yuck.` -- Agreed :P
nthrgeek
Thanks,Felix :)
nthrgeek
another way to fix first problem instead of adding sizeof(unsigned) : cast to (unsigned*) instead of (char*) and add 1 (move address by one unsigned)... it will also avoid the second problem (no need to cast again) and the third one... and arguably the fourth ? (well maybe not... direct use of a[0][1] is obviously still better) ;-)
kriss
@kriss: But then all you're doing is fixing the code. :P
GMan
@GMan: well, yes. That's also what you do with the sizeof bit.
kriss
+2  A: 

It first implicitly converts the array a into a pointer to its beginning. Then it casts the pointer to char* and increments the value by 4. The value 4 happens to be the same as sizeof(unsigned) on your system, so it has actually moved one element forward from the beginning. Then it casts the address to int* and reads the value it points to (operator*). The resulting value is printed as an unsigned integer, which works because int and unsigned are the same size.

The layout of the static 2D array in memory is so that all the elements are actually stored in sequence as a one-dimensional array.

Tronic
+1  A: 

unsigned int is of size 4. i.e. sizeof(unsigned) == 4

It can hold 4 chars, each of which is a byte [in C, not in Java/C# etc.].

The array is allocated consecutively in memory. When you treat the unsigned array as a char*, you need to move the pointer 4 steps to reach the next unsigned value in the array.

Fakrudeen
`sizeof(unsigned) == 4` *in this case*, not necessarily everywhere. That may be what you meant to say.
GMan
+1  A: 

First, you create a 2-dim array with size 3x4.

After ((char*)a) you can work with this as a char array. Let's designate it as b.

((char*)a)+4 is the same as &b[4]: it points to the 5th element of the char array (remember that arrays in C are 0-based), or simply to the 5th byte.

When you convert the pointer back to int*, the i-th element of the int array begins at byte i*4 if sizeof(int) == 4. So the second element of the int array begins at the 5th byte, which is where your pointer points. The compiler takes the 4 bytes starting at offset 4 and treats them as an int. That's exactly a[0][1].

flashnik
+7  A: 

The array:

unsigned a[3][4] = {
    {2,23,6,7},
    {8,5,1,4},
    {12,15,3,9}
};

is laid out in memory as (assuming a itself is at memory location 0x8000, a particular endian-ness and for a four-byte int):

0x8000  0  0  0  2
0x8004  0  0  0 23
0x8008  0  0  0  6
0x800C  0  0  0  7
0x8010  0  0  0  8
0x8014  0  0  0  5
0x8018  0  0  0  1
0x801C  0  0  0  4
0x8020  0  0  0 12
0x8024  0  0  0 15
0x8028  0  0  0  3
0x802C  0  0  0  9

Breaking down the expression:

*((int*)(((char*)a)+4))
  • ((char*)a) gives you a char pointer.
  • +4 advances that pointer by 4 bytes (4 * sizeof(char))
  • (int*) turns the result of that back into an int pointer.
  • * dereferences that pointer to extract the int.

This is a very silly way of doing it since it's inherently non-portable (to environments where an int is two or eight bytes, for example).

paxdiablo
+1 Just because your array memory layout looks awesome-sauce compared to mine.
GMan
Nice explanation +1 ! :)
nthrgeek
@ GMAN: yes indeed paxdiablo's `array memory layout looks awesome` but I got the solution after reading this line only `What this does is cast the array to a char*, which lets you access individual bytes` :)
nthrgeek
@nthgeek: Now the only thing to keep in mind is that while `char` is the smallest unit to the compiler, you also can't necessarily assume it's 8-bits. And while @pax mentioned `4 * sizeof(char)`, because `char`'s are the smallest `sizeof(char)` is *always* one.
GMan
It's not relevant whether a char is 8 bits. A byte in the ISO standard is defined as the width of a char rather than as 8 bits. The only reason I mentioned sizeof(char) is because people often erroneously believe that it may not always be 1 in some cases and I was showing the relationship between the sizes of int and char. But your point is a good one. In any case, the layout is also irrelevant for the accepted answer since it was GMan's that twigged the OP as to why it works hence rightly the accepted one. All I'll hope for is some upvotes in the future because my answer is still helpful :-)
paxdiablo