Inspired by the question "Difference in initalizing and zeroing an array in c/c++ ?", I decided to actually examine the assembly of, in my case, an optimized release build for Windows Mobile Professional (ARM processor, from the Microsoft Optimizing Compiler). What I found was somewhat surprising, and I wonder if someone can shed some light on my questions concerning it.
These two examples are examined:
byte a[10] = { 0 };
byte b[10];
memset(b, 0, sizeof(b));
They are used in the same function, so the stack looks like this:
[ ] // padding byte to reach DWORD boundary
[ ] // padding byte to reach DWORD boundary
[ ] // b[9] (last element of b)
[ ]
[ ]
[ ]
[ ]
[ ]
[ ]
[ ]
[ ]
[ ] // b[0] = sp + 12 (stack pointer + 12 bytes)
[ ] // padding byte to reach DWORD boundary
[ ] // padding byte to reach DWORD boundary
[ ] // a[9] (last element of a)
[ ]
[ ]
[ ]
[ ]
[ ]
[ ]
[ ]
[ ]
[ ] // a[0] = sp (stack pointer, at bottom)
The generated assembly with my comments:
; byte a[10] = { 0 };
01: mov   r3, #0        // r3 = 0
02: mov   r2, #9        // 3rd arg to memset: 9 bytes, note that sizeof(a) = 10
03: mov   r1, #0        // 2nd arg to memset: 0-initializer
04: add   r0, sp, #1    // 1st arg to memset: &a[1] = a + 1, since only 9 bytes will be set
05: strb  r3, [sp]      // a[0] = r3 = 0, sets the first element of a
06: bl    memset        // continue in memset
; byte b[10];
; memset(b, 0, sizeof(b));
07: mov   r2, #0xA      // 3rd arg to memset: 10 bytes, sizeof(b)
08: mov   r1, #0        // 2nd arg to memset: 0-initializer
09: add   r0, sp, #0xC  // 1st arg to memset: sp + 12 bytes (the 10 elements
                        // of a + 2 padding bytes for alignment) = &b[0]
10: bl    memset        // continue in memset
Now, there are two things that confuse me:
- What's the point of lines 02 and 05? Why not just give &a[0] and 10 bytes to memset?
- Why aren't the padding bytes of a 0-initialized? Is that only done for padding in structs?
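For reference, the two sequences in the listing correspond roughly to the C++ below: for a, one byte store to a[0] followed by a memset over the remaining nine bytes, and for b, a single memset over all ten. This is only a sketch of what the listing does, not a claim about how the compiler arrives at it. It assumes byte is an unsigned char, and zero_like_initializer, zero_like_memset, all_zero and check_equivalent are names made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

typedef unsigned char byte;  // assumption: byte is an 8-bit unsigned char

// Mirrors lines 01-06: store a[0] directly, then memset the other 9 bytes.
void zero_like_initializer(byte (&a)[10]) {
    a[0] = 0;                              // strb r3, [sp]
    std::memset(a + 1, 0, sizeof(a) - 1);  // memset(sp + 1, 0, 9)
}

// Mirrors lines 07-10: a single memset over all 10 bytes.
void zero_like_memset(byte (&b)[10]) {
    std::memset(b, 0, sizeof(b));          // memset(sp + 12, 0, 10)
}

// Returns true if every one of the n bytes at p is zero.
bool all_zero(const byte* p, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (p[i] != 0) return false;
    }
    return true;
}

// Runs both variants on pre-filled arrays and checks that they agree.
void check_equivalent() {
    byte a[10], b[10];
    std::memset(a, 0xAA, sizeof(a));
    std::memset(b, 0xAA, sizeof(b));
    zero_like_initializer(a);
    zero_like_memset(b);
    assert(all_zero(a, sizeof(a)) && all_zero(b, sizeof(b)));
}
```

Both variants leave all ten elements zero, and neither writes to the two padding bytes that follow each array on the stack.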
Edit: I was too curious to not test the struct case:
struct Padded
{
    DWORD x;
    byte y;
};
The assembly for 0-initializing it:
; Padded p1 = { 0 };
01: mov   r3, #0
02: str   r3, [sp]
03: mov   r3, #0
04: str   r3, [sp, #4]
; Padded p2;
; memset(&p2, 0, sizeof(p2));
05: mov   r3, #0
06: str   r3, [sp]
07: andcs r4, r0, #0xFF
08: str   r3, [sp, #4]
Here we see in line 04 that the padding does indeed get written, since str (as opposed to strb) is used. Right?
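To see where the padding in Padded sits, a small sketch like the following may help. It assumes DWORD is a 32-bit unsigned integer and byte is an unsigned char (the typedefs below stand in for the Windows headers), and tail_padding, zeroed_by_memset and check_padded are hypothetical helper names. On common 32-bit ABIs this should give three trailing padding bytes, but the exact layout is up to the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

typedef std::uint32_t DWORD;  // assumption: stand-in for the Windows typedef
typedef unsigned char byte;   // assumption: 8-bit unsigned char

struct Padded
{
    DWORD x;
    byte y;
};

// Number of bytes after y that belong to the struct but to no member.
std::size_t tail_padding() {
    return sizeof(Padded) - (offsetof(Padded, y) + sizeof(byte));
}

// memset over the whole object clears members and padding alike;
// returns true if every byte of the object is zero afterwards.
bool zeroed_by_memset() {
    Padded p;
    std::memset(&p, 0, sizeof(p));
    unsigned char raw[sizeof(Padded)];
    std::memcpy(raw, &p, sizeof(p));
    for (std::size_t i = 0; i < sizeof(raw); ++i) {
        if (raw[i] != 0) return false;
    }
    return true;
}

// Sanity checks on the layout assumed above.
void check_padded() {
    assert(offsetof(Padded, y) == sizeof(DWORD));  // y directly follows x
    assert(zeroed_by_memset());
}
```

This only verifies the memset case. Whether = { 0 } also clears the padding bytes is, as far as I know, not something the language guarantees, so the str in line 04 may simply be the cheapest way to zero y.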