tags:

views:

489

answers:

3

While writing some C code, I decided to compile it to assembly and read it--I just sort of, do this from time to time--sort of an exercise to keep me thinking about what the machine is doing every time I write a statement in C.

Anyways, I wrote these two lines in C

asm(";move old_string[i] to new_string[x]");
new_string[x] = old_string[i];
asm(";shift old_string[i+1] into new_string[x]");
new_string[x] |= old_string[i + 1] << 8;

(old_string is an array of char, and new_string is an array of unsigned short, so given two chars, 42 and 43, this will put 4342 into new_string[x])
Which produced the following output:

#move old_string[i] to new_string[x]

movl    -20(%ebp), %esi         #put address of first char of old_string in esi
movsbw  (%edi,%esi),%dx         #put first char into dx
movw    %dx, (%ecx,%ebx,2)      #put first char into new_string

#shift old_string[i+1] into new_string[x]

movsbl  1(%esi,%edi),%eax       #put old_string[i+1] into eax
sall    $8, %eax                #shift it left by 8 bits
orl     %edx, %eax              #or edx into it
movw    %ax, (%ecx,%ebx,2)      #?

(I'm commenting it myself, so I can follow what's going on). I compiled it with -O3, so I could also sort of see how the compiler optimizes certain constructs. Anyways, I'm sure this is probably simple, but here's what I don't get:

the first section copies a char out of old_string[i], and then movw's it (from dx) to (%ecx,%ebx). Then the next section, copies old_string[i+1], shifts it, ors it, and then puts it into the same place from ax. It puts two 16 bit values into the same place? Wouldn't this not work?

Also, it shifts old_string[i+1] to the high-order dword of eax, then ors edx (new_string[x]) into it... then puts ax into the memory! Wouldn't ax just contain what was already in new_string[x]? so it saves the same thing to the same place in memory twice?

Is there something I'm missing? Also, I'm fairly certain that the rest of the compiled program isn't relevant to this snippet... I've read around before and after, to find where each array and different variables are stored, and what the registers' values would be upon reaching that code--I think that this is the only piece of the assembly that matters for these lines of C.

-- oh, turns out GNU assembly comments are started with a #.

A: 

I'm not sure what's not to understand, unless I'm missing something.

The first 3 instructions load a byte from old_string into dx and stores that to your new_string.

The next 3 instructions utilize what's already in dx and combines old_string[i+1] with it, and stores it as a 16-bit value (ax) to new_string.

Jim Buck
both sections move a 16 bit value to the same location in memory: movw %dx, (%ecx,%ebx,2) then movw %ax, (%ecx,%ebx,2) neither ecx or ebx changes between these instructions.
Carson Myers
yes, because your C code is storing to new_string[x] twice - i would hope the memory location does not change :)
Jim Buck
I misunderstood the data being moved because of the use of 32 and 16 bit registers to store 16 and 8 bit values. I wrongfully assumed that it was storing the same 8 bit value in ax to the same 8 bit location in memory twice. It was a mistake of data size.
Carson Myers
A: 

Also, it shifts old_string[i+1] to the high-order dword of eax, then ors edx (new_string[x]) into it... then puts ax into the memory! Wouldn't ax just contain what was already in new_string[x]? so it saves the same thing to the same place in memory twice?

Now you see why optimizers are a Good Thing. That kind of redundant code shows up pretty often in unoptimized, generated code, because the generated code comes more or less from templates that don't "know" what happened before or after.

Charlie Martin
this is compiled with the -O3 flag. Also, the redundancy isn't why I'm perplexed, it's the fact that it should be moving 16 bit value 01 into 32 bit value, then bit-shifting and or-ing 16 bit value 02 into the same 32 bit value, creating 0201 in the 32 bit value... but it looks like it's just putting 01 into the same place twice, leaving xx01 as the 32 bit value.
Carson Myers
Since it's char* and short*, it's dealing with 8-bit and 16-bit values, not 16-bit and 32-bit values.
Jim Buck
but eax is 32 bit and ax is 16 bit, right? The code is using 32 bit and 16 bit registers, rather than using say, ax to handle a short, and al to handle a char.
Carson Myers
Actually, the reason it's doing this redundant storing is because according to C's alias rules, the two arrays may overlap (char* aliases all types), and so the compiler must be sure to save the value of new_string[i] before dereferencing old_string[i+1]. This can be avoided with proper application of restrict.
bdonlan
ah--I'm compiling C89 though--but say I use C99--one of the arrays is a function argument. Does that change anything? I switched to C99 to try it, and put restrict on new_string (old_string is the parameter to this function). It still did both memory writes, and when I try and put "restrict" on old_string it says it's an invalid use.
Carson Myers
+1  A: 

Okay, so it was pretty simple after all. I figured it out with a pen and paper, writing down each step, what it did to each register, and then wrote down the contents of each register given an initial starting value...

What got me was that it was using 32 bit and 16 bit registers for 16 and 8 bit data types... This is what I thought was happening:

  • first value put into memory as, say, 0001 (I was thinking 01).
  • second value (02) loaded into 32 bit register (so it was like, 00000002, I was thinking, 0002)
  • second value shifted left 8 bits (00000200, I was thinking, 0200)
  • first value (0000001, I thought 0001) xor'd into second value (00000201, I thought 0201)
  • 16 bit register put into memory (0201, I was thinking, just 01 again).

I didn't get why it wrote it to memory twice though, or why it was using 32 bit registers (well, actually, my guess is that a 32 bit processor is way faster at working with 32 bit values than it is with 8 and 16 bit values, but that's a totally uneducated guess), so I tried rewriting it:

movl -20(%ebp), %esi       #gets pointer to old_string
movsbw (%edi,%esi),%dx     #old_string[i] -> dx (0001)
movsbw 1(%edi,%esi),%ax    #old_string[i + 1] -> ax (0002)
salw $8, %ax               #shift ax left (0200)
orw %dx, %ax               #or dx into ax (0201)
movw %ax,(%ecx,%ebx,2)     #doesn't write to memory until end

This worked exactly the same.

I don't know if this is an optimization or not (aside from taking one memory write out, which obviously is), but if it is, I know it's not really worth it and didn't gain me anything. In any case, I get what this code is doing now, thanks for the help all.

Carson Myers
See bdonian's comment to Charlie Martin's answer -- it's writing to memory twice in case new_string[i] happens to refer to the same memory location as old_string[i+1] (so-called _aliasing_). You can eliminate the redundant store with the proper use of the C99 keyword 'restrict'.
Adam Rosenfield