ansaurus

Question

Trouble understanding gcc's assembly output

Answer 1

A:

I'm not sure what's not to understand, unless I'm missing something.

The first 3 instructions load a byte from old_string into dx and store it to new_string.

The next 3 instructions use what's already in dx, combine old_string[i+1] with it, and store the result as a 16-bit value (ax) to new_string.
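For reference, here is a rough C sketch of the kind of loop those six instructions could come from. The question's source is not shown here, so the function name pack_pairs, its signature, and the loop shape are all assumptions; only the two stores to new_string[x] come from the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reconstruction, not the asker's actual code: every pair of
   bytes in old_string becomes one 16-bit element of new_string.
   Assumes plain ASCII input (byte values 0..127), so sign extension and
   shifting negative values do not come into play. */
void pack_pairs(short *new_string, const char *old_string, size_t n_bytes)
{
    size_t i, x;

    assert(n_bytes % 2 == 0);  /* bytes are consumed two at a time */

    for (i = 0, x = 0; i < n_bytes; i += 2, x++) {
        /* First store: the low byte, widened to 16 bits. */
        new_string[x] = old_string[i];
        /* Second store, same address: the next byte shifted into the high half. */
        new_string[x] |= old_string[i + 1] << 8;
    }
}
```

Called with the bytes 01 02 03 04, this would leave 0x0201 and 0x0403 in new_string. Both statements write new_string[x], which matches the two movw instructions to the same address.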

Jim Buck 2009-05-24 02:48:25

Both sections move a 16-bit value to the same location in memory: movw %dx, (%ecx,%ebx,2), then movw %ax, (%ecx,%ebx,2). Neither ecx nor ebx changes between these instructions.

Carson Myers 2009-05-24 03:01:42

yes, because your C code is storing to new_string[x] twice - i would hope the memory location does not change :)

Jim Buck 2009-05-24 14:57:19

I misunderstood the data being moved because of the use of 32 and 16 bit registers to store 16 and 8 bit values. I wrongly assumed that it was storing the same 8 bit value in ax to the same 8 bit location in memory twice. It was a mistake about data size.

Carson Myers 2009-05-25 06:35:37

Answer 2

A:

Also, it shifts old_string[i+1] to the high-order dword of eax, then ors edx (new_string[x]) into it... then puts ax into the memory! Wouldn't ax just contain what was already in new_string[x]? so it saves the same thing to the same place in memory twice?

Now you see why optimizers are a Good Thing. That kind of redundant code shows up pretty often in unoptimized, generated code, because the generated code comes more or less from templates that don't "know" what happened before or after.

Charlie Martin 2009-05-24 02:50:14

this is compiled with the -O3 flag. Also, the redundancy isn't why I'm perplexed, it's the fact that it should be moving 16 bit value 01 into 32 bit value, then bit-shifting and or-ing 16 bit value 02 into the same 32 bit value, creating 0201 in the 32 bit value... but it looks like it's just putting 01 into the same place twice, leaving xx01 as the 32 bit value.

Carson Myers 2009-05-24 02:56:17

Since it's char* and short*, it's dealing with 8-bit and 16-bit values, not 16-bit and 32-bit values.

Jim Buck 2009-05-24 02:58:40

but eax is 32 bit and ax is 16 bit, right? The code is using 32 bit and 16 bit registers, rather than using say, ax to handle a short, and al to handle a char.

Carson Myers 2009-05-24 03:05:37

Actually, the reason it's doing this redundant storing is because according to C's alias rules, the two arrays may overlap (char* aliases all types), and so the compiler must be sure to save the value of new_string[i] before dereferencing old_string[i+1]. This can be avoided with proper application of restrict.
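A sketch of what "proper application of restrict" might look like, assuming a hypothetical function that takes both arrays as parameters (the names and signatures below are made up for illustration). restrict is a C99 keyword and goes after the '*', because it qualifies the pointer itself. The second function is an alternative that does not depend on restrict at all: it builds the value in a local and stores once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature. restrict promises the compiler that the two
   arrays do not overlap; calling this with overlapping arrays is undefined.
   Assumes byte values 0..127. */
void pack_pairs_restrict(short *restrict new_string,
                         const char *restrict old_string,
                         size_t n_bytes)
{
    size_t i, x;

    assert(n_bytes % 2 == 0);

    for (i = 0, x = 0; i < n_bytes; i += 2, x++) {
        new_string[x] = old_string[i];
        new_string[x] |= old_string[i + 1] << 8;
    }
}

/* Alternative without restrict: both bytes are read before the single
   store, so the source itself contains only one write per element. */
void pack_pairs_local(short *new_string, const char *old_string, size_t n_bytes)
{
    size_t i, x;

    assert(n_bytes % 2 == 0);

    for (i = 0, x = 0; i < n_bytes; i += 2, x++) {
        short packed = old_string[i];
        packed |= old_string[i + 1] << 8;
        new_string[x] = packed;
    }
}
```

For non-overlapping arrays both functions produce the same output. Whether a particular gcc version then emits a single movw is something to confirm in the -S output rather than assume. Note that pack_pairs_local behaves differently from the two-store version if the arrays do overlap, since it reads old_string[i + 1] before anything is written.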

bdonlan 2009-05-24 04:23:52

ah--I'm compiling C89 though--but say I use C99--one of the arrays is a function argument. Does that change anything? I switched to C99 to try it, and put restrict on new_string (old_string is the parameter to this function). It still did both memory writes, and when I try and put "restrict" on old_string it says it's an invalid use.

Carson Myers 2009-05-24 04:48:33

Answer 3

+1 A:

Okay, so it was pretty simple after all. I figured it out with a pen and paper, writing down each step, what it did to each register, and then wrote down the contents of each register given an initial starting value...

What got me was that it was using 32 bit and 16 bit registers for 16 and 8 bit data types... This is what I thought was happening:

first value put into memory as, say, 0001 (I was thinking 01).
second value (02) loaded into 32 bit register (so it was like, 00000002, I was thinking, 0002)
second value shifted left 8 bits (00000200, I was thinking, 0200)
first value (00000001, I thought 0001) or'd into second value (00000201, I thought 0201)
16 bit register put into memory (0201, I was thinking, just 01 again).
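The same trace can be written out in C with fixed-width types, as a rough model of the registers. The function name and the use of unsigned types are assumptions made to keep the example simple; the real code used sign-extending loads from char.

```c
#include <assert.h>
#include <stdint.h>

/* Models the steps above: 8-bit values live in 32-bit registers, and only
   the low 16 bits (ax) are written to memory at the end. */
uint16_t combine_low_word(uint8_t first, uint8_t second)
{
    uint32_t edx = first;   /* 0x00000001 for first = 0x01 */
    uint32_t eax = second;  /* 0x00000002 for second = 0x02 */

    eax <<= 8;              /* 0x00000200 */
    eax |= edx;             /* 0x00000201 */

    /* With 8-bit inputs the upper half of eax is never touched. */
    assert((eax >> 16) == 0);

    return (uint16_t)eax;   /* ax: the low 16 bits, 0x0201 */
}
```

With first = 0x01 and second = 0x02 this returns 0x0201, so the 16-bit store holds both bytes and not just 01 again.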

I didn't get why it wrote it to memory twice though, or why it was using 32 bit registers (well, actually, my guess is that a 32 bit processor is way faster at working with 32 bit values than it is with 8 and 16 bit values, but that's a totally uneducated guess), so I tried rewriting it:

movl -20(%ebp), %esi       #gets pointer to old_string
movsbw (%edi,%esi),%dx     #old_string[i] -> dx (0001)
movsbw 1(%edi,%esi),%ax    #old_string[i + 1] -> ax (0002)
salw $8, %ax               #shift ax left (0200)
orw %dx, %ax               #or dx into ax (0201)
movw %ax,(%ecx,%ebx,2)     #doesn't write to memory until end

This worked exactly the same.

I don't know if this is an optimization or not (aside from taking one memory write out, which obviously is), but if it is, I know it's not really worth it and didn't gain me anything. In any case, I get what this code is doing now, thanks for the help all.

Carson Myers 2009-05-24 03:43:51

See bdonlan's comment to Charlie Martin's answer -- it's writing to memory twice in case new_string[i] happens to refer to the same memory location as old_string[i+1] (so-called _aliasing_). You can eliminate the redundant store with the proper use of the C99 keyword 'restrict'.

Adam Rosenfield 2009-05-24 04:35:07
