At school we have been programming in MIPS assembly language for some time. I'm interested in delving into x86 assembly, and I have heard that it is somewhat harder (even my MIPS textbook says this).

What core information should I know as a MIPS programmer before making the dive into the x86 world?

+5  A: 

x86 has a very limited set of available registers compared to most other architectures. This doesn't really make the assembly language any harder to learn, but sometimes makes it harder to implement code in practice.

Also, because of the x86 history of strong backward compatibility, the instruction set is not terribly symmetric (definitely pre-RISC) and there can be lots of exceptions to the rule and corner cases to pay attention to.

Greg Hewgill
Yeah, but it's not that limited compared to MIPS. :)
BobbyShaftoe
@BobbyShaftoe, are you an assembly programmer? MIPS has 32 general purpose registers and x86 has 8 as far as I know.
Simucal
+7  A: 

The biggest things to keep in mind are:

  • Few general purpose registers, and the ones you do have are not pure GP -- many instructions require you to use certain registers for a specific purpose.
  • x86 instructions use a two-operand form rather than a three-operand form, which can make certain operations more complex. That is, instead of add r0, r1, r2 (r0 = r1 + r2), you do add eax, ebx (eax += ebx).
  • Segments in protected mode (all 32-bit code outside of DOS, effectively) make your memory addressing scheme extremely non-obvious, which can bite you in the ass when you're starting out.
  • You're going to be looking up the flags set/cleared by instructions all the time. Learn to love the Intel manuals.
  • Edit, one thing I forgot: The use of subregisters (e.g. ah to access the high 8 bits of the low 16 bits of the eax register) can make tracking manipulations to your registers very difficult. Be careful and comment liberally until you get things down.
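To make the subregister point concrete, here is a minimal sketch in Python (not assembly) that models eax as a 32-bit integer and shows which bits al, ah and ax alias. The helper names (set_al, set_ah, get_ax and so on) are made up for this illustration and are not part of any real tool.

```python
# Toy model of x86 subregister aliasing (illustration only, not an emulator).
# eax is a 32-bit value; ax is its low 16 bits; al is bits 0-7; ah is bits 8-15.

def set_al(eax, value):
    """Replace bits 0-7 of eax, leaving bits 8-31 unchanged (like mov al, value)."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)

def set_ah(eax, value):
    """Replace bits 8-15 of eax, leaving the rest unchanged (like mov ah, value)."""
    return (eax & 0xFFFF00FF) | ((value & 0xFF) << 8)

def set_ax(eax, value):
    """Replace bits 0-15 of eax, leaving bits 16-31 unchanged (like mov ax, value)."""
    return (eax & 0xFFFF0000) | (value & 0xFFFF)

def get_al(eax):
    return eax & 0xFF

def get_ah(eax):
    return (eax >> 8) & 0xFF

def get_ax(eax):
    return eax & 0xFFFF

eax = 0x11223344
eax = set_ah(eax, 0xAB)    # only the second-lowest byte changes
print(hex(eax))            # 0x1122ab44
eax = set_al(eax, 0xFF)    # only the lowest byte changes
print(hex(get_ax(eax)))    # 0xabff
```

Writing to ah or al silently changes what ax and eax hold, which is why commenting each partial write helps while you are still getting used to it.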

Other than that, x86 is pretty straightforward. When you learn to abuse instructions like 'lea' and 'test', you learn to love it. Also, protip: Intel will send you copies of the instruction set manuals for free; you don't even have to pay for shipping. Look around their site for the fulfillment email and request the books by SKU.

Cody Brocious
+1  A: 

x86 has more complex instructions than MIPS, so there is probably a single x86 instruction for many common MIPS sequences (most notably memory addressing). The small number of registers is certainly a disadvantage, but in both architectures there are conventions that pretty much restrict the number you can use freely to 4-5; it is just more pronounced in x86. x86 also has more exceptions for register usage than MIPS that you have to keep in mind, but nothing worth whining about constantly.

Speaking from experience, either language has about the same difficulty to learn, conventions included. Maybe x86 is a tad easier, considering abundant online resources and its popularity.

The difficult part about x86 is generating binary, because of its variable length instructions and several addressing modes. Most often, you don't ever need to do it anyway.

I can certainly recommend learning a more complex instruction set architecture than MIPS.

And, this is important, don't become part of the religious war between RISC and CISC...

artificialidiot
A: 

I've been learning x86 and x86_64 in order to write an assembler myself. If you aren't going to write an assembler yourself, then some of what I tell you here will be pretty much useless. I don't know MIPS myself, though.

x86 indirect addressing is a complex thing. In a single instruction, you can do these:

mov reg, [reg+offset]
mov reg, [reg*scale+base register+offset] # where scale can be 1, 2, 4 or 8.
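As a rough illustration of what the second form computes, the sketch below works out the address such an operand refers to. It is written in Python, not assembly, and effective_address is a hypothetical helper name; it assumes 32-bit addresses that wrap around.

```python
def effective_address(base=0, index=0, scale=1, disp=0, bits=32):
    """Compute base + index*scale + disp, wrapped to the address width.

    Models the [index*scale + base + offset] operand form; the function
    name and parameters are illustrative, not any real assembler API.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & ((1 << bits) - 1)

# mov eax, [ebx + esi*4 + 16] with ebx = 0x1000 and esi = 3,
# e.g. element 3 of an array of 4-byte values starting 16 bytes into a struct.
print(hex(effective_address(base=0x1000, index=3, scale=4, disp=16)))  # 0x101c
```

On MIPS the same access would typically take a shift, an add and then a load with a constant offset.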

The instruction encoding is complex because of this, but it is consistent for every instruction that encodes its operands this way. You might want to read about this on sandpile.org. If you want to know more about the encoding, you can always ask me. Another annoying encoding-related detail is the prefixes, which can change the meaning of an instruction a lot. For instance, with 0x66 in front (if I remember right), some instructions operate on 16-bit GPRs instead of 32-bit ones.
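A small sketch of the 0x66 prefix in action, assuming 32-bit mode and using only the B8+r form of mov (load an immediate into a register). It is Python rather than a real assembler, and encode_mov_imm is a made-up name for this illustration. The register number follows the usual encoding order (eax = 0, ecx = 1, edx = 2, ebx = 3, ...).

```python
import struct

def encode_mov_imm(reg_index, imm, bits=32):
    """Encode `mov reg, imm` for 32-bit mode using opcode B8+reg_index.

    With bits=16 the operand-size prefix 0x66 is emitted first and the
    immediate shrinks to 2 bytes. Illustrative sketch only.
    """
    if not 0 <= reg_index <= 7:
        raise ValueError("reg_index must be in 0..7")
    opcode = bytes([0xB8 + reg_index])
    if bits == 32:
        return opcode + struct.pack("<I", imm & 0xFFFFFFFF)
    if bits == 16:
        return b"\x66" + opcode + struct.pack("<H", imm & 0xFFFF)
    raise ValueError("bits must be 16 or 32")

print(encode_mov_imm(0, 0x12345678).hex())      # b878563412  (mov eax, 0x12345678)
print(encode_mov_imm(0, 0x1234, bits=16).hex()) # 66b83412    (mov ax, 0x1234)
```

The opcode byte is the same in both cases; only the prefix tells the processor which register width and immediate size to expect.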

32-bit GPRs (in order): eax, ecx, edx, ebx, esp, ebp, esi, edi

64-bit GPRs: rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi, r8, r9, r10, r11, r12, r13, r14, r15

Notice how few general purpose registers there are; this forces most software to use them in a more or less stack-machine-like manner. A painful detail. rsp is used for the stack (the pop and push instructions), and rbp tends to be reserved as well. x86_64 has more registers, but it will take time for people to adopt it, even if every single consumer had a processor capable of it.

There are two different instruction sets for floating point arithmetic, the one using the XMM registers (SSE) being the newer. In x86_64 there are 16 128-bit XMM registers available, and in x86 there are only 8 of them. The older instruction set (x87) handles its registers as a stack. Apart from fxch, which swaps the top of the stack with another slot, you don't get the usual stack-shuffling operations such as rot, so working with it is mind-bending.

In use, x86 tends to reduce to a RISC machine. Some of those complex instructions give no benefit or are even slower on newer machines. You will get by with understanding about 30-150 instructions, depending on what you are reading or writing. You can also completely ignore some old instructions and the AL/AH stuff. Keep in mind that all this clutter dates back to 1978; it is quite surprising it's not worse, 31 years after that and 24 years after the first introduction of IA-32. Lots of things change their relevance in that time.

Direct jumps and calls in x86 are encoded relative to the address of the next instruction. Therefore:

    jmp nowhere  # or call, jz, jg whatever...
nowhere:
    nop

This ends up encoded as 'JMP imm:0, NOP'. It is the register-indirect jmp that does absolute jumps. It's also worth noting that there are no register-indirect conditional jumps, which bothered me too.
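The displacement arithmetic can be sketched like this. It is a Python illustration, not an assembler, and it only handles the 2-byte short form of jmp (opcode EB followed by an 8-bit signed displacement); encode_jmp_short and the addresses are invented for the example.

```python
def encode_jmp_short(jmp_addr, target_addr):
    """Encode a short `jmp` (opcode EB, rel8) located at jmp_addr.

    The displacement is measured from the address of the NEXT instruction,
    i.e. jmp_addr plus the 2-byte length of the jump itself.
    """
    length = 2
    rel = target_addr - (jmp_addr + length)
    if not -128 <= rel <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB, rel & 0xFF])

# jmp nowhere / nowhere: nop, with the jmp placed at address 0x100.
# The label is the very next instruction, so the displacement is 0.
code = encode_jmp_short(0x100, 0x102) + b"\x90"   # 0x90 is nop
print(code.hex())                                  # eb0090

# A jump to itself needs displacement -2, stored as the byte 0xfe.
print(encode_jmp_short(0x100, 0x100).hex())        # ebfe
```

Because only the distance is stored, the same bytes work wherever the code is loaded, as long as the jump and its target move together.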

This isn't everything you should know, just the first things that came to mind from your question. But perhaps you can get along with these for now.

Cheery