The GCC toolchain uses AT&T assembler syntax by default, but support for Intel syntax is available via the .intel_syntax directive.

Additionally, both AT&T and Intel syntax are available in a prefix and a noprefix version, which differ in whether register names must be prefixed with a % sigil.

Depending on which directives are present, the format for address constants changes.

Let's consider the following C code

*(int *)0xdeadbeef = 0x1234;

Using objdump -d, we find that it's compiled to the following assembler instruction

movl $0x1234,0xdeadbeef

As there are no registers involved, this is the correct syntax for both .att_syntax prefix and .att_syntax noprefix, i.e. embedded in C code, they look like this

__asm__(".att_syntax prefix");
__asm__("movl $0x1234,0xdeadbeef");

__asm__(".att_syntax noprefix");
__asm__("movl $0x1234,0xdeadbeef");

You can optionally surround the address constant with parentheses, i.e.

__asm__("movl $0x1234,(0xdeadbeef)");

will work as well.

When adding a $ sigil to a plain address constant, the code will fail to compile

__asm__("movl $0x1234,$0xdeadbeef"); // won't compile

When surrounding this expression with parentheses, the toolchain will emit wrong code without warning, i.e.

__asm__("movl $0x1234,($0xdeadbeef)"); // doesn't warn, but doesn't work!

This will incorrectly emit the instruction

movl $0x1234,0x0
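The AT&T rules above ($ marks an immediate, a memory operand carries no sigil) can be tried without writing to an unmapped address such as 0xdeadbeef. The following is only a sketch under a few assumptions: GCC's default AT&T dialect (no -masm=intel), an x86 target, and a hypothetical helper name store_att that is not part of the original question. It lets GCC substitute a real variable through the "m" constraint; on other targets it falls back to a plain C store.

```c
/* Hypothetical helper (not from the original post): stores 0x1234 into *dst. */
void store_att(int *dst)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* AT&T syntax: "$0x1234" is an immediate value.
       %0 is replaced by GCC with a memory operand for *dst,
       written without any $ sigil, just like the bare address constant. */
    __asm__ volatile("movl $0x1234, %0" : "=m"(*dst));
#else
    /* Fallback so the sketch still builds on non-x86 targets. */
    *dst = 0x1234;
#endif
}
```

Calling store_att(&value) on a local int should leave 0x1234 in it, which makes it easy to compare the generated instruction with objdump -d.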

In Intel mode, an address constant has to be prefixed with a segment register as well as the operand size and the PTR flag if ambiguity is possible. On my machine (an Intel dual core laptop with Windows XP and current MinGW and Cygwin GCC versions), the register ds is used by default.

Square brackets around the constant are optional. The address constant is also correctly recognized if the segment register is omitted, but the brackets are present. Omitting the register emits a warning on my system, though.

In prefix mode, the segment register has to be prefixed with %, but only using brackets will still work. These are the different ways to generate the correct instruction:

__asm__(".intel_syntax noprefix");
__asm__("mov DWORD PTR ds:0xdeadbeef,0x1234");
__asm__("mov DWORD PTR ds:[0xdeadbeef],0x1234");
__asm__("mov DWORD PTR [0xdeadbeef],0x1234"); // works, but warns!

__asm__(".intel_syntax prefix");
__asm__("mov DWORD PTR %ds:0xdeadbeef,0x1234");
__asm__("mov DWORD PTR %ds:[0xdeadbeef],0x1234");
__asm__("mov DWORD PTR [0xdeadbeef],0x1234"); // works, but warns!

Omitting both segment register and brackets will fail to compile

__asm__("mov DWORD PTR 0xdeadbeef,0x1234"); // won't compile

I'll mark this question as community wiki, so if you have anything useful to add, feel free to do so.

+3  A: 

The noprefix/prefix directives only control whether registers require a % prefix(*) (at least it seems so and that's the only difference the documentation mentions). Value literals always need a $ prefix in AT&T syntax and never in Intel syntax. So the following works:

__asm__(".intel_syntax prefix");
__asm__("MOV [DWORD PTR 0xDEADBEEF], 0x1234");

If you are really inclined to use Intel syntax inline assembly within C code compiled with GCC and assembled with GAS, do not forget to also add the following after it, so that the assembler can grok the rest of the (AT&T syntax) assembly generated by GCC:

__asm__(".att_syntax prefix");
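To make the switch-and-restore pattern concrete, here is a minimal sketch that keeps both directives inside a single asm statement, so the syntax is restored before GCC emits any further AT&T code. It rests on assumptions: GCC with GAS on x86, the default AT&T dialect (no -masm=intel), and a hypothetical helper name store_intel. The prefix variant is used because GCC then substitutes the register operand with a leading %, which noprefix mode would reject. Clang's integrated assembler is excluded because, as far as I know, it does not accept .intel_syntax prefix.

```c
/* Hypothetical helper (not from the original post): stores 0x1234 into *dst. */
void store_intel(int *dst)
{
#if defined(__GNUC__) && !defined(__clang__) && \
    (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile(
        ".intel_syntax prefix\n\t"          /* switch GAS to Intel syntax */
        "mov DWORD PTR [%0], 0x1234\n\t"    /* %0 becomes e.g. %rax or %eax */
        ".att_syntax prefix"                /* restore the syntax GCC expects */
        :
        : "r"(dst)
        : "memory");
#else
    /* Fallback for other compilers and non-x86 targets. */
    *dst = 0x1234;
#endif
}
```

Keeping the restore directive in the same statement avoids relying on the order in which separate asm statements are emitted.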

The reasoning I see for the prefix/noprefix distinction is that for AT&T syntax, the % prefix is not really needed for registers on Intel architecture, because registers are named. But for uniformity it can be there, because some other architectures (e.g. SPARC) have numbered registers, in which case specifying a low number alone would be ambiguous as to whether a memory address or a register was meant.

Tom Alsberg
A: 

Here are my own results:

*(int *)0xdeadbeaf = 0x1234; // reference implementation

// AT&T: addresses without sigil; parentheses are optional

__asm__(".att_syntax prefix");
__asm__("movl $0x1234,0xdeadbeaf");     // works
__asm__("movl $0x1234,(0xdeadbeaf)");   // works
__asm__("movl $0x1234,($0xdeadbeaf)");  // doesn't work, doesn't warn!
//__asm__("movl $0x1234,$0xdeadbeaf");  // doesn't compile
//__asm__("movl 0x1234,0xdeadbeaf");    // doesn't compile
//__asm__("movl 0x1234,(0xdeadbeaf)");  // doesn't compile

__asm__(".att_syntax noprefix");
// same as above: no registers used!

// Intel: addresses with square brackets or segment register prefix
// brackets without prefix will warn

__asm__(".intel_syntax noprefix");
__asm__("mov DWORD PTR ds:0xdeadbeaf,0x1234");      // works
__asm__("mov DWORD PTR ds:[0xdeadbeaf],0x1234");    // works
__asm__("mov DWORD PTR [0xdeadbeaf],0x1234");       // works, but warns!
//__asm__("mov DWORD PTR 0xdeadbeaf,0x1234");       // doesn't compile

// `prefix` will add % to register names

__asm__(".intel_syntax prefix");
__asm__("mov DWORD PTR %ds:0xdeadbeaf,0x1234");     // works
__asm__("mov DWORD PTR %ds:[0xdeadbeaf],0x1234");   // works
__asm__("mov DWORD PTR [0xdeadbeaf],0x1234");       // works, but warns!
//__asm__("mov DWORD PTR 0xdeadbeaf,0x1234");       // doesn't compile

__asm__(".att_syntax prefix");
Christoph
As I noted in my answer, I do not think that noprefix and prefix are any different in this regard. The docs only say it affects % in front of register names. So the same block under noprefix should work under prefix too.
Tom Alsberg
@Tom: unfortunately, that's not the case - the `noprefix` version will emit an "Error: Unexpected token `:'" when used with `prefix`
Christoph
@Tom: the version which emits a warning will work, though...
Christoph
Is that not because the other two noprefix examples include a register (ds), so when using it with prefix you need to add the % to that?
Tom Alsberg
@Tom: nice - so the syntax actually makes sense! Do you want to summarize or shall I update the question to include these results?
Christoph
You summarized it better than I would. Cannot try right now, but I might just edit my answer when I get to try it all out and have a generalization or comparison with what MASM/TASM traditionally accept in Intel syntax. Kudos on the detailed edits to your question into such a good explanation!
Tom Alsberg