I have a piece of C code that calls a function defined in assembly. By way of example, let's say foo.c contains:

int bar(int x);  /* returns 2x */
int main(int argc, char *argv[]) { return bar(7); }

And bar.s contains the implementation of bar() in x86 assembly:

.global bar
bar:    movl 4(%esp), %eax
        addl %eax, %eax
        ret

On Linux I can easily compile and link these sources with GCC as follows:

% gcc -o test foo.c bar.s
% ./test; echo $?
14

On Windows with MinGW, linking fails with the error "undefined reference to `bar'". The cause is that on Windows the symbol names of functions using the C calling convention are prefixed with an underscore, but since "bar" is defined in assembly, it doesn't get this prefix. (So the error message is actually complaining about the missing symbol _bar, not bar.)

To summarize:

% gcc -c foo.c bar.s
% nm foo.o bar.o
foo.o:
00000000 b .bss
00000000 d .data
00000000 t .text
         U ___main
         U _bar
00000000 T _main

bar.o:
00000000 b .bss
00000000 d .data
00000000 t .text
00000000 T bar

The question now is: how can I resolve this nicely? If I were writing for Windows only, I could just add the underscore to the identifier in bar.s, but then the code breaks on Linux. I have looked at gcc's -fleading-underscore and -fno-leading-underscore options but neither appears to do anything (at least on Windows).

The only alternative I see now is passing the assembly file through the C preprocessor and redefining all the declared symbols manually if WIN32 is defined, but that's not very pretty either.
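For illustration, here is a rough sketch of what that preprocessor approach might look like. Instead of testing WIN32 by hand, it leans on `__USER_LABEL_PREFIX__`, a macro GCC predefines to whatever prefix the target puts in front of C symbols. It is not mentioned anywhere in this thread. The helper names `STR`, `CONCAT`, `C_SYMBOL` and `bar_symbol_name` are made up for this example. The C function only exists to make the resulting name visible; in an assembly file run through the preprocessor (for instance one named bar.S), the same `C_SYMBOL(bar)` macro could presumably be used for the `.global` line and the label.

```c
/* Two-level helpers so that macro arguments are fully expanded
   before the # and ## operators are applied to them. */
#define STR_(x) #x
#define STR(x) STR_(x)
#define CONCAT_(a, b) a##b
#define CONCAT(a, b) CONCAT_(a, b)

/* __USER_LABEL_PREFIX__ is predefined by GCC: it expands to nothing on
   Linux/ELF targets and to _ on targets that decorate C symbols.
   C_SYMBOL is a hypothetical helper, not part of any library. */
#define C_SYMBOL(name) CONCAT(__USER_LABEL_PREFIX__, name)

/* Returns the linker-level name the compiler would use for bar(). */
const char *bar_symbol_name(void) { return STR(C_SYMBOL(bar)); }
```

Calling `bar_symbol_name()` should give the undecorated name on Linux and the underscore-prefixed one on 32-bit MinGW, matching the nm output above.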

Does anyone have a clean solution for this? Perhaps a compiler option I overlooked? Maybe the GNU assembler supports a way to specify that this particular symbol refers to a function using the C calling convention and should be mangled as such? Any other ideas?

+1  A: 

Can you declare it twice?

.global bar
.global _bar

I haven't written assembly in a while, but doesn't the .global directive just act sort of like a label?

Carson Myers
Yep, that works too.
ephemient
The .global directive only specifies that this identifier refers to a global symbol, so that could be made to work if I also define two labels for the function, e.g.: .global bar / .global _bar / bar: / _bar: / <etc>. Aside from the duplication, I also get a useless "bar" symbol on Windows, and a useless "_bar" symbol on Linux. I was hoping for something cleaner, but this does work, so I thank you for the suggestion.
+2  A: 

One option, though dangerous, is to convince GCC to omit the ABI-required leading underscore.

  • -fleading-underscore

    This option and its counterpart, -fno-leading-underscore, forcibly change the way C symbols are represented in the object file. One use is to help link with legacy assembly code.

    Warning: the -fleading-underscore switch causes GCC to generate code that is not binary compatible with code generated without that switch. Use it to conform to a non-default application binary interface. Not all targets provide complete support for this switch.

Another, safer option is to explicitly tell GCC the name to use.

5.39 Controlling Names Used in Assembler Code

You can specify the name to be used in the assembler code for a C function or variable by writing the asm (or __asm__) keyword after the declarator as follows:

     int foo asm ("myfoo") = 2;

This specifies that the name to be used for the variable foo in the assembler code should be `myfoo' rather than the usual `_foo'.

On systems where an underscore is normally prepended to the name of a C function or variable, this feature allows you to define names for the linker that do not start with an underscore.

It does not make sense to use this feature with a non-static local variable since such variables do not have assembler names. If you are trying to put the variable in a particular register, see Explicit Reg Vars. GCC presently accepts such code with a warning, but will probably be changed to issue an error, rather than a warning, in the future.

You cannot use asm in this way in a function definition; but you can get the same effect by writing a declaration for the function before its definition and putting asm there, like this:

 extern func () asm ("FUNC");

 func (x, y)
      int x, y;
 /* ... */

It is up to you to make sure that the assembler names you choose do not conflict with any other assembler symbols. Also, you must not use a register name; that would produce completely invalid assembler code. GCC does not as yet have the ability to store static variables in registers. Perhaps that will be added.

In your case,

extern int bar(int x) asm("bar");

should tell GCC that "bar uses asm name `bar', even though it's a ccall function".
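As a minimal sketch of how that declaration could be used, the snippet below pins the symbol name with `__asm__` (the spelling of `asm` that also works under strict `-std=` modes). The C definition of `bar` is only a stand-in for bar.s so that the example links on its own, and `call_bar` is a hypothetical wrapper that mirrors what `main` in foo.c does. In the real project the C definition would be dropped and bar.s linked in instead.

```c
/* Bind the C identifier bar to the exact linker symbol "bar",
   bypassing any platform-specific underscore prefix. */
extern int bar(int x) __asm__("bar");

/* Stand-in for the assembly implementation in bar.s (an assumption
   made only so this sketch is self-contained): returns 2x. */
int bar(int x) { return x + x; }

/* Hypothetical wrapper doing what main() in foo.c does with bar(7). */
int call_bar(void) { return bar(7); }
```

Because the asm name is given on a declaration that precedes the definition, the same symbol name is used both for the definition and for every call site.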

ephemient
Does this mean that the default behavior of GCC on Linux should be that the "C" names have a leading underscore? If so, there's something in the OP's build environment that's turning it off?
Michael Burr
On Linux, functions following the standard C calling convention (ccall, cdecl, whatever you want to call it) are not decorated. On Windows, stdcall is the "default" calling convention, and functions following anything else (like the standard C calling convention) are decorated.
ephemient
As I said, the -fleading-underscore and -fno-leading-underscore options didn't seem to do anything (they neither remove the underscores in the C functions nor add them for the assembly symbols); if you Google around for a bit you'll see that others had the same experience, so I get the impression these options are pretty useless. The asm() suggestion is a good one; I may well end up using that. The only downside is that the symbol itself still doesn't have the proper name (which would be _bar on Windows), but at least I can link on both platforms without further source code modifications.