views:

128

answers:

3

Is a program shipped in assembler format portable between Linux distributions (modulo CPU architecture differences)?

Here's the background to my question: I'm working on a new programming language (named Aklo), whose modus operandi will be the classic compiling to .s and feeding the result to the GNU assembler.

Obviously it would be nice ultimately to have the implementation written in itself, but I had resigned myself to maintaining it in C++ to solve the chicken and egg problem: suppose you download the compiler for the first time and it is itself written in Aklo, how do you compile it? As I understand it, different Linux distributions and other UNIX like systems have different conventions for binary formats.

But it's just occurred to me, a solution might be to ship the .s file (well, one per CPU architecture): it's fair to assume you have or can install the GNU assembler. Of course I'd still need a bootstrap compiler, but that doesn't need to be fast; I can write it in Python.

Is assembler portable in the way that binaries are not? Are there any other stumbling blocks I haven't thought of?

Added in response to one answer:

I had looked wistfully at LLVM, there is certainly a lot of good stuff there and it would make my life easier -- except that it would incur a dependency on the correct version of LLVM being installed. It wouldn't be so bad having that dependency on development machines, but in a world where it's common to ship programs as source, the same dependency would be incurred for every user of every program ever written in Aklo, and I decided that was too high a price to pay.

But if the solution of shipping compiled programs as assembler works... then that solves that problem, and I can use LLVM after all, which would be a big win.

So the question about portability of assembler is even considerably more important than I had first realized.

Conclusion: from answers here and on the LLVM mailing list http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-January/028991.html it seems the bad news is the problem is unsolvable, but the good news is that means using LLVM makes it no worse, so I'm free to do so and obtain all the advantages thereof.

+4  A: 

You might want to check out LLVM before going down this particular path. It might make your life a lot easier, as it provides a low level virtual machine that makes a lot of hard stuff just work and has been very popular.

corprew
LLVM is the most serious VM out there, but may not be the best for a given application. It's worth looking at the competition, e.g., Parrot VM.
Charles Stewart
+1  A: 

It's basically 20 years since I last bootstrapped a C compiler. At the level of compilers, the differences between Linux distributions are minimal.

The much more important reason for going LLVM is cross-platform; if you're not writing some intermediate language, your compiler will be extremely difficult to retarget for different processors. And seeing as, on my laptop, I have compilers for x86, x86_64, two kinds of MIPS, PowerPC, ARM and AVR... you see where I'm going? I can compile multiple languages for most of those targets too (only C for AVR).

Andrew McGregor
+2  A: 

At a very high level, the ABI consists of { instruction set, system calls, binary format, libraries }.

Distribution as .s may free you from the binary format. This is still rather pointless, because you are fixed to a particular ISA and still need to use libraries and/or make system calls. Libraries vary from distribution to distribution (although this isn't really that bad, especially if you just use libc) and syscalls vary from OS to OS.

ephemient