I'm hoping to learn assembly language for x86. I'm on a Mac, and I'm assuming most x86 tutorials/books use code that's meant for Windows.

How does the OS that code is run on affect what the code does, or determine whether the code even works? Could I follow a Windows-based tutorial, and modify a few commands to make it work for Mac with relative ease? More generally, is there anything tricky that a Mac assembly programmer, specifically, should know? Thanks!

+7  A: 

Generally speaking, as long as you use the same assembler, and the same architecture (for example, NASM, and x86-64), you should be able to assemble assembly on both Windows and Mac.

However, it is important to keep in mind that the executable formats and the execution environments may differ. As an example, Windows might emulate/handle certain privileged instructions differently to Mac, causing different behavior.

Matthew Iselin
+1  A: 

Also a big part of the difference is in how the program communicates with the outside world.

For example, if you want to display a message to the user, read a file, or allocate more memory, you have to ask the OS to do it by making some kind of system call. That will be quite different between OSes.

The language syntax itself should be basically identical as long as you're using the same assembler. Different assemblers sometimes have slightly different ordering on syntax or different macros but nothing that's too hard to get used to.

Colin Coghill
+3  A: 
Norman Ramsey
+1  A: 

When I dipped into Assembly during one of my programming tourist visits, the gotcha that held me up in every tutorial was not being able to compile in the correct binary format. Most tutorials give elf (for Linux) and aoutb (for BSD), yet with the latter (logical choice?) OS X complains:

ld: hello.o bad magic number (not a Mach-O file)

Yet Mach-O is not accepted as a format name either, and man nasm lists only the bin, aout and elf file formats (man ld is no more helpful). The option that produces the Mach-O format for OS X is macho:

nasm -f macho hello.asm

I wrote up the journey here (includes a link to a nice TextMate bundle for Assembly and other info), but - to be brief - the above is what you need to get started.

Dave Everitt
+6  A: 

(Of course, all of the following applies only to x86 and x86-64 assembly language, for IA-32 and AMD64 processors and operating systems.)

The other answers currently visible are all correct, but, in my opinion, miss the point. AT&T versus Intel syntax is a complete non-issue; any decent tool will work with both syntaxes or have a counterpart or replacement that does. And they assemble the same anyway. (Protip: you really want to use Intel syntax. All the official processor documentation does. AT&T syntax is just one giant headache.) Yes, finding the right flags to pass to the assembler and linker can be tricky, but you'll know when you've got it and you only have to do it once per OS (if you remember to write it down somewhere!).

Assembly instructions themselves, of course, are completely OS-agnostic. The CPU does not care what operating system it's running. Unless you're doing extremely low-level hackery (that is, OS development), the nuts and bolts of how the OS and CPU interact are almost totally irrelevant.

The Outside World

The trouble with assembly language comes when you interact with the outside world: the OS kernel, and other userspace code. Userspace is trickiest: you have to get the ABI right or your assembly program is all but useless. This part is generally not portable between OSes unless you use trampolines/thunks (basically another layer of abstraction that has to be rewritten for every OS you intend to support).

The most important part of the ABI is the calling convention for C-style functions. These are the most widely supported, and they are what you will probably be interfacing with if you're writing assembly. Agner Fog maintains several good resources on his site; the detailed description of calling conventions is particularly useful. In his answer, Norman Ramsey mentions PIC and dynamic libraries; in my experience you usually do not have to bother with those if you do not want to. Static linking works fine for typical uses of assembly language (like rewriting core functions of an inner loop or other hotspot).

The calling convention works in two directions: you can call C from assembly or assembly from C. The latter tends to be a bit easier but there's not a big difference. Calling C from assembly lets you use things like the C standard library output functions, while calling assembly from C is typically how you access an assembly implementation of a single performance-critical function.
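As a rough illustration of the "assembly from C" direction, the sketch below defines a tiny routine in assembly and calls it through an ordinary C prototype. It assumes a GCC or Clang toolchain targeting x86-64 on Linux or OS X, where the System V AMD64 convention passes the first two integer arguments in EDI and ESI and returns the result in EAX. The names add_ints and SYM are made up for this example, and the assembly is in AT&T syntax only because that is the default for these compilers' embedded assembly. On any other target the sketch falls back to plain C.

```c
/* Sketch: an assembly routine callable from C (System V AMD64 ABI assumed).
 * add_ints and SYM are hypothetical names used only for this illustration. */

int add_ints(int a, int b);   /* the C-side view of the routine */

#if defined(__x86_64__) && (defined(__linux__) || defined(__APPLE__))

/* OS X prefixes C symbols with an underscore; Linux does not. */
#if defined(__APPLE__)
#define SYM(name) "_" name
#else
#define SYM(name) name
#endif

__asm__(
    ".text\n"
    ".globl " SYM("add_ints") "\n"
    SYM("add_ints") ":\n"
    "    movl %edi, %eax\n"   /* first argument arrives in EDI  */
    "    addl %esi, %eax\n"   /* second argument arrives in ESI */
    "    ret\n"               /* result is returned in EAX      */
);

#else

/* Fallback for other targets, so the example still builds. */
int add_ints(int a, int b) { return a + b; }

#endif
```

The same routine would not work unchanged under the Microsoft x64 convention, which passes those two arguments in RCX and RDX. Together with the symbol-name prefix handled by SYM, that is the kind of ABI detail that has to be redone per OS.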

System Calls

The other thing your program will do is make system calls. You can write a complete and useful assembly program that never calls external C functions, but if you want to write a pure assembly language program that doesn't outsource the Fun Stuff to someone else's code, you are going to need system calls. And, unfortunately, system calls are totally and completely different on every OS. Unix-style system calls you'll need include (but are most assuredly not limited to!) open, creat, read, write, and the all-important exit, along with mmap if you like allocating memory dynamically.

While every OS is different, most modern OSes follow a general pattern: you load the number of the system call you want into a register, typically EAX in 32-bit code, then load the parameters (how you do that varies wildly), and finally issue a software interrupt: INT 2Eh for Windows NT kernels or INT 80h for Linux 2.x and FreeBSD (and, I believe, OS X). The kernel then takes over, executes the system call, and returns execution to your program. Depending on the OS, it might trash registers or stack as part of the system call; you'll have to read the system call documentation for your platform to be sure.
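Here is a minimal sketch of that pattern, written as C with embedded assembly so it can be compiled and tried. Note that it swaps in the 64-bit Linux mechanism (the SYSCALL instruction, with the call number in RAX and the arguments in RDI, RSI and RDX) rather than the 32-bit INT 80h route described above. raw_write is a hypothetical name, and on any other platform the sketch simply calls the C library's write.

```c
#include <unistd.h>   /* write(), used only by the portable fallback */

/* Hypothetical helper: write len bytes from buf to file descriptor fd.
 * On x86-64 Linux it traps into the kernel directly; a failure comes back
 * as a negative errno value. Elsewhere it defers to write(), which
 * returns -1 on failure. */
long raw_write(int fd, const void *buf, unsigned long len)
{
#if defined(__linux__) && defined(__x86_64__)
    long ret;
    __asm__ volatile(
        "syscall"
        : "=a"(ret)              /* result comes back in RAX               */
        : "0"(1L),               /* RAX = 1, the x86-64 Linux write number */
          "D"((long)fd),         /* RDI = first argument                   */
          "S"(buf),              /* RSI = second argument                  */
          "d"(len)               /* RDX = third argument                   */
        : "rcx", "r11", "memory" /* the kernel overwrites RCX and R11      */
    );
    return ret;
#else
    return (long)write(fd, buf, len);
#endif
}
```

The clobber list is the "might trash registers" caveat made concrete: on this platform the kernel overwrites RCX and R11, so the compiler has to be told not to keep anything live in them across the call.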

SYSENTER

Linux 2.6 kernels (and, I believe, Windows XP and newer, though I have never actually attempted it on Windows) also support a newer, faster method to make a system call: the SYSENTER instruction, introduced by Intel with the Pentium II. AMD chips have SYSCALL, but few 32-bit OSes use it (though it's the standard for 64-bit, I think; I haven't had to make direct system calls from a 64-bit program so I'm not sure on this). SYSENTER is significantly more complicated to set up and use (see, for example, Linus Torvalds on implementing SYSENTER support for Linux 2.6: "I'm a disgusting pig, and proud of it to boot.") I can personally attest to its peculiarity; I once wrote an assembly function that issued SYSENTER directly to a Linux 2.6 kernel, and I still don't understand the various stack and register tricks that got it to work... but work it did!

SYSENTER is somewhat faster than issuing INT 80h, and so its use is desirable when available. To make it easier to write both fast and portable code, Linux maps a VDSO called linux-gate into the address space of every program; calling a special function in this VDSO will issue a system call by the fastest available mechanism. Unfortunately, using it is generally more trouble than it's worth: INT 80h is so much simpler to do in a small assembly routine that it's worth the small speed penalty. Unless you need ultimate performance... and if you need that, you probably don't want to call into a VDSO anyway, and you know your hardware, so you can just do the horribly unsafe thing and issue SYSENTER yourself.
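To see that mapping from inside a program, here is a small sketch in C. It assumes Linux with a C library that provides getauxval (glibc 2.16 or later, for instance); vdso_base is a hypothetical helper name. It only locates the VDSO image through the AT_SYSINFO_EHDR entry of the auxiliary vector and does not attempt to call into it.

```c
#include <stddef.h>
#if defined(__linux__)
#include <elf.h>        /* AT_SYSINFO_EHDR */
#include <sys/auxv.h>   /* getauxval()     */
#endif

/* Hypothetical helper: return the address at which the kernel mapped the
 * VDSO into this process, or NULL if there is none (or this is not Linux). */
const void *vdso_base(void)
{
#if defined(__linux__)
    /* The kernel passes the VDSO's ELF header address in the auxiliary vector. */
    return (const void *)getauxval(AT_SYSINFO_EHDR);
#else
    return NULL;
#endif
}
```

When the returned address is non-NULL, the bytes there begin with the ELF magic number (0x7F 'E' 'L' 'F'), because the VDSO is itself a small shared object that the kernel maps in.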

Everything Else

Other than the demands imposed by interacting with the kernel and other programs, there are very, very few differences between operating systems. Assembly exposes the soul of the machine: you can work as you like, and inside your own code you are not bound by any particular calling convention. You have free access to the FPU and SSE units; you can PREFETCH directly to stream data from memory into the L1 cache and make sure it's hot for when you need it; you can munge the stack at will; you can issue INT 3 if you want to interface with a (properly configured; good luck!) external debugger. None of these things depend on your OS. The only real restriction you have is that you are running at Ring 3, not Ring 0, and so some processor control registers will be unavailable to you. (But if you need those, you're writing OS code, not application code.) Other than that, the machine is laid bare to you: go forth and compute!

kquinn