The answer to this question is, of course, "write some assembly code!"

But I can do that: I know basic x86 syntax, the types of registers and how to use them, subroutines, etc etc.

When I took an assembly class, each assignment had a maximum number of instructions that we were allowed to use. If we went over, then we would lose points - so this taught us to do things efficiently. You know, efficiently in the way that, if you saw a C program doing such things, you might harp on that individual for creating difficult-to-follow code.

For C, one can suggest "write a multi-threaded program" or "write a socket server/client". For PHP, a suggestion might be "write personal blog software". To learn the syntax of a language, Project Euler is often suggested.

So my question(s) are:

  1. What kinds of things can I write in assembly that will let me hone my assembly and efficiency skills? For example, I could do a Project Euler problem in x86, but that doesn't help me with assembly-specific skills. What are project ideas I could do that can only really be coded in assembler?

  2. How would I go about profiling whether or not I'm writing "good" code? For example, with C and sockets, there are a billion tutorials on the web, and there are a billion people on my buddy list who can look at my code, and there are a billion SO questions about them: so it is easier to measure one's progress.

  3. Are there existing assembly-heavy projects which I can look at to learn from? Or even parts of programs where they break into assembly for [specific task] for the sake of efficiency?

  4. What kinds of books should I be reading? Blogs of people who routinely look at low-level things?

(preferably things I could do on x86 on linux. I figure the concepts will carry over to other architectures.)

+2  A: 

What to write:

Write a simple 3D texture mapped renderer. That's a decent project because it's not super complex but not super trivial. And the fact that you can see the results working is always nice.

Aaron
+1 - graphics is always nice for the eyes and still a niche where you can make a noticeable difference with assembler. Getting texture mapping done isn't trivial either. Doable of course, but still quite a task.
Nils Pipenbrinck
+3  A: 

One approach might be to pick functions from the standard C library (e.g. string functions, mem*()) to implement from scratch in assembly. Build a benchmarking harness to measure the performance of your code, and see if you can equal or better the performance of the libraries provided on your system. I wouldn't consider it cheating to disassemble the system library code (after you've taken a whack at it); useful techniques can often be found by inspecting disassembled code. Reading other folks' code is highly recommended (perhaps starting with the bits of assembly code found in the Linux kernel sources).
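
A minimal sketch of such a harness, assuming your routine is built into a shared library and called through ctypes. The names `best_time` and `load_strlen`, and the `./libmystr.so` path mentioned in the comments, are made up for illustration. ctypes adds a fixed overhead to every call, so this only tells you something useful on large buffers; a harness written in C would give tighter numbers.

```python
import ctypes
import ctypes.util
import time


def best_time(fn, arg, repeats=5, calls=1000):
    """Return the best observed per-call time in seconds.

    Taking the minimum over several repeats reduces noise from
    scheduling and cache warm-up.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(calls):
            fn(arg)
        elapsed = (time.perf_counter() - start) / calls
        best = min(best, elapsed)
    return best


def load_strlen(path):
    """Load a strlen-compatible symbol from the shared library at `path`."""
    lib = ctypes.CDLL(path)
    fn = lib.strlen
    fn.argtypes = [ctypes.c_char_p]
    fn.restype = ctypes.c_size_t
    return fn


if __name__ == "__main__":
    buf = b"x" * (1 << 20)  # 1 MiB, so the call overhead is amortized
    libc_strlen = load_strlen(ctypes.util.find_library("c"))
    assert libc_strlen(buf) == len(buf)  # check correctness before timing
    print("libc strlen: %.1f us per call" % (best_time(libc_strlen, buf) * 1e6))
    # To compare your own routine, build it into a shared library and load it
    # the same way, e.g. load_strlen("./libmystr.so") (hypothetical path).
```

Check that your version returns the same results as the library version before you compare timings, as the assert above does for libc.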

Lance Richardson
+6  A: 

Good answers. I would also suggest writing a small compiler and having it write the assembly language for you. That way, you get to think about different ways to do things in assembler, like passing arguments, making stack frames, composing address expressions, array indexing, managing memory, conditionals, loops, try-catch, etc. etc.
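
A minimal sketch of the idea, assuming a toy language of integer literals combined with +, - and *, and a naive stack-machine code generator. The function name `compile_expr`, the output label `expr` and the choice of NASM-style x86-64 syntax are assumptions for illustration only. It borrows Python's own `ast` module as the parser so that only the code generation part is shown.

```python
import ast

# Map Python AST operator nodes to x86-64 instructions (NASM/Intel syntax).
_OPS = {
    ast.Add: "add rax, rcx",
    ast.Sub: "sub rax, rcx",
    ast.Mult: "imul rax, rcx",
}


def _emit(node, out):
    """Post-order walk: operands get pushed, operators pop two and push one."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        out.append(f"    mov rax, {node.value}")
        out.append("    push rax")
    elif isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        _emit(node.left, out)
        _emit(node.right, out)
        out.append("    pop rcx")  # right operand
        out.append("    pop rax")  # left operand
        out.append(f"    {_OPS[type(node.op)]}")
        out.append("    push rax")
    else:
        raise ValueError(f"unsupported expression: {ast.dump(node)}")


def compile_expr(source):
    """Return assembly text for a routine `expr` that leaves the result in rax."""
    tree = ast.parse(source, mode="eval")
    out = ["global expr", "section .text", "expr:"]
    _emit(tree.body, out)
    out += ["    pop rax", "    ret"]
    return "\n".join(out)


if __name__ == "__main__":
    print(compile_expr("2 + 3 * 4"))
```

The generated code is deliberately wasteful (every intermediate value goes through the stack). Reading that output and asking how to keep values in registers instead is where the learning happens.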

Mike Dunlavey
This seems like the best answer to me. Let's face it: there are only 3 reasons to program in assembly these days. 1) you can't obtain a compiler for your platform. 2) you need to write a compiler. 3) you need to optimize code that a compiler just isn't getting. This is the only suggestion that can help with 2 and 3, and there really isn't anything that can help with 1, and it's fairly uncommon.
San Jacinto
okay, just thinking that maybe 1 is more common than i gave it credit for... but the goal usually isn't to leave such specialized hardware without proper tools. usually not using a compiler at all here means that one is being invented as we speak.
San Jacinto
@San: Thx. I used to do a lot of assembler for various platforms, but the only compiler around was Fortran. Besides, when it came to doing graphics, especially on an 8088, you need to get really good at counting cycles.
Mike Dunlavey
@San: 4) You need access to instructions which no high or medium level language exposes. For example memory fetches with simulated MMU access, access to arithmetic flags, access to FIQ/IRQ flags, access to the co-processor interface, cache interface, etc etc. Most of them are needed when writing firmware or device drivers.
Mads Elvheim
@Mads point taken. I had lumped situations like that under 1 for no good reason.
San Jacinto
A: 

If jumping in with both feet suits you, consider improving the inner loops of a distributed mass computing BOINC project—like SETI@Home. (Other projects here.)

On my computers, each SETI@Home work unit needs hours to crunch, and is almost perfectly CPU bound. Typically, C/C++ compilers aren't excellent at arranging parallel floating point and integer operations, particularly for each CPU type. It would be especially useful to tune the x86 instructions for the particular capabilities of the CPU the code actually runs on: SSE, SSE2, 80586, 80686, Athlon, etc. Those still running ten-plus-year-old hardware would appreciate such optimization, and modern hardware will no doubt benefit to a great degree as well.

wallyk
+1  A: 

First, how does one define "good assembly code"? The fastest code, the most ABI compliant code, or the code which is easiest to read? I think "good" depends on the overall goal, and you didn't really tell us what you want to use assembly for.

Others recommended writing a software rasterizer. I could latch onto that, but since you already know the x86 mnemonics, you don't really need to write more assembly code. What you need is more insight in how machines work under the hood.

My suggestion is to write a system-wide or user-space emulator. I wrote a system-emulator for ARM920, and learned a ton, without writing a single assembly mnemonic! Okay, it ended up dead slow, but I wrote it as an interpreter in pure C. I now know most of the ARM architecture's dark secrets and it gave me a new perspective on how embedded computers work.

Just remember, peripherals can be complex to emulate. There is nothing wrong with emulating the CPU but adding only simplified pseudo-peripherals. If you're good, you could even make a plug&play system for them.
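
A minimal sketch of the interpreter approach, assuming a made-up five-instruction accumulator machine. This is not ARM920 and not how gp2xemu is structured; the opcodes, the `ToyCPU` class and the `CONSOLE` port address are all invented for illustration. It shows the fetch/decode/execute loop plus one simplified pseudo-peripheral: a memory-mapped output port.

```python
# Hypothetical toy ISA: each instruction is an (opcode, operand) pair.
LOADI, ADD, STORE, JNZ, HALT = range(5)
CONSOLE = 0xFF  # assumed address of a memory-mapped output port


class ToyCPU:
    def __init__(self, program):
        self.program = program
        self.pc = 0            # program counter
        self.acc = 0           # 8-bit accumulator
        self.mem = [0] * 256   # data memory
        self.console = []      # values written to the pseudo-peripheral

    def write(self, addr, value):
        """Route a store either to the peripheral or to plain memory."""
        if addr == CONSOLE:
            self.console.append(value)
        else:
            self.mem[addr] = value

    def step(self):
        """Fetch, decode and execute one instruction; False means halted."""
        op, arg = self.program[self.pc]
        self.pc += 1
        if op == LOADI:
            self.acc = arg & 0xFF
        elif op == ADD:
            self.acc = (self.acc + arg) & 0xFF  # wraps like an 8-bit register
        elif op == STORE:
            self.write(arg, self.acc)
        elif op == JNZ:
            if self.acc != 0:
                self.pc = arg
        elif op == HALT:
            return False
        else:
            raise ValueError(f"unknown opcode {op}")
        return True

    def run(self, max_steps=10_000):
        for _ in range(max_steps):
            if not self.step():
                return
        raise RuntimeError("step limit reached")


# Count down from 3, writing each value to the console port.
COUNTDOWN = [
    (LOADI, 3),
    (STORE, CONSOLE),
    (ADD, 0xFF),       # adding 255 modulo 256 subtracts 1
    (JNZ, 1),
    (HALT, 0),
]

if __name__ == "__main__":
    cpu = ToyCPU(COUNTDOWN)
    cpu.run()
    print(cpu.console)
```

A real emulator has the same shape, only with a much larger decode step and with peripherals that have their own state and timing.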

You might want to check out the QEMU and DosBox sources to get some good ideas, though they use a JIT scheme. My interpreter is found here gp2xemu. It was an attempt at an emulator for the GP2x, but I got stuck due to sucky documentation.

Mads Elvheim
+3  A: 

Assembly can do many things that C cannot, and the optimizer is not magic. That said, most useful things that don't require you to be an assembly semi-deity fall into the compiler, standard libraries and interpreter runtime categories.

Trampolines, for example, are or might be useful in all three of them, and you just cannot use C to compose an arbitrary call stack.

To write better assembly, read the manuals here:

http://www.agner.org/optimize/

To see programs written exclusively in assembly and a community obsessed with optimization to benchmark you:

http://flatassembler.net/

jbcreix
I was not aware of either of those resources; thank you!
rascher