I am a firm believer in the idea that one of the most important things you get from learning a new language is not how to use a new language, but the knowledge of concepts that you get from it. I am not asking how important or useful you think Assembly is, nor do I care if I never use it in any of my real projects.

What I want to know is what concepts of Assembly do you think are most important for any general programmer to know? It doesn't have to be directly related to Assembly - it can also be something that you feel the typical programmer who spends all their time in higher-level languages does not understand or takes for granted, such as the CPU cache.

+2  A: 

Memory, registers, jumps, loops, shifts and the various operations one can perform in assembler. I don't miss the days of debugging my assembly language class programs - they were painful! - but it certainly gave me a good foundation.

We forget (or never knew, perhaps) that all this fancy-pants stuff that we use today (and that I love!) boils down to all this stuff in the end.

Now, we can certainly have a productive and lucrative career without knowing assembler, but I think these concepts are good to know.

itsmatt
A: 

Nowadays, asm is not a direct line to the guts of the CPU, but more of an API. The assembler opcodes you write are themselves compiled into a completely different instruction set, rearranged, rewritten, fixed up and generally mangled beyond recognition.

So it's not like learning assembler gives you a fundamental insight into what's going on inside the CPU. IMHO, more important than learning assembler is to get a good understanding of how the target CPU and the memory hierarchy work.

This series of articles covers the latter topic pretty thoroughly.

I'm not sure what you're talking about here--assembly is most definitely a direct line to the guts of the CPU. Every single line I write and send into nasm/yasm is directly translated into an opcode--sure there's macros and other useful features, but pabsw xmm0, xmm0 can only mean one thing.
Dark Shikari
x86 opcodes get turned into uops (often several for a single opcode) by the decoder. The x86 registers are renamed to hardware registers (of which there are many many more). The uops are reordered, sometimes glued together again. Go read up on it, it's a whole other world in there...
When Intel and AMD created x86-64, they made the processor more RISC-like, and relied on internal instruction translation for backward compatibility with the CISC instruction set.
Max Lybbert
+4  A: 

It's good to know assembly language in order to gain a better appreciation for how the computer works "under the hood," and it helps when you are debugging something and all the debugger can give you is an assembly code listing, which at least gives you a fighting chance of figuring out what the problem might be. However, trying to apply low-level knowledge to high-level programming languages, such as trying to take advantage of how the CPU caches instructions and then writing wonky high-level code to force the compiler to produce super-efficient machine code, is probably a sign that you are trying to micro-optimize. In most cases, it's better not to try to outsmart the compiler unless you need the performance gain, in which case you might as well write those bits in assembly anyway.

So, it's good to know assembly for the sake of better understanding of how things work, but the knowledge gained is not necessarily directly applicable to how you write code in high-level languages. On that note, however, I found that learning how function calls work at the assembly-code level (learning about the stack and related registers, learning about how parameters are passed on the stack, learning how automatic storage works, etc.) made it a lot easier to understand problems I had in higher-level code, such as "out of stack space" errors and "invalid calling convention" errors.

Mike Spross
A: 

I would say that addressing modes are extremely important.

My alma mater took that to an extreme, and because x86 didn't have enough of them, we studied everything on a PDP-11 simulator, which must have had at least 7 of them that I remember. In retrospect, that was a good choice.

Uri
+1  A: 

The most important concept is SIMD, and creative use of it. Proper use of SIMD can give enormous performance benefits in a massive variety of applications, from string processing to video manipulation to matrix math. This is where you can get over 10x performance boosts over pure C code--this is why assembly is still useful beyond mere debugging.

Some examples from the project I work on (all numbers are clock cycle counts on a Core 2):

Inverse 8x8 H.264 DCT (frequency transform):

c: 1332
mmx: 187
sse2: 127

8x8 Chroma motion compensation (bilinear interpolation filter):

c: 639
mmx: 144
sse2: 110
ssse3: 79

4 16x16 Sum of Absolute Difference operations (motion search):

c: 3948
mmx: 278
sse2: 231
ssse3: 215

(yes, that's right--over 18x faster than C!)

Mean squared error of a 16x16 block:

c: 1013
mmx: 193
sse2: 131

Variance of a 16x16 block:

c: 783
mmx: 171
sse2: 106
Dark Shikari
+1  A: 

I would say that learning recursion and loops in assembly has taught me a lot. It made me understand the underlying concept of how the compiler/interpreter of the language I'm using pushes things onto a stack, and pops them off as it needs them. I also learned how to exploit the infamous stack overflow (which is still surprisingly easy in C with some get- and put- commands).

Other than using asm in everyday situations, I don't think that I would use any of the concepts assembly taught me.

contagious
A: 

timing

fast execution:

  • parallel processing
  • simple instructions
  • lookup tables
  • branch prediction, pipelining

fast to slow access to storage:

  • registers
  • cache, and various levels of cache
  • memory heap and stack
  • virtual memory
  • external I/O
Mark Stock
+6  A: 

Register allocation and management

Assembly gives you a very good idea of how many variables (machine-word-sized integers) the CPU can juggle simultaneously. If you can break your loops down so that they involve only a few temporary variables, they'll all fit in registers. If not, your loop will run slowly as things get swapped out to memory.

This has really helped me with my C coding. I try to make all loops tight and simple, with as little spaghetti as possible.

x86 is dumb

Learning several assembly languages has made me realize how lame the x86 instruction set is. Variable-length instructions? Hard-to-predict timing? Non-orthogonal addressing modes? Ugh.

The world would be better if we all ran MIPS, I think, or even ARM or PowerPC :-) Or rather, if Intel/AMD took their semiconductor expertise and used it to make multi-core, ultra-fast, ultra-cheap MIPS processors instead of x86 processors with all of those redeeming qualities.

Dan
+2  A: 

I think assembly language can teach you lots of little things, as well as a few big concepts.

I'll list a few things I can think of here, but there is no substitute for going and learning and using both x86 and a RISC instruction set.

You probably think that integer operations are fastest. If you want to find an integer square root of an integer (i.e. floor(sqrt(i))) it's best to use an integer-only approximation routine, right?

Nah. The math coprocessor (on x86 that is) has an fsqrt instruction. Converting to float, taking the square root, and converting to int again is faster than an all-integers algorithm.

Then there are things like accessing memory that you can follow, but not properly appreciate, until you've delved into assembly. Say you had a linked list, and the first element in the list contains a variable that you will need to access frequently. The list is reordered rarely. Well, each time you need to access that variable, you need to load the pointer to the first element in the list, then using that, load the variable (assuming you can't keep the address of the variable in a register between uses). If you instead stored the variable outside of the list, you only need a single load operation.

Of course saving a couple of cycles here and there is usually not important these days. But if you plan on writing code that needs to be fast, this kind of knowledge can be applied both with inline assembly and generally in other languages.

How about calling conventions? (Some assemblers take care of this for you - Real Programmers don't use those.) Does the caller or callee clean up the stack? Do you even use the stack? You can pass values in registers - but due to the funny x86 instruction set, it's better to pass certain things in certain registers. And which registers will be preserved? One thing C compilers can't really optimise by themselves is calls.

There are little tricks like PUSHing a return address and then JMPing into a procedure; when the procedure returns it will go to the PUSHed address. This departure from the usual way of thinking about function calls is another one of those "states of enlightenment". If you were ever to design a programming language with innovative features, you ought to know about funny things that the hardware is capable of.

A knowledge of assembly language teaches you architecture-specific things about computer security. How you might exploit buffer overflows, or break into kernel mode, and how to prevent such attacks.

Then there's the ubercoolness of self-modifying code, and as a related issue, mechanisms for things such as relocations and applying patches to code (this needs investigation of machine code as well).

But all these things need the right sort of mind. If you're the sort of person who can put

while(x--)
{
  ...
}

to good use once you learn what it does, but would find it difficult to work out what it does by yourself, then assembly language is probably a waste of your time.

Artelius