How can programming in assembly help in achieving optimization?

+7  A: 

These days, you have to be very good at assembly to beat the compiler.

I can do it any day of the week, but only by viewing the compiler's output first.

And then, if it gains more than a couple of percentage points I'd be surprised.

These days, I only program in assembly when I'm doing something the compiler can't do.

Joshua
+1  A: 

Typically you wouldn't program in assembly. You would program in C, and then look at the generated assembly to see what optimizations (or not) the C compiler made automatically. Adjusting your C code (to allow for better vectorization, for example) will allow the compiler to rearrange code better, which will give you optimized assembly.
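As one hedged illustration of adjusting C code so the compiler has an easier time vectorizing, the sketch below uses the C99 `restrict` qualifier to promise that the arrays do not overlap. The function name `add_arrays` is made up for this example. Whether a given compiler actually vectorizes the loop depends on the compiler, target and flags, so check the generated assembly (for example with `gcc -O3 -S`) rather than assume.

```c
#include <stddef.h>

/* Hypothetical example: element-wise sum of two float arrays.
 * restrict tells the compiler that dst, a and b never overlap, which
 * removes the aliasing concern that can otherwise hinder vectorization. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```

Comparing the assembly with and without `restrict` is a quick way to see what the qualifier buys on your compiler.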

Derek
+2  A: 

Programming in assembly won't, in and of itself, optimize your code. The main thing about assembly is that it allows you to have very low-level access and to choose exactly what instructions the processor executes.

Since you won't have some compiler generating the assembly for you, you can perform code optimizations when you write the program yourself, if you know how.

burningstar4
+2  A: 

So, you think you are smarter than gcc optimizing compiler? If not, then fughed aboud it (learning assembly for the sake of getting better at optimization). That would be akin to learning Scheme language for the sake of getting better at recursion :)

Hamish Grubijan
I disagree with the analogy. Scheme forces you to deal with recursion (if you're using standard idioms), but assembly doesn't force you to optimize. At the same time, I agree that the choice of assembly to learn optimization is quite arbitrary.
Justin K
@Justin K - ok. Although ... one can still abuse recursion in Scheme in a way that will make beginner programmers think that it is the worst invention ever since non-sliced bread. Do not believe me - just ask fellow SO users (for instance) to write the most obscure function which calculates factorial using recursion.
Hamish Grubijan
+10  A: 

The most likely way programming in assembly can improve your code is by improving you: teaching you more about what is happening at a low level and getting the discipline of optimization can help you make good decisions in higher-level languages.

As far as actually helping one program: as others have noted it's rarely worth it. It's just possible you can use it as a kind of advanced profile-driven optimization: try many variations until you find one that's best on your particular problem.

To start with this: write a program in C or C++ or whatever compiled language you normally use, fire up your debugger, disassemble a small but nontrivial function, and have a think about why the compiler did what it did. Then try writing a small bit of inline assembler yourself. On modern systems assembly is most easily embedded within C rather than done from scratch.

Or alternatively, get a teeny machine like a PIC and make it flash a LED...
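For the inline-assembler suggestion above, here is a minimal sketch, assuming a GCC-compatible compiler (GCC or Clang) targeting x86-64 with the default AT&T assembler syntax. The function name `add_u32` is made up for illustration, and on any other compiler or architecture the sketch falls back to plain C.

```c
#include <stdint.h>

/* Hypothetical helper: adds two 32-bit integers. On x86-64 with a
 * GCC-compatible compiler it uses one inline "addl"; elsewhere it uses
 * ordinary C so the example still compiles. */
uint32_t add_u32(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && defined(__x86_64__)
    /* "+r"(a): a lives in a register and is both read and written.
     * "r"(b):  b is a read-only register input.
     * "cc":    the instruction changes the condition flags. */
    __asm__("addl %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    return a + b;
#endif
}
```

Disassembling this function next to a plain `return a + b;` version is a small, low-risk way to start reading compiler output.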

poolie
This is correctest. Understanding how the compiler targets the architecture is almost pointless in terms of trying to outrace the compiler. But understanding what happens in the assembly and why kind of changes your thinking about the code such that it's not just about the HLL itself, but is more targeted to your platform.
Chris
Exactly. I think Alan Perlis was playing on this with his quote "LISP programmers know the value of everything and the cost of nothing." - no matter how high level your language is, in the end it all boils down to CPU instructions. To make an exaggerated example: If you don't know a thing about the von Neumann architecture, you'll never understand that reading from a file will be way slower than keeping stuff in memory. The same pattern applies in many other cases. At the end of the day, high level languages have the same performance aspects as assembly.
delnan
A: 

More likely than being able to beat the compiler at writing assembly code, knowing how typical tasks translate to assembly may help you write better high-level language code.

Typically you do not resort to assembly for optimization purposes. Where hand-optimized assembly is possible, usually someone will already have provided the essential code ready for you to call, for example in the form of a linear algebra library.

Likewise, assembly offers direct access to the processor (e.g. for atomicity, time measurement, I/O), but the important accesses will already have been made accessible from your high-level language.
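As a hedged example of processor-level atomicity being reachable without assembly, the sketch below uses the standard C11 `<stdatomic.h>` header. The names `hit_count` and `record_hit` are invented for this illustration.

```c
#include <stdatomic.h>

/* Hypothetical shared counter, safe to update from several threads. */
atomic_int hit_count = 0;

/* Atomically increments the counter and returns the new value.
 * atomic_fetch_add returns the value held before the addition. */
int record_hit(void)
{
    return atomic_fetch_add(&hit_count, 1) + 1;
}
```

The compiler picks whatever atomic instruction sequence the target offers, which is the part you would otherwise write by hand.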

Peter G.
+1  A: 

There used to be a very good book about this subject, called Inner Loops by Rick Booth. I still think it's a very valuable book, in that it can raise your awareness of what to look out for when optimizing assembler code (categorization of instructions as very fast, fast, slow; when two instructions can execute in parallel; memory alignment, cache misses, stalls, penalties, etc.) However, the book only covered Intel processors up to the Pentium Pro / Pentium MMX, and with the newer hardware architectures that are available today, the book is now fairly out-of-date.

This is exactly the problem of optimizing assembly language: You need to know very well the architecture which you're targeting; contrast this with an optimizing compiler (e.g. for the C language) that can target different platforms and will apply optimizations accordingly. Much knowledge and work has gone into the optimization stage of compilers, and you will have to study a particular architecture quite a bit before you can beat a good compiler.

stakx
Perhaps this is the point: You could write much better compilers ... not in assembly though
belisarius
+5  A: 

In principle, you can write highly-optimized code in assembly because the compiler is limited to specific, general-purpose optimizations that should apply to many programs, while you can be creative and use your knowledge of this particular program.

To take a simple example, back when I was new to this business compilers were very limited in their ability to optimize register usage. You know that to perform any sort of arithmetic or logical operation, the CPU must generally load one of the values into a register, then perform the operation on the other, then save the result? Like to add two numbers together -- and I'll use a pseudo-assembler here because I don't know what assembly languages you know and I've forgotten most of the details myself -- you'd write something like this:

LOAD A,value1
ADD A,value2
STORE A,destination

Compilers used to generate the loads for every operation. So if your C program said:

 x=x+y;
 z=z+x;

The compiler would generate something like:

LOAD A,x
ADD A,y
STORE A,x
LOAD A,z
ADD A,x
STORE A,z

But a human could observe that by the time we get to the second statement, register A already contains x, and addition is commutative, so we could optimize this to:

LOAD A,x
ADD A,y
STORE A,x
ADD A,z
STORE A,z
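To see what a present-day compiler does with the same two statements, one option is to wrap them in a small function and read the generated assembly at different optimization levels. The sketch below is only a test subject for that experiment; the function name `accumulate` and the use of pointers (so the values live in memory, as in the pseudo-assembler) are assumptions made for illustration.

```c
/* Hypothetical wrapper around the two statements from the example.
 * The operands are passed by pointer so that loads and stores are
 * visible in the generated code. */
void accumulate(int *x, const int *y, int *z)
{
    *x = *x + *y;   /* x = x + y */
    *z = *z + *x;   /* z = z + x */
}
```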

Et cetera. One could go through all sorts of tiny micro-optimizations like this. I used to do that all the time back when I was young and the world was green.

But over the years compilers have gotten much smarter, and CPUs have gotten more powerful so the micro-optimizations don't matter as much.

Thus, I haven't written any assembly language code in, wow, probably 15 years. I used to read the assembly generated by the compiler when debugging; sometimes it would give a clue to a subtle problem, but I haven't done that in years now either.

I don't think compilers are even written in assembly any more. Instead, you write the first draft of the compiler in a high level language on some other computer, i.e. you write a cross-compiler to get yourself off the ground.

I suspect the only real use of assembly today is for extremely constrained environments, embedded systems and that sort of thing; and for programs that have to deal intimately with the hardware, like device drivers.

I'd be interested to hear if there are any assembly programmers on this forum who care to tell us why they are assembly programmers.

Jay
Oh, but let me add: I think it is very valuable to learn assembly so that you understand how the computer really works inside. Even if you never write assembly for a real project, knowing how it works is very valuable. I've seen plenty of programs where the programmer did things that were very inefficient or subject to subtle errors, and it's pretty obvious that the source of the problem is that the programmer doesn't understand what's really happening inside the box.
Jay
A: 

Compilers do a good job of generating assembler.

However, there's a bad reason why hand-written assembler is faster. Since it's harder to write, you write less of it.

It would be nice if programmers could discipline themselves to get the same job done in minimal code, regardless of language.

Mike Dunlavey
That's why I like Python - it encourages minimal code, even to the point of including batteries for all sorts of constructs.
Wayne Werner
A: 

When writing assembly, or even just the straight raw bytes the assembler outputs, you can write programs that use hardware-specific features or that do something otherwise very carefully specified.

There might be really high benefits if your program does the optimized part far more often than it does anything else. Always set up benchmarks before attempting optimizations.
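A benchmark does not need to be elaborate to be useful. The following is a minimal sketch using the standard `clock()` function, which measures processor time at coarse resolution. The names `time_calls`, `sample_work` and `sink` are hypothetical, and `sample_work` only stands in for the routine you would actually measure.

```c
#include <time.h>

/* Written through a volatile so the compiler cannot discard the work. */
volatile unsigned long sink;

/* Hypothetical stand-in for the code being measured: sums 0..999. */
void sample_work(void)
{
    unsigned long sum = 0;
    for (unsigned long i = 0; i < 1000; i++) {
        sum += i;
    }
    sink = sum;
}

/* Calls fn() reps times and returns the processor time used, in seconds. */
double time_calls(void (*fn)(void), int reps)
{
    clock_t start = clock();
    for (int i = 0; i < reps; i++) {
        fn();
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Run the same harness before and after a change, with enough repetitions that the measured time is well above the clock's resolution.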

The downside is that your hand-written assembly works on a narrower range of hardware. It may even end up limited to one hardware model and revision!

It's rare you ever can or need to write assembly routines because commonly written software must work on almost every hardware you find and your kitten.

There's one interesting application if you know assembly. You can then write programs that produce assembly routines. Though it's mostly only fun unless you keep it really small so you can port it easily.

Cheery
+1  A: 

About the only time I can think of using assembly language for optimizing code is when you need something very specific, like you need a GPIO on a microcontroller to toggle between high and low exactly every 9 clock cycles. That's too short a time to manage with an interrupt, and higher level language compilers don't normally offer this kind of control over the instruction stream.

TokenMacGuy
A: 

In general, the compiler will do a fairly good job at generating optimal code. There are, however, cases where writing your own assembly can result in even more optimized (in terms of space and/or speed) code.

Typically, this happens when there is something that you know about the target system that the compiler doesn't. Compilers are designed to work on a variety of systems; if you want to take advantage of something unique to your target system, sometimes you have to go in and do it yourself.

Here's an example. A few months ago, I was writing some code for a MIPS-based embedded system. There are many different types of MIPS CPUs, and some support certain opcodes that others do not. My compiler would generate MIPS code using the set of assembly operations that all MIPS architectures support. However, I knew that my chip could do more. I had a subroutine that needed to count the number of leading zeroes in a 32-bit number. The compiler synthesized this into a loop that took about 10 lines of assembly to do. I re-wrote it in one line by using the CLZ opcode that was designed to do just this. I knew that my chip supported the opcode but the compiler didn't.

Admittedly, situations like this aren't very common; when they do pop up, however, it's nice to have enough of a background in assembly to take advantage of them.
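The leading-zero count makes a compact illustration. The sketch below is not the code from the answer: it shows a portable loop next to a version using the `__builtin_clz` builtin offered by GCC and Clang, which is a swapped-in way to reach a count-leading-zeros instruction from C when the target has one. The function names are made up, and the sketch assumes `unsigned int` is 32 bits wide.

```c
#include <stdint.h>

/* Portable version: shift left until the top bit is set. */
int clz32_loop(uint32_t v)
{
    int n = 0;
    if (v == 0) {
        return 32;
    }
    while ((v & 0x80000000u) == 0) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Builtin version: lets the compiler pick a dedicated instruction if the
 * target has one. Assumes a 32-bit unsigned int. */
int clz32_fast(uint32_t v)
{
#if defined(__GNUC__)
    /* __builtin_clz is undefined for 0, so handle that case separately. */
    return v == 0 ? 32 : __builtin_clz(v);
#else
    return clz32_loop(v);
#endif
}
```

Whether the builtin compiles to a single instruction still depends on the architecture flags given to the compiler, which is the same knowledge gap the answer describes.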

bta
+2  A: 

Sometimes one will need to perform a task which maps particularly well onto some CPU instructions, but does not fit well into any high-level-language constructs. For example, on many processors one may easily perform extended-precision arithmetic using something like:

  add  r0,r4
  addc r1,r5
  addc r2,r6
  addc r3,r7

This will regard r3:r2:r1:r0 and r7:r6:r5:r4 as numbers four words long, adding the second to the first. Four nice easy instructions, and anyone who understands assembly would know what they do. I know of no way to perform the same task in C without it not only generating bigger and slower object code, but also being an incomprehensible mess of source code.
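For comparison, here is one portable C sketch of the same four-word addition, assuming 32-bit words stored least significant first. The function name `add128` is invented for this example. The carry has to be recovered from a wider intermediate because C has no carry flag, and whether a compiler turns the loop into an add-with-carry chain depends on the compiler and target.

```c
#include <stdint.h>

/* Hypothetical helper: dst += src, where both are four 32-bit words with
 * index 0 the least significant. Returns the carry out of the top word. */
uint32_t add128(uint32_t dst[4], const uint32_t src[4])
{
    uint64_t carry = 0;
    for (int i = 0; i < 4; i++) {
        /* The 64-bit sum cannot overflow: 2 * (2^32 - 1) + 1 < 2^64. */
        uint64_t sum = (uint64_t)dst[i] + src[i] + carry;
        dst[i] = (uint32_t)sum;   /* low 32 bits are the result word */
        carry = sum >> 32;        /* bit 32 is the carry into the next word */
    }
    return (uint32_t)carry;
}
```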

A somewhat more extreme but specialized real-world example: Given two arrays array1[0..63] and array2[0..63], compute array1[0]*array2[0] + array1[1]*array2[1] + array1[2]*array2[2] ... + array1[63]*array2[63]. On a DSP I used, the computation could be done in machine code in about 75 machine cycles (about 67 of which are a repeating MAC instruction). There's no way C code could come anywhere close.
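The computation itself is short to state in C. The sketch below is only the straightforward loop, with assumed details: the answer does not give the element type, so 16-bit inputs and a 64-bit accumulator are guesses, and `DOT_LEN` and `dot64` are made-up names. How close the compiled result gets to a repeated MAC instruction depends entirely on the compiler and the target.

```c
#include <stdint.h>

#define DOT_LEN 64  /* array length from the example: indices 0..63 */

/* Hypothetical helper: sum of a[i] * b[i] for i = 0..63.
 * Element and accumulator widths are assumptions for illustration. */
int64_t dot64(const int16_t a[DOT_LEN], const int16_t b[DOT_LEN])
{
    int64_t acc = 0;
    for (int i = 0; i < DOT_LEN; i++) {
        /* Widen before multiplying so the product cannot overflow. */
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```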

supercat
A: 

In most modern applications, it can't to any significant degree.

The article "Inter-Process Communication Affects Application Response Time" explains why algorithms are unlikely to be bottlenecks. (But always profile - never guess.)

In general, programming in assembly will increase time-to-market, bug density, and maintenance costs. Instead, strive for simplicity and readability in your code.

As poolie mentioned, the main benefit of learning assembly today is a deeper understanding of software and hardware. From that perspective, there's quite a bit of information on Steve Gibson's site.

TrueWill
A: 

If you understood why there is sometimes a need to write asm, you would appreciate both its strengths and its costs (headaches for you).

mP