Mixing assembler code with C/C++

A:

In certain situations, hand-written assembly can be more efficient than anything a compiler generates.

Sundar 2010-09-04 19:18:41

Though compilers are very often smarter than you.

zneak 2010-09-04 19:32:25

Unless of course you are also very smart

doron 2010-09-04 19:35:54

And even then, you're probably not smart enough.

Joe 2010-09-04 20:51:22

Or the compiler has limitations. Most compilers for the Microchip PIC do a painfully bad job, until you stop and realize how hostile to compilers the architecture is. Then you live with it unless an inner loop is a bottleneck, at which point a skilled assembly author can usually do much better than the compiler.

RBerteig 2010-09-04 22:33:51

Always assuming the compiler is smarter than you is, IMO, a bad assumption (assuming you're not an idiot :P). Back in the old(er) days, MSVC's compilers used to shove in horrible 64-bit math functions for things that totally didn't need them. One case would be a 64-bit multiply and add, which should boil down to MUL + ADC (on x86); however, VC6 would stick in _aulmul instead, so unless you used PP5 (for __emulu), you were stuck with clumsy math. This doesn't apply so much to big production compilers these days (though MSVC 08 won't generate memset intrinsics, which means either black magic C or asm is needed).

Necrolis 2010-09-08 06:37:21

+3 A:

In the past, compilers used to be pretty poor at optimizing for a particular architecture, and architectures used to be simpler. Now the reverse is true. These days, it's pretty hard for a human to write better assembly than an optimizing compiler, for deeply-pipelined, branch-predicting processors. And so you won't see it much. What there is will be short, and highly targeted.

In short, you probably won't need to do this. If you think you do, profile your code to make sure you've identified a hotspot - don't optimize something just because it's slow, if you're only spending 0.1% of your execution time there. See if you can improve your design or algorithm. If you don't find any improvement there, or if you need functionality not exposed by your higher-level language, look into hand-coding assembly.

Michael Petrotta 2010-09-04 19:22:09

+2 A:

There may be new instructions that your compiler cannot yet generate, or the compiler does a bad job, or you may need to control the CPU directly.

James 2010-09-04 19:22:55

I wouldn't say you're "directly controlling" the CPU when using assembly code.

zneak 2010-09-04 19:30:46

What about doing something like modifying the EFLAGS register on x86 CPUs?

James 2010-09-04 19:36:13

+1 A:

Why is assembly language code often needed along with C/C++ ?

It isn't

What can't be done in C/C++, which is possible when assembly language code is mixed?

Accessing system registers or IO ports on the CPU. Accessing BIOS functions. Using specialized instructions that don't map directly to the programming language, e.g. SIMD instructions. Providing optimized code that's better than what the compiler produces.

You usually don't need the first two points unless you're writing an operating system, or code running without an operating system.

Modern CPUs are quite complex, and you'll be hard pressed to find people who can actually write better assembly than what the compiler produces. Many compilers come with libraries giving you access to more advanced features, like SIMD instructions, so nowadays you often don't need to fall back to assembly for that.

nos 2010-09-04 19:27:41

+12 A:

Things that pop to mind, in no particular order:

Special instructions. In an embedded application, I need to invalidate the cache after a DMA transfer has filled the memory buffer. The only way to do that on an SH-4 CPU is to execute a special instruction, so inline assembly (or a free-standing assembly function) is the only way to go.
Optimizations. Once upon a time, it was common for compilers to not know every trick that was possible to do. In some of those cases, it was worth the effort to replace an inner loop with a hand-crafted version. On the kinds of CPUs you find in small embedded systems (think 8051, PIC, and so forth) it can be valuable to push inner loops into assembly. I will emphasize that for modern processors with pipelines, multi-issue execution, extensive caching and more, it is often exceptionally difficult for hand coding to even approach the capabilities of the optimizer.
Interrupt handling. In an embedded application it is often needed to catch system events such as interrupts and exceptions. It is often the case that the first few instructions executed by an interrupt have special responsibilities and the only way to guarantee that the right things happen is to write the outer layer of a handler in assembly. For example, on a ColdFire (or any descendant of the 68000) only the very first instruction is guaranteed to execute. To prevent nested interrupts, that instruction must modify the interrupt priority level to mask out the priority of the current interrupt.
Certain portions of an OS kernel. For example, task switching requires that the execution state (at least most registers including PC and stack pointer) be saved for the current task and the state loaded for the new task. Fiddling with execution state of the CPU is well outside of the feature set of the language, but can be wrapped in a small amount of assembly code in a way that allows the rest of the kernel to be written in C or C++.
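
As a rough illustration of the "small amount of assembly wrapped for the rest of the code" idea, here is a minimal sketch that reads the stack pointer, something plain C has no way to name. It assumes GCC-style extended inline assembly on x86-64 or AArch64; `read_stack_pointer` is a made-up name, and the fallback branch only approximates the value with the address of a local.

```c
#include <stdint.h>

/* Hypothetical helper: returns the current stack pointer.
 * Assumes GCC/Clang extended asm (AT&T syntax on x86-64). */
static inline uintptr_t read_stack_pointer(void)
{
    uintptr_t sp;
#if defined(__GNUC__) && defined(__x86_64__)
    /* Copy RSP into whichever general-purpose register the compiler picks. */
    __asm__ __volatile__("mov %%rsp, %0" : "=r"(sp));
#elif defined(__GNUC__) && defined(__aarch64__)
    /* Copy SP into a general-purpose register. */
    __asm__ __volatile__("mov %0, sp" : "=r"(sp));
#else
    /* No inline asm available: approximate with the address of a local. */
    volatile char marker = 0;
    sp = (uintptr_t)&marker;
#endif
    return sp;
}
```

The rest of the program only sees an ordinary C function, which is the point: the architecture-specific part stays confined to a few lines.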

Edit: I've touched up the wording about optimization. Let me emphasize that for targets with large user populations and well supported compilers with decent optimization, it is highly unlikely that an assembly coder can beat the performance of the optimizer.

Before attempting this, start with careful profiling to determine where the bottlenecks really lie. With that information in hand, examine assumptions and algorithms carefully, because the best optimization of all is usually to find a better way to handle the larger picture. Then, if all else fails, isolate the bottleneck in a test case, benchmark it carefully, and begin tweaking in assembly.

RBerteig 2010-09-04 19:27:50

Point 2 is dangerous. Modern optimizing compilers will know and implement all appropriate optimizations for a platform. For a new platform they may not be there yet, but given user requests they will soon catch up and supersede any human optimizations. Therefore, if you do this, limit your assembly (using #if and #error) to particular version(s) of the compiler, so that when you re-compile with a new compiler you are forced to re-evaluate whether your assembly is better than the current version of the compiler.
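
A minimal sketch of what such a guard could look like, assuming a GCC-style `__GNUC__` version macro. The names `USE_HAND_ASM`, `ASM_BENCHMARKED_GCC_MAJOR` and `sum_bytes` are invented for illustration; the assembly version is assumed to live in a separate file and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: GCC major version the hand-written routine was last
 * benchmarked against. */
#define ASM_BENCHMARKED_GCC_MAJOR 4

#if defined(USE_HAND_ASM)
#  if !defined(__GNUC__) || (__GNUC__ != ASM_BENCHMARKED_GCC_MAJOR)
#    error "sum_bytes assembly was benchmarked against a different compiler; re-measure before enabling it"
#  endif
/* Implemented in a separate (hypothetical) assembly file. */
uint32_t sum_bytes(const uint8_t *data, size_t len);
#else
/* Portable fallback: leave the work to the current compiler's optimizer. */
uint32_t sum_bytes(const uint8_t *data, size_t len)
{
    uint32_t total = 0;
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; ++i)
        total += data[i];
    return total;
}
#endif
```

Building with `-DUSE_HAND_ASM` on any other compiler version stops the build at the `#error`, which forces someone to re-run the benchmark before the assembly is used again.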

Martin York 2010-09-05 18:37:57

@Martin, the majority of cases where I've personally done this were years ago, well before "modern optimizers", or for target platforms without the motivation behind GCC targeting desktops. One has to remember that in the embedded systems world, there is a world market of very nearly zero users for any cross compiler per thousand desktop target users. That said, you make a valid point, and I'll edit in a suggestion to profile first and benchmark after.

RBerteig 2010-09-05 21:30:55

Well, even modern compilers are extremely limited in what optimisations they're allowed to do. That follows from the fact that they always have to generate working code and are thus very conservative about optimising it. Also, code analysis is still more or less a computationally impossible problem, which also severely limits the compiler's optimisation abilities compared to a programmer's ASM code. Most performance-critical code still relies heavily on ASM code because of those facts.

Mavrik 2010-09-08 06:39:45

+3 A:

There are certain things that can only be done in assembler and cannot be done in C/C++.

These include:

Generating software interrupts (SWI or INT instructions)
Use of instructions like SWP for creating mutexes
Specialist coprocessor instructions (such as those needed to program the MMU and manage RAM caches)
Access to carry and overflow flags
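
On the last item: standard C has no way to read the carry flag, but some compilers offer extensions that get close. The sketch below assumes the `__builtin_add_overflow` builtin found in newer GCC and Clang releases, with a plain C comparison as the fallback; `add_u32_carry` is a made-up name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: adds two 32-bit values, stores the wrapped sum,
 * and returns true if the addition carried out of bit 31. */
bool add_u32_carry(uint32_t a, uint32_t b, uint32_t *sum)
{
#if (defined(__GNUC__) && __GNUC__ >= 5) || defined(__clang__)
    /* Compiler extension: reports whether the result did not fit. */
    return __builtin_add_overflow(a, b, sum);
#else
    /* Portable fallback: unsigned wraparound means sum < a on carry. */
    *sum = a + b;
    return *sum < a;
#endif
}
```

Whether the compiler turns this into an actual flag test is up to the optimizer, so assembly is still the only way to be certain which instructions are used.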

You may also be able to optimize code better in assembler than in C/C++ (e.g. memcpy on Android is written in assembler).

doron 2010-09-04 19:30:32

+2 A:

Why is assembly language code often needed along with C/C++ ?

Competitive advantage. Like, if you are writing software for the (soon-to-be) #1 gaming company in the world.

What can't be done in C/C++, which is possible when assembly language code is mixed?

Nothing, unless some absolute performance level is needed, say, X frames per second or Y billions of polygons per second.

Edit: based on other replies, it seems the consensus is that embedded systems (iPhone, Android etc) have hardware accelerators that certainly require the use of assembly.

I have the source code of some 3D computer games. There is a lot of assembler code in use.

It was either written in the '80s and '90s, or it is used sparingly (maybe 1% - 5% of total source code) inside a game engine.

Edit: to this date, compiler auto-vectorization quality is still poor. So, you may see programs that contain vectorization intrinsics, and since that's not really much different from writing actual assembly (most intrinsics have a one-to-one mapping to assembly instructions), some folks might just decide to write in assembly.
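
To show how close intrinsics sit to the instructions, here is a small sketch using the x86 SSE intrinsics from `<xmmintrin.h>`, with a scalar loop for other targets and for leftover elements. The function name `add_floats` is invented, and the instruction names in the comments are what the intrinsics typically compile to, not a guarantee.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for n floats. */
void add_floats(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    /* Four floats per iteration; each intrinsic is roughly one instruction. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);            /* typically MOVUPS */
        __m128 vb = _mm_loadu_ps(b + i);            /* typically MOVUPS */
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb)); /* ADDPS, then MOVUPS */
    }
#endif
    /* Scalar tail, or the whole loop when SSE is not available. */
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```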

Update:

According to anecdotal evidence, RollerCoaster Tycoon is written in 99% assembly.
http://www.chrissawyergames.com/faq3.htm

rwong 2010-09-04 19:31:10

+1 A:

One more thing worth mentioning is:

C & C++ do not provide any convenient way to set up stack frames when one needs to implement binary-level interop with a scripting language, or to implement some kind of support for closures.

Chris Becke 2010-09-05 07:13:48
