I'm going through MSIL and noticing there are a lot of nop instructions. The MSDN article says they take no action and are used to fill space if the opcode is patched. They're used a lot more in debug builds than release builds. I know that these kinds of statements are used in assembly languages to make sure an opcode fits on a word boundary, but why is it needed in MSIL?
A nop provides an anchor for line-based markers (e.g. breakpoints) at places in the code where a release build would emit no instruction at all.
One classic use for them is so that your debugger can always associate a source-code line with an IL instruction.
Dude! No-op is awesome! It is an instruction that does nothing but consume time. In the dim dark ages you would use it to do microadjustments in timing in critical loops or more importantly as a filler in self-modifying code.
They allow the linker to replace a longer instruction (typically long jump) with a shorter one (short jump). The NOP takes the extra space - the code could not be moved around as it would stop other jumps from working. This happens at link-time, so the compiler can't know whether a long or short jump would be appropriate.
At least, that's one of their traditional uses.
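To make the long-jump/short-jump idea concrete, here is a minimal sketch. It assumes x86 encodings (`E9` + 32-bit displacement for the near jump, `EB` + 8-bit displacement for the short jump, `90` for NOP); the `relax_jump` helper is hypothetical and is not how any particular linker implements it.

```python
import struct

NOP = 0x90        # x86 one-byte no-op
JMP_REL32 = 0xE9  # near jump: opcode + 32-bit displacement (5 bytes)
JMP_REL8 = 0xEB   # short jump: opcode + 8-bit displacement (2 bytes)

def relax_jump(code: bytearray, at: int) -> bool:
    """Rewrite the 5-byte jump at `at` as a 2-byte jump plus 3 NOPs, if it fits."""
    if code[at] != JMP_REL32:
        return False
    (rel32,) = struct.unpack_from("<i", code, at + 1)
    target = at + 5 + rel32       # displacement is relative to the next instruction
    rel8 = target - (at + 2)      # same target, measured from the shorter jump
    if not -128 <= rel8 <= 127:
        return False              # too far for a short jump, leave it alone
    code[at] = JMP_REL8
    code[at + 1] = rel8 & 0xFF
    # The freed bytes become NOPs, so the code keeps its size and
    # every other jump offset in the function stays valid.
    code[at + 2:at + 5] = bytes([NOP] * 3)
    return True

code = bytearray([JMP_REL32]) + bytearray(struct.pack("<i", 10)) + bytearray(16)
relax_jump(code, 0)
print(code[:5].hex())  # eb0d909090
```

The NOPs after the short jump are never executed; they only hold the space.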
They could be using them to support edit-and-continue while debugging. The nops give the debugger room to replace the old code with new code without changing offsets, etc.
This is not an answer to your specific question, but back in the old days you could use a NOP to fill a branch delay slot, if you couldn't manage to fill it with an otherwise-useful instruction.
Do the .NET compilers align the MSIL output? I'd imagine it might be useful for speeding up access to the IL... Also, my understanding is that it's designed to be portable and aligned accesses are required on some other hardware platforms.
In the software cracking scene, a classic method to unlock an application would be to patch with a NOP the line that checks for the key or registration or time period or whatnot so it would do nothing and simply continue starting the application as if it is registered.
NOPs serve several purposes:
- They allow the debugger to place a breakpoint on a line even if it is combined with others in the generated code.
- They allow the loader to patch a jump with a different-sized target offset.
- They allow a block of code to be aligned at a particular boundary, which can be good for caching.
- They allow incremental linking to overwrite chunks of code with a call to a new section without having to worry about the overall function changing size.
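The alignment point from the list above can be sketched in a few lines. This assumes x86's one-byte NOP (`0x90`) and a 16-byte boundary; both are illustrative choices, and `pad_to_boundary` is a made-up helper name.

```python
NOP = b"\x90"  # assumption: x86's one-byte no-op; other ISAs encode it differently

def pad_to_boundary(code: bytes, boundary: int = 16) -> bytes:
    """Append NOPs until the block length is a multiple of `boundary`."""
    padding = -len(code) % boundary
    return code + NOP * padding

block = bytes.fromhex("b801000000c3")  # mov eax, 1 ; ret (6 bytes)
padded = pad_to_boundary(block)
print(len(block), len(padded))  # 6 16
```

Whatever is emitted after `padded` then starts on a 16-byte boundary.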
I've also seen NOPs in code that modifies itself to obfuscate what it does as a placeholder (veeery old copy protection).
The first assembly I learned was SPARC, so I'm familiar with the branch delay slot. If you can't fill it with another instruction (usually the instruction you were going to put above the branch, or a loop-counter increment), you use a NOP.
I'm not familiar with cracking, but I think it is common to overwrite the stack with NOPs so that you don't have to calculate exactly where your malicious function begins.
It may also make code run faster, when optimizing for specific processors or architectures:
Processors have long employed multiple pipelines that work roughly in parallel, so two independent instructions can be executed at the same time. On a simple processor with two pipelines, the first may support all instructions, whereas the second supports only a subset. Also, there are stalls between the pipelines when one has to wait for the result of a previous instruction that isn't finished yet.
Under these circumstances, a dedicated nop may force the next instruction into a specific pipeline (the first, or not the first), and improve the pairing of following instructions so that the cost of the nop is more than amortized.
As ddaa said, nops let you account for variance in the stack, so that when you overwrite the return address it jumps to the nop sled (a lot of nops in a row) and then hits the executable code correctly, rather than jumping to some byte in the instruction that isn't the beginning.
Here's how nops are used in debugging:
Nops are used by language compilers (C#, VB, etc.) to define implicit sequence points. These tell the JIT compiler where to ensure machine instructions can be mapped back to IL instructions.
Rick Byers' blog entry on DebuggingModes.IgnoreSymbolStoreSequencePoints explains a few of the details.
C# also places nops after call instructions so that the return site maps to the line containing the call in source rather than to the line after it.
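A toy model of the line-mapping idea, for illustration only. The two listings are hand-written approximations of what a debug and a release build might emit for a one-statement method (line 1 is the opening brace, line 2 the call, line 3 the closing brace); they are assumptions, not real compiler or debugger output, and `breakpoint_offset` is a hypothetical helper.

```python
# Each entry is (IL offset, opcode, source line).
DEBUG_IL = [
    (0x00, "nop", 1),    # gives the opening brace its own instruction
    (0x01, "ldstr", 2),
    (0x06, "call", 2),
    (0x0B, "nop", 2),    # return site still belongs to the calling line
    (0x0C, "ret", 3),
]
RELEASE_IL = [
    (0x00, "ldstr", 2),
    (0x05, "call", 2),
    (0x0A, "ret", 3),
]

def breakpoint_offset(listing, line):
    """Return the first IL offset mapped to `line`, or None if the line has no IL."""
    for offset, _opcode, source_line in listing:
        if source_line == line:
            return offset
    return None

print(breakpoint_offset(DEBUG_IL, 1))    # 0
print(breakpoint_offset(RELEASE_IL, 1))  # None
```

In this model a breakpoint on line 1 can only bind in the debug listing, because the nop is the only instruction that line ever produces.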
A somewhat unorthodox use is the NOP slide, used in buffer overflow exploits.
50 years too late but hey.
Nops are useful if you are typing assembly code by hand. If you had to remove code, you could nop out the old opcodes.
Similarly, you could insert new code by overwriting some opcodes with a jump to somewhere else. There you put the overwritten opcodes and insert your new code. When done, you jump back.
Sometimes you had to use the tools that were available. In some cases this was just a very basic machine-code editor.
Nowadays with compilers the techniques make no sense whatsoever anymore.
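For anyone who never did this by hand, "nopping out" code looks roughly like the sketch below. It assumes x86 byte encodings, and `nop_out` is a hypothetical helper standing in for what you would do in a machine-code editor.

```python
NOP = 0x90  # assumption: x86's one-byte no-op

def nop_out(code: bytearray, start: int, length: int) -> None:
    """Overwrite `length` bytes at `start` with NOPs, keeping the code the same size."""
    code[start:start + length] = bytes([NOP]) * length

# mov eax, 1 ; add eax, 5 ; ret
program = bytearray.fromhex("b801000000" "83c005" "c3")
nop_out(program, 5, 3)  # remove the 3-byte add without shifting the ret
print(program.hex())    # b801000000909090c3
```

Because nothing moves, any jump elsewhere that targets the `ret` still lands on it.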
On one processor I worked on recently (for four years), NOP was used to make sure the previous operation finished before the next operation was started. For instance:
    load value to register (takes 8 cycles)
    nop 8
    add 1 to register
This made sure the register had the correct value before the add operation.
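A small simulation of that load/nop/add sequence. The model is an assumption made for illustration: a load's result lands in the register `load_latency` cycles after it issues, there is no hardware interlock, and `("nop", n)` simply burns `n` cycles. It does not describe the actual processor.

```python
def run(program, load_latency=8):
    """Execute (op, arg) pairs on a one-register machine with delayed loads."""
    reg = 0          # stale value before the load completes
    pending = None   # (cycle at which the load lands, loaded value)
    cycle = 0
    for op, arg in program:
        # A pending load writes the register once its latency has elapsed.
        if pending is not None and cycle >= pending[0]:
            reg, pending = pending[1], None
        if op == "load":
            pending = (cycle + load_latency, arg)
            cycle += 1
        elif op == "nop":
            cycle += arg
        elif op == "add":
            reg += arg
            cycle += 1
    if pending is not None:
        reg = pending[1]  # a late load clobbers whatever the add produced
    return reg

with_nops = [("load", 41), ("nop", 8), ("add", 1)]
without_nops = [("load", 41), ("add", 1)]
print(run(with_nops), run(without_nops))  # 42 41
```

Without the nops the add operates on the stale register and its result is then overwritten when the load finally lands.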
Another use was to fill in execution units, such as the interrupt vectors, which had to be a certain size (32 bytes) because the address for vector 0 was, say, 0, for vector 1 0x20, and so on, so the compiler put NOPs in there if needed.
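The vector-table padding can be sketched like this. The NOP byte and the handler bytes are placeholders (the real processor's encodings aren't given above), and `build_vector_table` is a hypothetical helper; only the 32-byte slot size comes from the description.

```python
NOP = b"\x90"     # placeholder encoding, the real one depends on the processor
SLOT_SIZE = 0x20  # each interrupt vector occupies exactly 32 bytes

def build_vector_table(handlers):
    """Lay handlers out back to back, NOP-padding each to SLOT_SIZE bytes."""
    table = bytearray()
    for code in handlers:
        if len(code) > SLOT_SIZE:
            raise ValueError("handler does not fit in its vector slot")
        table += code + NOP * (SLOT_SIZE - len(code))
    return bytes(table)

handlers = [b"\x01\x02\x03", b"\x04\x05"]  # placeholder handler bytes
table = build_vector_table(handlers)
for n in range(len(handlers)):
    print(f"vector {n} starts at {n * SLOT_SIZE:#x}")
```

With the padding in place, vector n always starts at n * 0x20, whatever the handler lengths are.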