Is there any real use for self-modifying code?

I know that it can be used to build worms/viruses, but I was wondering whether there is some good reason that a programmer may have to use self-modifying code.

Any ideas? Hypothetical situations are welcome too.

+3  A: 

Dynamic linking is a kind of self-modification (patching absolute and/or relative jump locations) ... that's normally done by the O/S's program loader, though.
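
A minimal Python sketch of the patching idea, loosely modeled on lazy symbol binding. It is not how any real loader is written: `LIBRARY`, `jump_table`, `make_stub` and `call` are hypothetical names, and what gets patched here is a table slot rather than machine code.

```python
# Hypothetical "shared library": symbols the loader can resolve by name.
LIBRARY = {"square": lambda x: x * x}

resolve_count = 0  # how many times the slow lookup path has run

# Jump table: one slot per imported symbol, patched at run time.
jump_table = {}


def make_stub(name):
    """Return a stub that resolves `name` on first call, then patches its slot."""
    def stub(*args):
        global resolve_count
        resolve_count += 1
        target = LIBRARY[name]     # the "symbol lookup"
        jump_table[name] = target  # patch the slot so later calls skip the stub
        return target(*args)
    return stub


jump_table["square"] = make_stub("square")


def call(name, *args):
    """Every call goes indirectly through the table."""
    return jump_table[name](*args)


first = call("square", 3)   # goes through the stub, which patches the table
second = call("square", 4)  # jumps straight to the resolved function
```

After the first call the slot points directly at the resolved function, so the lookup cost is paid only once.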

ChrisW
+18  A: 

Turns out that the Wikipedia entry on "self-modifying code" has a great list:

  1. Semi-automatic optimization of a state dependent loop.
  2. Runtime code generation, or specialization of an algorithm in runtime or loadtime (which is popular, for example, in the domain of real-time graphics) such as a general sort utility preparing code to perform the key comparison described in a specific invocation.
  3. Altering of inlined state of an object, or simulating the high-level construction of closures.
  4. Patching of subroutine address calling, as done usually at load time of dynamic libraries, or, on each invocation patching the subroutine's internal references to its parameters so as to use their actual addresses. Whether this is regarded as 'self-modifying code' or not is a case of terminology.
  5. Evolutionary computing systems such as genetic programming.
  6. Hiding of code to prevent reverse engineering, as through use of a disassembler or debugger.
  7. Hiding of code to evade detection by virus/spyware scanning software and the like.
  8. Filling 100% of memory (in some architectures) with a rolling pattern of repeating opcodes, to erase all programs and data, or to burn-in hardware.
  9. Compression of code to be decompressed and executed at runtime, e.g., when memory or disk space is limited.
  10. Some very limited instruction sets leave no option but to use self-modifying code to achieve certain functionality. For example, a "One Instruction Set Computer" machine that uses only the subtract-and-branch-if-negative "instruction" cannot do an indirect copy (something like the equivalent of "*a = **b" in the C programming language) without using self-modifying code.
  11. Altering instructions for fault-tolerance
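
Item 2 (a sort utility preparing code for the key comparison of a specific invocation) can be sketched in Python. This is runtime code generation rather than in-place patching, and `make_key_function` is a hypothetical helper, not something from the Wikipedia article.

```python
def make_key_function(fields):
    """Generate and compile a key function specialized for `fields`."""
    # Build source text such as: lambda r: (r['age'], r['name'],)
    parts = ", ".join(f"r[{name!r}]" for name in fields)
    source = f"lambda r: ({parts},)"
    # Compile the generated text into a real function object.
    return eval(compile(source, "<generated>", "eval")), source


records = [
    {"name": "bob", "age": 30},
    {"name": "alice", "age": 30},
    {"name": "carol", "age": 25},
]

# Specialize the comparison for this particular invocation: age, then name.
key, source = make_key_function(["age", "name"])
ordered = sorted(records, key=key)
```

The generated function has the field names baked in, so the sort loop does not have to interpret a list of fields on every comparison.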

On the point about thwarting hackers using self-modifying code:

Over the course of several firmware updates, DirecTV slowly assembled a program on their smart cards to destroy cards that had been hacked to receive channels without paying. See Jeff's Coding Horror article on the Black Sunday Hack for more information.

Zach Scrivena
DirecTV's Black Sunday Hack?
Brian
That's it! Thanks!
Zach Scrivena
Thanks for that Zach!!!
Niyaz
+8  A: 

I've seen self-modifying code used for:

  1. speed optimisation, by having the program write more code for itself on the fly

  2. obfuscation, to make reverse engineering much harder

Alnitak
Historically this was quite popular for copy-protection mechanisms on game software.
ConcernedOfTunbridgeWells
indeed - that's exactly where I've seen it :)
Alnitak
which btw was required on some old 8-bit micro (BBC) games to get them to run from disk instead of cassette tape.
Alnitak
A: 

Neural networks are kind of self-modifying code.

Then there are evolutionary algorithms which modify themselves.

Georg
I am not sure neural networks modify the code. I never knew that. http://www.hoozi.com/Articles/Neural-Networks-Artificial-Neuron.htm
Niyaz
I believe any change that must be done to the structure of a neural network can be done in the data part. Why should it modify the code?
Niyaz
neural nets are _not_ self modifying code. they're nothing more than complex non-linear transformations whose weights are determined by training.
Alnitak
They don't actually modify the code itself, but the function changes. The same is true for evolutionary algorithms, they can be implemented without changing the actual code.
Georg
What about evolutionary algorithms that are implemented on top of code generators to perform their evolution?
Erik Forbes
It depends, both are possible.
Georg
+5  A: 

In former times, when RAM was limited, self-modifying code was used to save memory. Nowadays, application compression utilities like UPX use it, for example, to decompress and modify their own code after a compressed image of the application has been loaded.

Kosi2801
I thought these binary compressors only compressed on disk, and decompressed when loaded into memory? I also read once that because they are decompressed as they are loaded into memory they cannot be paged out to disk, so they consume more RAM. Isn't this the case?
Peter Morris
Packed executables have a "bootstrap" application which is loaded into memory and started there. This then loads the compressed data, decompresses it and appends the decompressed instructions to its own code. When decompression is finished, this code is started. Paging happens as usual.
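
A rough Python analogue of that bootstrap step, as a sketch only: real packers such as UPX operate on machine code, whereas here the "image" is compressed source text, and `payload_source`, `packed_image` and `bootstrap` are made-up names.

```python
import zlib

# The "application": ordinary source text standing in for machine code.
payload_source = "def greet(name):\n    return 'hello, ' + name\n"

# Build step: the packer stores only the compressed image.
packed_image = zlib.compress(payload_source.encode("utf-8"))


def bootstrap(image):
    """Decompress the image in memory and load the code it contains."""
    source = zlib.decompress(image).decode("utf-8")
    namespace = {}
    # Turn the decompressed text into executable code and run its definitions.
    exec(compile(source, "<unpacked>", "exec"), namespace)
    return namespace


app = bootstrap(packed_image)
result = app["greet"]("world")
```

The code that ends up running did not exist in executable form until the bootstrap produced it.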
Kosi2801
Self decompressing JavaScript is used abundantly on web pages.
Jader Dias
A: 

Applications which implement their own scripting languages often do this. For example, database servers often compile stored procedures (or queries) this way.

Craig Stuntz
A: 

Michael Abrash described the Pixomatic code generator for Dr. Dobb's Journal a while back: http://www.ddj.com/architect/184405807 . That's a software 3D rasterizer, DX7(?) compatible.

MSN
+1  A: 

LOL - i've written self-modifying code on two occasions:

  1. when first learning assembly language, before i understood indirect indexed access
  2. accidentally, as pointer bugs in assembly language and C

i can imagine that there may be scenarios where self-modifying code would be more efficient than alternatives, but nothing obvious leaps to mind. In general, this is something to avoid - debugging nightmare, etc. - unless you are deliberately trying to obfuscate as mentioned above.

Steven A. Lowe
+3  A: 

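Because the Commodore 64 doesn't have many registers and has a 1 MHz processor. When you need to read a memory address offset by a value, it is easier to modify the operand of the instruction itself.

@Reader:
LDA $C000      ; load from the address stored in this instruction's operand
STA $D020      ; store the value (on the C64, $D020 is the border colour)
INC Reader+1   ; bump the low byte of the LDA operand: next read is $C001
JMP Reader     ; loop forever; the low byte wraps, so reads stay in $C000-$C0FF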

That's the last time I wrote self-modifying code anyway :-)
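
To see what that loop does to itself, here is a small Python simulation. It is a sketch under assumptions: only the four opcodes used above are handled, flags and timing are ignored, and the load address `$1000` and the test data are made up.

```python
# Opcodes for the absolute-addressing forms used in the loop above.
LDA_ABS, STA_ABS, INC_ABS, JMP_ABS = 0xAD, 0x8D, 0xEE, 0x4C


def run(memory, pc, steps):
    """Execute `steps` instructions; return the values written by STA."""
    a = 0
    stored = []
    for _ in range(steps):
        opcode = memory[pc]
        addr = memory[pc + 1] | (memory[pc + 2] << 8)  # little-endian operand
        if opcode == LDA_ABS:
            a = memory[addr]
            pc += 3
        elif opcode == STA_ABS:
            memory[addr] = a
            stored.append(a)
            pc += 3
        elif opcode == INC_ABS:
            memory[addr] = (memory[addr] + 1) & 0xFF
            pc += 3
        elif opcode == JMP_ABS:
            pc = addr
        else:
            raise ValueError(f"unsupported opcode {opcode:#04x}")
    return stored


READER = 0x1000  # assumed load address for the routine
memory = bytearray(0x10000)
memory[READER:READER + 12] = bytes([
    LDA_ABS, 0x00, 0xC0,                              # LDA $C000
    STA_ABS, 0x20, 0xD0,                              # STA $D020
    INC_ABS, (READER + 1) & 0xFF, (READER + 1) >> 8,  # INC Reader+1
    JMP_ABS, READER & 0xFF, READER >> 8,              # JMP Reader
])
memory[0xC000:0xC003] = bytes([7, 8, 9])  # made-up data to read

values = run(memory, READER, steps=12)  # three trips round the loop
```

After three iterations the operand byte at `Reader+1` has changed from `$00` to `$03`, so the same `LDA` instruction has read `$C000`, `$C001` and `$C002` in turn.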

Peter Morris
+1  A: 

Lots of reasons. Off the top of my head:

  • Runtime class construction and meta programming. For example, having a class factory that takes a connection to an SQL table and generates a client class specialized for that table (with accessors for the columns, find methods, etc.).

  • Then of course there's the famous bitblt example, and the regexp analogs.

  • Dynamically optimizing based on run-time information, a la tracing JITs.

  • Subtype specialization of Ada-style generic functions in an accretive environment.
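
The first bullet (a class factory driven by an SQL table) might look something like this in Python with the standard `sqlite3` module. It is a sketch, not a recommended design: `make_table_class` is a hypothetical helper, and table and column names are interpolated into SQL without any escaping.

```python
import sqlite3


def make_table_class(conn, table):
    """Build a class with one read-only property per column of `table`."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) rows.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

    def __init__(self, row):
        self._row = dict(zip(columns, row))

    def find(cls, **criteria):
        # Column names come from the caller; only the values are parameterized.
        where = " AND ".join(f"{col} = ?" for col in criteria)
        sql = f"SELECT * FROM {table}" + (f" WHERE {where}" if where else "")
        return [cls(r) for r in conn.execute(sql, tuple(criteria.values()))]

    namespace = {"__init__": __init__, "find": classmethod(find)}
    for col in columns:
        # Bind `col` as a default argument so each property reads its own column.
        namespace[col] = property(lambda self, c=col: self._row[c])
    return type(table.capitalize(), (object,), namespace)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany("INSERT INTO users (name, age) VALUES (?, ?)",
                 [("alice", 30), ("bob", 25)])

User = make_table_class(conn, "users")
matches = User.find(name="bob")
```

The class and its accessors are created at run time from the table's schema, so adding a column to the table gives the generated class a new property without any code change.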

-- MarkusQ

MarkusQ
+1  A: 

Artificial Intelligence?

Al Katawazi
A: 

Dynamic code generation in SwiftShader is a form of self modifying code that enables it to efficiently implement Direct3D 9 on the CPU.

A: 

Because it's really really cool, and sometimes that's reason enough.

Bruce McGee