views:

295

answers:

6

For about a year I have been thinking about writing a program that writes programs. This would primarily be a playful exercise that might teach me some new concepts. My inspiration came from negentropy and the ability for order to emerge from chaos and new chaos to arise out of order in infinite succession.

To be more specific, the program would start by writing a short random string. If the string compiles the programs will log it for later comparison. If the string does not compile the program will try to rewrite it until it does compile. As more strings (mini 'useless' programs) are logged they can be parsed for similarities and used to generate a grammar. This grammar can then be drawn on to write more strings that have a higher probability of compilation than purely random strings.

This is obviously more than a little silly, but I thought it would be fun to try and grow a program like this. And as a byproduct I get a bunch of unique programs that I can visualize and call art.

I'll probably write this in Ruby due to its simple syntax and dynamic compilation and then I will visualize in processing using ruby-processing.

What I would like to know is:

  • Is there a name for this type of programming?
  • What currently exists in this field?
  • Who are the primary contributors?
  • BONUS! - In what ways can I procedurally assign value to output programs beyond compiles(y/n)?
    I may want to extend the functionality of this program to generate a program based on parameters, but I want the program to define those parameters through running the programs that compile and assigning meaning to the programs output. This question is probably more involved than reasonable for a bonus, but if you can think of a simple way to get something like this done in less than 23 lines or one hyperlink, please toss it into your response.

I know that this is not quite meta-programming and from the little I know of AI and generative algorithms they are usually more goal oriented than what I am thinking. What would be optimal is a program that continually rewrites and improves itself so I don't have to ^_^

+4  A: 

Look up "genetic programming."

Edit to respond to comments:

@chris, @Kasturi: true. What was described in the OP is a system for inferring a grammar by brute-force attempts to make some concrete syntax compile, and then back-generating new concrete syntax from the grammar. If you're bound to have something very closely matching that description... the best advice I have is to look into building a hidden Markov model from concrete syntax in some language with very minimal syntax. I'd consider using a minimal combinatorial logic (something akin in spirit to the Unlambda language).

On the other hand, genetic programming is a technique with some developed practice and literature, and is not "deterministic" but rather a stochastic process. It's also a pretty broad term---arguably the system of the OP is a limiting case of GP with 0% crossover and 100% mutation.

Derrick Turk
Genetic programming is something different!!!!
Kasturi
I have done that and It seems to be more deterministic than what I want. I'm looking for an inherently noisy system like your brain http://www.youtube.com/watch?v=mC7Q-ix_0Po
smothers
@chris, @Kasturi: see edit!
Derrick Turk
@Derrick, thanks for clarifying. Your brief explanation of genetic programming as well as your suggestions for writing the OP program helped me think through what I want to code. I already wrote the iterator and it has written a few million strings and saved the ones that compile. It is tedious true, but the code was simple and I think your idea of adding a hidden Markov model will help make some statistical sense out of all the semi-random output. As for GP integration that will probably have to happen now too.
smothers
+1  A: 

A different project you could do is work on something that mutates from not passing a certain unit test to passing a unit test.

For example, given an implementation

def add(a,b)
  a
end

You could add a test

assert_equal 3, Foo.new.add(1,2)

and ask your program to try any combination of methods on a within add (for example, a.-(b), a.to_s, a.+(b)) until one of the mutants passes that test and existing tests.

You may want to look at heckle (or Zentest?) for examples of changing code being tested.

Andrew Grimm
@Andrew Grimm, I like the idea of using unit tests to determine fitness. But setting one myself seems too limited. Sure it will pass eventually if the program iterates every possibility and the test is passable, but that lacks pizazz. Maybe if the test was less discreet, say online - in an environment like SO - where users could view the output and the source and vote on its value. I don't know, I probably don't understand how you would go about mutating the program.
smothers
+3  A: 

Have you heard of nanopond? The concept is similar to yours. Every pixel is given a randomly generated string that is run through a compiler. Usually, this doesn't produce any valid program. However, every once in a while, a randomly generated string will somehow be formatted just perfectly to be able to reproduce itself into a neighboring pixel. Soon, it becomes a battle for which program can reproduce better than the other.

What you are talking about is a genetic algorithm, yes, but maximizing one thing and one thing alone: The ability to reproduce.

This is the essential origin to all naturally-occurring negentropic phenomenon: a randomly formed complex entity has the ability to reproduce.

Classical genetic algorithms impose artificial reproduction criteria -- the program that does a job the best is artificially chosen to reproduce.

What you are implying is a sort of computational natural selection. That is, programs will evolve based on their ability to reproduce themselves, and nothing more.

Will this create something useful? Perhaps not. Unless you, maybe, gave your programs access to the internet or some other external API that they can randomly access, and maybe spread over the internet.

Unfortunately, your described system still has somewhat artificial reproduction criteria: the ability to compile.

Because ability to compile = ability to reproduce, you have artificially set your programs to evolve towards compiling.

Compiling what? It doesn't matter, because any program that compiles is as likely to reproduce as the last. Let's say you somehow generated a program that would output the Fibonacci Sequence. Or a program that could beat a chess Grandmaster. Cool! Unfortunately, would it be reproduced? Would it be special?

Not at all; it'd be treated as "fit" for reproduction as the program print('k')

I suggest perhaps operating a framework of randomly-running strings of programs that have access to API's that can:

  • Read/write to hard drive, and all of a sudden, you have programs that can write random strings as programs, themselves.
  • Delete/modify other files on the hard drive; this allows maybe programs to compete with each other. Your API could be designed so that a file could only be deleted if the "strength" (arbitrary value...perhaps character length) of the program is stronger than the file.
  • Run other scripts on the hard drive...perhaps even ones that they write themselves
  • Access to the internet; to a web server? The ability to write/attach/send/read e-mails?

I think a program that writes programs that can reproduce themselves could produce better results than a program which writes programs that can compile.

Unless you only want the latter; then maybe you could maximize program length. But the possibility of something interesting happening? Not that much; any program that "compiles" with a certain length would be just as likely as "reproducing".

Justin L.
@Justin L., thanks for the great response. At the moment I need my program to understand the language it is writing code in, if only superficially. That is the reason for basing fitness on compilation. Once the program sufficiently deduces the grammatical structure of the programming language, hopefully before my recursive iterator algo overflows its stack, then can I implement your ideas. I like your ideas of letting it read/write to disk and access external api's, but I'm also hesitant to let a bunch of random programs have free rein, you know, in case they learn to spam.
smothers
Don't forget your option to close off a sector of your disk as a sandbox :P You don't want them to go crazy on your computer.But spamming on the internet doesn't help them reproduce, really; therefore I'm not sure it'll ever happen. And if it does, natural selection would have them die right away.
Justin L.
A: 

What would be optimal is a program that continually rewrites and improves itself so I don't have to

Do these steps:

  1. Write pseudo random number generator in assembly. (real mode)
  2. Modify program so it writes out(for example) 64k of random numbers and does a FAR JMP to the first byte.
  3. (optional) Create a driver for a watch dog timer to prevent infinite loops
  4. Load onto the bootsector of some device.
  5. Get multiple computers. Configure so that if they triple fault they will reboot and boot from the bootsector of your device
  6. Boot up the computers and wait a few decades(or centuries, whatever) for it to produce something useful
  7. Profit!
Earlz
+1  A: 

Look for Grammatical Evolution.

Tarantula
A: 

Early but very interesting works along these lines was Doug Lenat's "AM" (A Mathematician) and Eurisko (an generalization of AM).

Eurisko evolved a set of hueristics by examining how boundary conditions affected solving a problem. It did not generate junk than then try it, rather, it took a weak problem solving method and found cases where it could a much better job, and produced a patched version of the problem solver.

Ira Baxter