views:

253

answers:

3

First of all: This is not a question about how to make a program play Five in a Row. Been there, done that.

Introductory explanation

I have made a five-in-a-row-game as a framework to experiment with genetically improving AI (ouch, that sounds awfully pretentious). As with most turn-based games the best move is decided by assigning a score to every possible move, and then playing the move with the highest score. The function for assigning a score to a move (a square) goes something like this:

  1. If the square already has a token, the score is 0 since it would be illegal to place a new token in the square.

  2. Each square can be a part of up to 20 different winning rows (5 horizontal, 5 vertical, 10 diagonal). The score of the square is the sum of the score of each of these rows.

  3. The score of a row depends on the number of friendly and enemy tokens already in the row. Examples:

    • A row with four friendly tokens should have infinite score, because if you place a token there you win the game.
    • The score for a row with four enemy tokens should be very high, since if you don't put a token there, the opponent will win on his next turn.
    • A row with both friendly and enemy tokens will score 0, since this row can never be part of a winning row.

Given this algorithm, I have declared a type called TBrain:

type
  TBrain = array[cFriendly..cEnemy , 0..4] of integer;

The values in the array indicates the score of a row with either N friendly tokens and 0 enemy tokens, or 0 friendly tokens and N enemy tokens. If there are 5 tokens in a row there's no score since the row is full.

It's actually quite easy to decide which values should be in the array. Brain[0,4] (four friendly tokens) should be "infinite", let's call that 1.000.000. vBrain[1,4] should be very high, but not so high that the brain would prefer blocking several enemy wins rather than wining itself

Concider the following (improbable) board:

  0123456789
 +----------
0|1...1...12
1|.1..1..1.2
2|..1.1.1..2
3|...111...2
4|1111.1111.
5|...111....
6|..1.1.1...
7|.1..1..1..
8|1...1...1.

Player 2 should place his token in (9,4), winning the game, not in (4,4) even though he would then block 8 potential winning rows for player 1. Ergo, vBrain[1,4] should be (vBrain[0,4]/8)-1. Working like this we can find optimal values for the "brain", but again, this is not what I'm interested in. I want an algorithm to find the best values.

I have implemented this framework so that it's totally deterministic. There's no random values added to the scores, and if several squares have the same score the top-left will be chosen.

Actual problem

That's it for the introduction, now to the interesting part (for me, at least)

I have two "brains", vBrain1 and vBrain2. How should I iteratively make these better? I Imagine something like this:

  1. Initialize vBrain1 and vBrain2 with random values.
  2. Simulate a game between them.
  3. Assign the values from the winner to the loser, then randomly change one of them slightly.

This doesn't seem work. The brains don't get any smarter. Why?

Should the score-method add some small random values to the result, so that two games between the same two brains would be different? How much should the values change for each iteration? How should the "brains" be initialized? With constant values? With random values?

Also, does this have anything to do with AI or genetic algorithms at all?

PS: The question has nothing to do with Five in a Row. That's just something I chose because I can declare a very simple "Brain" to experiment on.

+3  A: 

Generally speaking, yes you can make a brain smarter by using genetic algorithms techniques.

Randomness, or mutation, plays a significant part on genetic programming.

I like this tutorial, Genetic Algorithms: Cool Name & Damn Simple.
(It uses Python for the examples but it's not difficult to understand them)

Nick D
+5  A: 

If you want to approach this problem like a genetic algorithm, you will need an entire population of "brains". Then evaluate them against each other, either every combination or use a tournament style. Then select the top X% of the population and use those as the parents of the next generation, where offspring are created via mutation (which you have) or genetic crossover (e.g., swap rows or columns between two "brains").

Also, if you do not see any evolutionary progress, you may need more than just win/loss, but come up with some kind of point system so that you can rank the entire population more effectively, which makes selection easier.

miko
+1  A: 

Take a look at Neuro Evolution of Augmenting Tologies (NEAT). A fancy acronymn which basically means the evolution of neural nets - both their structure (topology) and connection weights. I wrote a .Net implementation called SharpNEAT that you may wish to look at. SharpNEAT V1 also has a Tic-Tac-Toe experiment.

http://sharpneat.sourceforge.net/

locster
I was going to post this, but you seem to have beaten me to it. Ken Stanley's work here is great. See also: http://www.cs.ucf.edu/~kstanley/ and http://nerogame.org/
Jeremy Powell