In some incarnations of the game, the contestants are "snakes" rather than worms. However, this difference is not significant to the strategy.
Chapter 3, Results, covers the results of the problem solving attempt. Fitness graphs are not applicable in this case, since the selection does not involve a fitness value.
Chapter 4, Discussion, contains thoughts on difficulties encountered and quality of the results.
Chapter 5, Future Work, handles possible future enhancements and possibilities of the system.
Chapter 6, Summary, contains a summary of this report.
Chapter 7, References, gives a listing of resources used during this work.
In the pictures of the game, the worms are blue and green (or dark and light in grayscale environments). The heads are marked by a darker shade of the worms colour. Worms always move in the direction of their head.
We use homologous crossover (90%) and ordinary 2-point string crossover (10%). The mutation rate is 20%. The population size is 1000 individuals.
The input to the individual includes the coordinates of the opponent's head given relative to its own head. In some cases, the coordinates of the opponent's first curve was given as input, in order to give the individual a sense of what direction the opponent was moving in.
Fig 2.1: Relative coordinates moving upwards.
Fig 2.2: Relative coordinates moving right.
Two constants was used (typically 0 and 1). The number of internal registers was 3.
The output is one value which is interpreted in the following way:
|= 0||Move forward|
|< 0||Turn left|
|> 0||Turn right|
The language consists of arithmetic operations, involving inputs, outputs and registers, an if statement, comparing two registers, and the
state function. The following is a summary of the available statements:
[register] = [input or register];
[register] [operator]= [input or register];(where [operator] is +, -, * or /)
[output] = [register];
if ([register1] > [register2])
[register1] = state([register1], [register2]);
state function returns the state (occupied or vacant) of a given square. The square is defined by relative coordinates provided by two register values. The return value is stored in the first of the two registers.
In later experiments, we introduced some static "bricks" to give the worms a more dynamic training environment.
Fig 2.3: Brick pattern. Note, that it is designed to trap worms that can only turn in one direction.
We also helped the worms by not allowing them to run into things if there was a way to avoid it. The point is that it is not necessary for the worms to learn how to avoid obvious mistakes, which allows them to focus more on creating traps and other strategies. We call this addition the post-execution correction procedure.
While the system is running, competing worms are displayed. These worms are controlled by random individuals from the population. The result of the games does not, however, influence the evolutionary process, which is running at full speed in another thread. This visuals are there simply to make it easier to see if the system is making any progress.
In later experiments, we have seen some interesting results. Often there occurs a kind of "arms race" where a new way of trapping the opponent is met by counter traps, which are again countered. One example is the "inner track wins" - "U trap" - "U trap evasion" development. (Fig 3.1-3.5)
Fig 3.1: Inner track will win?
Fig 3.2: Outer track is trapped.
Fig 3.3: Almost same situation.
Fig 3.4: Inner track is trapped.
Fig 3.5: Inner track saw the trap. Hope that outer track knows how to turn left!
We have also seen even more complex behaviour, like being happy with slithering into a two squares wide corridor formed by the opponent and a wall, but turning around to escape when the opponent moves one step closer to the wall in order to create a trap. It seems this kind of behaviour requires a deeper "understanding" of the game and also quite a lot of computation and it is impressive that the individual has managed to fit it into its 256 quite simple instructions. (Fig 3.6 and 3.7)
Fig 3.6: The blue worm feels safe so far, but the green one has a plan...
Fig 3.7: The blue worm sees the trap and turns around.
It becomes even more impressive when we see that the same individual has the ability to move back and forth in order to maximize its chances of escape when trapped. (Fig 3.8 and 3.9)
Fig 3.8: Oh no, is the blue worm trapped now?
Fig 3.9: No, it can see this trap too.
This applet evolves from the beginning every time the applet is started (i.e. when the web page is displayed. Also, there is (probably) no way to get printouts from the system when it is run in an applet.
The individual that is the opponent in this instance is just an example of a fairly successful individual. We cannot guarantee that this individual is the best one we have when it comes to competing against human players. It is not able to perform the tricks mentioned above in all instances, but it knows them and displays them sometimes.
A small population (like our 1000 individuals) can produce results faster, since the "generations" progress faster than in a larger population. Hence, we feel that a smaller population is preferable as long as problems with "inbreeding" and lack of diversity can be avoided.
These decisions seems to be justified judging from the results we have achieved.
We tried several ways of getting the individuals to go from the first simple obstacle avoidance strategies to more complex behaviour. In the end it was not punishment that succeeded, but rather a bit of help with the most basic tasks enabled the individuals to concentrate on the more advanced ones.
When a human player is playing against a genetivally generated individual, it is obvious that most of the computer's shortcommings are due to a lack of overview of the board. It is diffucult for the individual computer player to see which areas are more cramped and therefore may be less advantageous to be in. A human player can use her superior pattern recognition abilities to make such decisions very efficiently.
statefunction, be given access to a function finding out how many squares are freely connected to a given square. This would enable the worm to be better at seeing if it may be slithering into a trap.
We have seen quite complex behaviour and it is not an elementary task beating the best individuals at the game.
The way in which this has been achieved is freeing the genetic individuals from the most obvious and simple tasks. The conclusion is that, to teach individuals more complex behaviour, you should not force them to also learn how to deal with the simple things (that you can simply implement yourself).
Nokia Games, http://www.nokia.com/games/snake1.html