String representation of ANN

Started by
7 comments, last by willh 12 years, 10 months ago
Hi folks,

Am interested in applying genetic algorithms to artificial neural networks. In the past I've done this by hand coding various mutation and crossover operators (i.e. AddRandomNode(), AddRandomConnection()) quite successfully, but the framework is hardly portable.


What I would like to do is use the 'standard' GA method of bitstring encoding and apply it to an Artificial Neural network. Rather than reinvent the wheel I was wondering if any of the GameDev folks are familiar with a decent string representation of the ANN.


My only real constraints are as follows:

- The number of inputs must be predefined

- The number of outputs must be predefined.

- The network is feed forward.


In terms of architecture, the only constraint is that it's literally feed-forward.

i.e. Node 7 can never connect to node 6, but can connect to node 9 and 10. Node 9 can never connect to 7 but can connect to node 10.



Comments? Suggestions? Is this even worthwhile or am I better off sticking with hand-coded operators?
Advertisement
Are you saying that you want to represent a neural network as a string so that you can use a genetic algorithm to derive a good network to represent a function?

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.72.8699&rep=rep1&type=pdf

That paper describes what you're investigating, I believe. It contains information about a bit representation of the network.

Are you saying that you want to represent a neural network as a string so that you can use a genetic algorithm to derive a good network to represent a function?

http://citeseerx.ist...p=rep1&type=pdf

That paper describes what you're investigating, I believe. It contains information about a bit representation of the network.


Thanks Brain in a Vat. :) Yes, that's exactly what I want to do. I'd like to use bit strings (or something similar) to encode the network architecture and weights.

That paper uses a fixed topology network and the bit string only represents connection weights. It's neccesary in their case because they use backprop postprocessing and want to maintain explicit layers. It's an option. :)
I think you'll have trouble trying to encode non-fixed network topologies. I think the problem is that the information you're trying to encode is somewhat complex (given a non-fixed network topology), and the length of any given bit-string representation will necessarily depend on the size of the network. If you add nodes, that has to increase the length of the string. The usual genetic operators don't change the length of the string representation. You could devise some genetic operators that could add nodes, but you'd have to be careful that you don't produce malformed networks. And then how do you perform a crossover between two NN string representations of different lengths?

It's an interesting problem, I'd suggest you do some searches for published papers and get yourself up to speed on what researchers have tried in this area. I don't think this is the best place for help with this problem. I'm certainly not the best person, at least, sorry :)
Have you investigated NEAT? http://www.cs.ucf.edu/~kstanley/neat.html

Ken Stanley's work is very good and there is a ton of information available through his web page. He also has C++, Python, and Java libraries available for download, in case you simply want to experiment with applications.

-Kirk

Have you investigated NEAT? http://www.cs.ucf.ed...anley/neat.html

Ken Stanley's work is very good and there is a ton of information available through his web page. He also has C++, Python, and Java libraries available for download, in case you simply want to experiment with applications.

-Kirk


Yes, I've seen NEAT. He's implemented a lot of very interesting ideas. His encoding scheme is pretty much identical (minus the innovation part) to what I had already done; he's hand coded mutation operators for insert, remove, change, etc...


I found a paper from the mid 90's that went over a bunch of different encoding schemes. The direction I'm on now is relatively simple but flexible enough for a non-constrained feedforward network.

Basically it's a binary string, with two bit op-codes, followed by a 30 bit parameter. There are only two operands (NewNode and ConnectNode). NewNode ignores the parameter, while ConnectNode user the first 15 bits of the parameter as a connection offset (it's always positive) and the last 15 bits as the connection weight (+/- 3.something). Any connection offset that's > the length of the network gets mod'd so it hits an output node.

Anyway, if anyone has any experience with this please feel free to chime in. Thanks for the suggestions guys. :)
Thought I'd share my results.

Below is a screenshot of a test app I wrote. Basically it's a neural network that was evolved to approximate the function:
[font="Consolas"][font="Consolas"]y = Sin( (x + 8) / 13)
[/font][/font]
The graph has two curves on it; the blue is the sin function, and the red is the ANNs approximation. Input is X, output is Y.

This isn't technically a feedforward network anymore, as a bug allowed a node to feed back in to itself-- evolution didn't care about my bug and decided to exploit it. :)

These results were achieved after about 150 generations with a population size of 100. Run time was a couple of seconds. The population was seeded with random bitstrings of variable length. I included crossover (actually 4 different types of crossover; single point, two point, cut-and-splice, and uniform) and mutation (flip bit, add bit, remove bit). All crossover and mutation was done on the binary encoding directly. The code never maintains a list of decoded neural nets.


The textbox showing 0's and 1's is the binary encoding for the best network. The textbox below it shows what the binary decodes to in terms of syntax. If anyone cares, here is what the text means:

ID:n = new node, B= node bias, L= lamda for nodes activation function (steepness of the sigmoid).

C:n = connect current node to the node N steps forward, W= weight of the connection.


I'm a little surprised that this worked so well.


nn_sin.png
I just wanted to add that what you're creating is essentially grammatical evolution with a very specific domain of available functions.

I just wanted to add that what you're creating is essentially grammatical evolution with a very specific domain of available functions.



Thanks npatrick04.

After playing around with it a bit, it seems to be VERY poor once the problem gets even a tiny bit more complicated. For a two dimensional function z = f(x,y) it wants to latch on to optimizing just one of the dimensions. I've tried running it over and over with different starting parameters and it runs in to the same problem each time. After a few hundred thousand generations (!!!!) it eventually starts incorporating the second input parameter in to the solution.. I even tried using a monte-carlo method for selecting which members of the population compete in each round, hoping to increase solution diversity, but that didn't seem to help either.


I guess this is my long-winded way of saying that it looks like using bitstrings isn't nearly as effective as directly manipulating the target solutions using some domain specific knowledge. Heck, I was able to 'evolve' a decent face detector in less time than it took evolve a bitstring solution to a simple 2D function.

This topic is closed to new replies.

Advertisement