Feedforward Example

Started by chadjohnson · 3 comments, last by chadjohnson 18 years, 7 months ago
Just in case anybody wants an example, here's a program I made that demonstrates learning with a feed-forward artificial neural network. I made it so it learns how to approximate the Sigmoid function. It took about 5 hours on my machine (533 MHz) to get to the following point:

Network input:  -4.07636
Target output:  0.016686
Network output: 0.0167144
-------------------------------------------------------------------------------
Network input:  1.25095
Target output:  0.777465
Network output: 0.777365
-------------------------------------------------------------------------------
Network input:  -9.91089
Target output:  4.9629e-005
Network output: 5.00513e-005
-------------------------------------------------------------------------------
Network input:  0.108951
Target output:  0.527211
Network output: 0.527191
-------------------------------------------------------------------------------
Network input:  1.14475
Target output:  0.758551
Network output: 0.758465
-------------------------------------------------------------------------------
Network input:  7.13248
Target output:  0.999202
Network output: 0.999198
-------------------------------------------------------------------------------
Network input:  -8.26655
Target output:  0.000256905
Network output: 0.000258403

It's close, but not perfect (what do you think?). Maybe with more training and other adjustments it could be more accurate. If anybody has any suggestions, please let me know.

Question: does anyone know how I could make it approximate ANY mathematical function? I don't see how I could, since the values for the output neurons are run through the transfer function, which limits them to between 0 and 1.
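One idea I've been toying with for the 0-to-1 problem (just a sketch; it assumes the target function's minimum and maximum outputs, yMin and yMax, are known ahead of time) is to rescale the target values into (0, 1) before training and undo the scaling when reading the network's output:

// Hypothetical helpers, not part of the program below: min-max scaling
// so a Sigmoid output neuron can represent targets outside (0, 1)

// Map a target y in [yMin, yMax] into [0, 1] for the output neuron
double ScaleTarget(double y, double yMin, double yMax)
{
    return (y - yMin) / (yMax - yMin);
}

// Map the network's Sigmoid output back into the original range
double UnscaleOutput(double out, double yMin, double yMax)
{
    return out * (yMax - yMin) + yMin;
}

// e.g. nn->SetTargetValue(0, ScaleTarget(f(x), yMin, yMax));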

/******************************************************************************
 *
 * Copyright 2005 Chad Johnson (chad.d.johnson@gmail.com)
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 *
 *****************************************************************************
 *
 * This program demonstrates a feed-forward artificial neural network that
 * learns how to approximate a mathematical function given input values and
 * target output values.
 *
 *****************************************************************************/

/*

(1) Create network
(2) Create layers
(3) Generate a training vector
(4) Set a bias value
(5) Set input value(s) to network
(6) Set target value(s) to network
(7) Propagate
(8) Backpropagate
(9) Repeat steps 5-8 until network reaches desired state

*/

#include <iostream>
#include <cmath>
#include "neuralnetwork.h"

using namespace std;

int main(int argc, char **argv)
{
    // Create a neural network instance
    NeuralNetwork *nn = new NeuralNetwork();
    vector<double> trainingVector;
    int i = 0;
    int j = 0;

    // Generate some random training data
    for (i=0; i<200; i++)
        trainingVector.push_back(RandomDouble(-10.0, 10.0));

    // Create an input layer, one hidden layer, and an output layer
    nn->AddLayer(2);
    nn->AddLayer(30);
    nn->AddLayer(1);

    // Set a bias value
    nn->SetInputValue(0, 1);

    // Training
    while (j < 200000000)
    {
        // Set the input value for the network
        nn->SetInputValue(1, trainingVector[i]);

        // Set the target value for the final output of the network
        nn->SetTargetValue(0, 1 / (1 + exp(-1 * trainingVector[i])));

        nn->Propagate();
        nn->BackPropagate();

        // Show the final output of the network
        if (i % 30 == 0)
        {
            cout << "Network input:\t" << nn->GetInputValue(1) << endl;
            cout << "Target output:\t" << nn->GetTargetValue(0) << endl;
            cout << "Network output:\t" << nn->GetOutputValue(0) << endl;
            cout << "-------------------------------------------------------------------------------" << endl;
        }

        i++;
        j++;

        // Reset the counter to 0 if it's reached the end of the training data
        if (i >= (int)trainingVector.size())
            i = 0;
    }

    delete nn;

    return 0;
}



#ifndef NEURALNETWORK
#define NEURALNETWORK

#include <iostream>
#include <vector>
#include <cmath>
#include <ctime>
#include <cstdlib>

using namespace std;

// Class prototypes
class NeuralNetwork;
class NeuronLayer;
class Neuron;

// function prototypes
double RandomDouble(const double min, const double max);

class NeuralNetwork
{
 private:
    vector<NeuronLayer*> m_layers;
    double m_outerLearningRate;
    double m_innerLearningRate;

 public:
    NeuralNetwork();
    void AddLayer(int neurons = 0);
    void AddNeuron(const int layer);
    int GetLayerCount() const;
    void SetLearningRates(const double inner, const double outer);
    double GetInputValue(int neuron);
    void SetInputValue(int neuron, double input);
    void SetInputVector(vector<double> inputVector);
    vector<double> GetInputVector();
    double GetTargetValue(int neuron);
    vector<double> GetTargetVector();
    void SetTargetValue(int neuron, double target);
    void SetTargetVector(vector<double> targetVector);
    double GetOutputValue(int neuron);
    vector<double> GetOutputVector();
    double GetNeuronOutput(const int layer, const int neuron) const;
    void Propagate();
    void BackPropagate();
    double TransferFunction(const double value) const;
};

class NeuronLayer
{
 private:
    vector<Neuron*> m_neurons;

 public:
    NeuronLayer();
    void AddNeuron();
    Neuron *GetNeuron(const int neuron);
    int GetNeuronCount();
};

// Holds weight values between neuron layers
class Neuron
{
 private:
    double m_value;
    double m_deltaValue;
    double m_targetValue;
    vector<double> m_connections;

 public:
    Neuron();
    double GetValue();
    void SetValue(const double value);
    double GetDeltaValue();
    void SetDeltaValue(const double value);
    double GetTargetValue();
    void SetTargetValue(double targetValue);
    void AddConnection(const double weight);
    double GetConnectionWeight(const int neuron);
    void SetConnectionWeight(const int neuron, const double weight);
};

/*****************************************************************************/
// Generic functions
/*****************************************************************************/
// Generates a random number given a minimum and maximum
double RandomDouble(const double min, const double max)
{
    static int init = 0;

    // Only seed the generator if it has not already been seeded
    if (init == 0)
    {
        srand((unsigned int)time(NULL));
        init = 1;
    }

    return (max - min) * (double)rand() / (double)RAND_MAX + min;
}

/*****************************************************************************/
// NeuralNetwork class functions
/*****************************************************************************/
// Constructor
NeuralNetwork::NeuralNetwork()
{
    // Give the network a default learning rate
    SetLearningRates(0.2, 0.15);
}

// Adds a layer to the network by adding another element to the layer vector
void NeuralNetwork::AddLayer(int neurons)
{
    int i = 0;

    m_layers.push_back(new NeuronLayer());

    // Add the number of neurons specified to this layer
    for (i=0; i<neurons; i++)
        AddNeuron(GetLayerCount()-1);
}

// Adds a neuron to a given layer
void NeuralNetwork::AddNeuron(const int layer)
{
    int i = 0;

    // Add a neuron to this layer
    m_layers[layer]->AddNeuron();

    // Add a connection from each neuron in the previous layer to this
    // neuron if this is not the first layer
    if (layer > 0)
    {
        for (i=0; i<m_layers[layer-1]->GetNeuronCount(); i++)
            m_layers[layer-1]->GetNeuron(i)->AddConnection(RandomDouble(-0.01, 0.01));
    }
}

int NeuralNetwork::GetLayerCount() const
{
    return m_layers.size();
}

// Sets the learning rates for the neural network
void NeuralNetwork::SetLearningRates(const double inner, const double outer)
{
    m_innerLearningRate = inner;
    m_outerLearningRate = outer;
}

// Returns the input value for a given input neuron
double NeuralNetwork::GetInputValue(int neuron)
{
    return m_layers[0]->GetNeuron(neuron)->GetValue();
}

// Sets the input value for a given input neuron
void NeuralNetwork::SetInputValue(int neuron, double input)
{
    m_layers[0]->GetNeuron(neuron)->SetValue(input);
}

// Sets the values for the input neurons
void NeuralNetwork::SetInputVector(vector<double> inputVector)
{
    int i = 0;

    for (i=0; i<(int)inputVector.size(); i++)
        m_layers[0]->GetNeuron(i)->SetValue(inputVector[i]);
}

// Returns the input vector to the network
vector<double> NeuralNetwork::GetInputVector()
{
    vector<double> temp;
    int i = 0;

    for (i=0; i<m_layers[0]->GetNeuronCount(); i++)
        temp.push_back(m_layers[0]->GetNeuron(i)->GetValue());

    return temp;
}

// Returns the target value for a given output neuron
double NeuralNetwork::GetTargetValue(int neuron)
{
    return m_layers[GetLayerCount()-1]->GetNeuron(neuron)->GetTargetValue();
}

// Sets the target value for a given output neuron
void NeuralNetwork::SetTargetValue(int neuron, double target)
{
    m_layers[GetLayerCount()-1]->GetNeuron(neuron)->SetTargetValue(target);
}

// Sets the target vector for the neural network. Used in backpropagation
void NeuralNetwork::SetTargetVector(vector<double> targetVector)
{
    int i = 0;

    for (i=0; i<(int)targetVector.size(); i++)
        m_layers[GetLayerCount()-1]->GetNeuron(i)->SetTargetValue(targetVector[i]);
}

// Returns the target output value for the network
vector<double> NeuralNetwork::GetTargetVector()
{
    vector<double> temp;
    int i = 0;

    for (i=0; i<m_layers[GetLayerCount()-1]->GetNeuronCount(); i++)
        temp.push_back(m_layers[GetLayerCount()-1]->GetNeuron(i)->GetTargetValue());

    return temp;
}

// Returns the output value for a given output neuron
double NeuralNetwork::GetOutputValue(int neuron)
{
    return m_layers[GetLayerCount()-1]->GetNeuron(neuron)->GetValue();
}

// Returns a vector containing the values of the neurons in the output layer
vector<double> NeuralNetwork::GetOutputVector()
{
    vector<double> temp;
    int i = 0;

    for (i=0; i<m_layers[GetLayerCount()-1]->GetNeuronCount(); i++)
        temp.push_back(m_layers[GetLayerCount()-1]->GetNeuron(i)->GetValue());

    return temp;
}

// Returns the output value for a given neuron in a given layer
double NeuralNetwork::GetNeuronOutput(const int layer, const int neuron) const
{
    return m_layers[layer]->GetNeuron(neuron)->GetValue();
}

// Feeds the input values through the network and calculates the output value
// for the network
void NeuralNetwork::Propagate()
{
    int i = 0;
    int j = 0;
    int k = 0;
    double weight = 0;
    double input = 0;
    double newValue = 0;

    // Loop through the layers starting at the second layer (first hidden layer)
    for (i=1; i<GetLayerCount(); i++)
    {
        // Loop through the neurons in the current layer
        for (j=0; j<m_layers[i]->GetNeuronCount(); j++)
        {
            newValue = 0;

            // Loop through the neurons from the previous layer (which connect
            // to the neurons in the current layer)
            for (k=0; k<m_layers[i-1]->GetNeuronCount(); k++)
            {
                // get the connection weight from the current neuron in the
                // previous layer to the current neuron in the current layer
                weight = m_layers[i-1]->GetNeuron(k)->GetConnectionWeight(j);

                // get the value for the current neuron in the previous layer
                input = m_layers[i-1]->GetNeuron(k)->GetValue();

                // add the product of the weight and the input to the summation
                newValue += weight * input;
            }

            // Run the new value through the transfer function
            newValue = TransferFunction(newValue);

            // set the value for the current neuron to the sum of the weights
            // and inputs coming into that neuron
            m_layers[i]->GetNeuron(j)->SetValue(newValue);
        }
    }
}

// Adjusts the weights for the connections to improve the network's accuracy
void NeuralNetwork::BackPropagate()
{
    int i = 0;
    int j = 0;
    int k = 0;
    int l = 0;
    double delta = 0;
    double deltaTemp = 0;
    double previousNeuronOutput = 0;
    double currentNeuronOutput = 0;
    double currentConnectionWeight = 0;
    double changeInConnectionWeight = 0;

    // Loop through the layers starting at the output layer and ending at the
    // first hidden layer
    for (i=GetLayerCount()-1; i>=1; i--)
    {
        // Loop through the neurons in the current layer
        for (j=0; j<m_layers[i]->GetNeuronCount(); j++)
        {
            currentNeuronOutput = m_layers[i]->GetNeuron(j)->GetValue();

            // Test whether the loop is at the output layer. If it's not at
            // the output layer it's at a hidden layer
            if (i == GetLayerCount()-1)
            {
                // The output delta is the derivative of the Sigmoid function,
                // written in terms of the output as o * (1 - o), times the error
                delta = currentNeuronOutput * (1 - currentNeuronOutput) * (m_layers[i]->GetNeuron(j)->GetTargetValue() - currentNeuronOutput);
            }
            else
            {
                deltaTemp = 0;

                // Sum the delta values of all neurons in the next layer,
                // each weighted by the connection from this neuron to it
                for (l=0; l<m_layers[i+1]->GetNeuronCount(); l++)
                    deltaTemp += m_layers[i]->GetNeuron(j)->GetConnectionWeight(l) * m_layers[i+1]->GetNeuron(l)->GetDeltaValue();

                delta = currentNeuronOutput * (1 - currentNeuronOutput) * deltaTemp;
            }

            // Store the delta value so the layer below can use it
            m_layers[i]->GetNeuron(j)->SetDeltaValue(delta);

            // Loop through the neurons from the previous layer (which connect
            // to the neurons in the current layer)
            for (k=0; k<m_layers[i-1]->GetNeuronCount(); k++)
            {
                previousNeuronOutput = m_layers[i-1]->GetNeuron(k)->GetValue();

                // Calculate the change in weight; the output and hidden
                // connection layers use separate learning rates
                if (i == GetLayerCount()-1)
                    changeInConnectionWeight = m_outerLearningRate * delta * previousNeuronOutput;
                else
                    changeInConnectionWeight = m_innerLearningRate * delta * previousNeuronOutput;

                // Get the weight of the connection from the current neuron in
                // the previous layer to the current neuron in the current layer
                currentConnectionWeight = m_layers[i-1]->GetNeuron(k)->GetConnectionWeight(j);

                // Add the change in weight to the current neuron's weight
                m_layers[i-1]->GetNeuron(k)->SetConnectionWeight(j, currentConnectionWeight + changeInConnectionWeight);
            }
        }
    }
}

// Transfer (activation) function using the Sigmoid function
double NeuralNetwork::TransferFunction(const double value) const
{
    return 1 / (1 + exp(-1 * value));
}
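(A quick note on the math here: the reason BackPropagate() can use currentNeuronOutput * (1 - currentNeuronOutput) as the derivative term is that the Sigmoid function s(x) = 1 / (1 + exp(-x)) has the convenient property s'(x) = s(x) * (1 - s(x)), so the derivative comes for free from the value already stored in each neuron, with no extra call to exp().)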

/*****************************************************************************/
// NeuronLayer class functions
/*****************************************************************************/
// Constructor
NeuronLayer::NeuronLayer()
{
}

// Adds a neuron to the neuron vector
void NeuronLayer::AddNeuron()
{
    m_neurons.push_back(new Neuron());
}

// Returns a pointer to a given neuron for this layer
Neuron *NeuronLayer::GetNeuron(const int neuron)
{
    return m_neurons[neuron];
}

int NeuronLayer::GetNeuronCount()
{
    return m_neurons.size();
}

/*****************************************************************************/
// Neuron class functions
/*****************************************************************************/
// Constructor
Neuron::Neuron()
{
    // Give the neuron an initial value
    m_value = 0;

    // set the delta value (used in backpropagation) initially to 0
    m_deltaValue = 0;
}

// Returns the output value for the neuron
double Neuron::GetValue()
{
    return m_value;
}

// Sets the output value for the neuron
void Neuron::SetValue(const double value)
{
    m_value = value;
}

// Returns the delta value for the neuron
double Neuron::GetDeltaValue()
{
    return m_deltaValue;
}

// Sets the delta value for the neuron
void Neuron::SetDeltaValue(const double value)
{
    m_deltaValue = value;
}

// Gets the target value for the neuron. Should only be used if the neuron is
// an output neuron
double Neuron::GetTargetValue()
{
    return m_targetValue;
}

// Sets the target value for the neuron. Should only be used if the neuron is
// an output neuron
void Neuron::SetTargetValue(double targetValue)
{
    m_targetValue = targetValue;
}

// Adds a new connection to the connection vector
void Neuron::AddConnection(const double weight)
{
    m_connections.push_back(weight);
}

// Returns the connection weight to another neuron
double Neuron::GetConnectionWeight(const int neuron)
{
    return m_connections[neuron];
}

// Sets the connection weight to another neuron
void Neuron::SetConnectionWeight(const int neuron, const double weight)
{
    m_connections[neuron] = weight;
}

#endif

I think you meant "if(j % 30 == 0)" or better yet "if((j & 0xFFFF) == 0xFFFF)" in your main loop. Use the latter for a huge speed increase.

Either way, this isn't approximating the function as quickly as I'd think it should :P
"I want to make a simple MMORPG first" - Fenryl

Quote:
I think you meant "if(j % 30 == 0)" or better yet "if((j & 0xFFFF) == 0xFFFF)" in your main loop. Use the latter for a huge speed increase.


Can you please explain how this is faster? This is an area of C/C++ that I am not comfortable with, and I would like to understand a bit better (regarding the 0xFFFF and bitwise &).
Quote:Original post by Anonymous Poster

Can you please explain how this is faster? This is an area of C/C++ that I am not comfortable with, and I would like to understand a bit better (regarding the 0xFFFF and bitwise &).


Well, considering the rest of the loop takes quite a bit of time, (j & 0xFFFF) == 0xFFFF will not execute noticeably faster than an equivalent modulo test like j % 0x10000 == 0xFFFF, so it's really a pointless speed increase :P But basically, it works by testing whether the last 16 bits of j are all 1 (F = 1111, so 4 F's = 4 sets of 1111 = 16 bits): if they are, the test is true, otherwise false. The modulo way works by testing whether j divides evenly by n.
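If it helps, here's a tiny standalone program (my own illustration, not taken from the code above) showing that the two tests fire on exactly the same iterations:

#include <iostream>

int main()
{
    for (int j = 0; j < 200000; j++)
    {
        // True once every 65536 iterations: the low 16 bits are all 1s.
        // Costs a single AND plus a compare; no division involved.
        bool maskHit = (j & 0xFFFF) == 0xFFFF;

        // The equivalent modulo test, which uses an integer division
        bool modHit = (j % 0x10000) == 0xFFFF;

        if (maskHit != modHit)
            std::cout << "Mismatch at " << j << std::endl; // never prints
    }

    std::cout << "The two tests agree." << std::endl;

    return 0;
}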

The real speed increase comes from not outputting the results as often, because writing to cout is actually quite slow. It shouldn't output much more than, say, 10 times per second in this program, and once every few seconds is sufficient, so a lot of time can be saved by not spamming the console.
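One more general C++ point while I'm at it (not specific to this program): endl forces a buffer flush every time, so when you do print inside a tight loop, a plain "\n" is noticeably cheaper:

// endl flushes the output buffer on every line - slow in a hot loop
cout << "Network output:\t" << nn->GetOutputValue(0) << endl;

// '\n' ends the line without forcing a flush
cout << "Network output:\t" << nn->GetOutputValue(0) << "\n";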

As for the NN, could someone explain the SetLearningRates() parameters to me and how they affect the NN? Do lower values mean it will reach the goal less quickly but with more stability? Thanks.

edit: also, you could do well to put some output formatting in there. Pop a #include <iomanip> in there after including iostream and use the following for output:

cout.setf(ios::right);
cout.precision(15);
cout << "Network input:\t" << setw(20) << fixed << nn->GetInputValue(1) << endl;
cout << "Target output:\t" << setw(20) << fixed << nn->GetTargetValue(0) << endl;
cout << "Network output:\t" << setw(20) << fixed << nn->GetOutputValue(0) << endl;
cout << "-------------------------------------------------------------------------------" << endl;
"I want to make a simple MMORPG first" - Fenryl
Quote:
I think you meant "if(j % 30 == 0)" or better yet "if((j & 0xFFFF) == 0xFFFF)" in your main loop. Use the latter for a huge speed increase.


I'll look into that.

When you have a low learning rate, I think it makes the network's final output more accurate.

Yeah, it is a bit slow, but I think I can make it faster. I'm thinking that if it starts out with a really high learning rate and then steadily decays to a lower one, it will learn faster and still be very accurate. I just have to find or come up with the math to do this - any ideas there?
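The simplest scheme I can think of (just a sketch - the function name and constants here are made up, not part of the program above) is an exponential decay from a high starting rate down toward a floor:

#include <cmath>

// Learning rate that starts at initialRate and decays toward finalRate
// as the iteration count grows; decayConstant controls how fast it falls
double DecayedLearningRate(double initialRate, double finalRate,
                           long iteration, double decayConstant)
{
    return finalRate + (initialRate - finalRate) * exp(-decayConstant * (double)iteration);
}

// Hypothetical use inside the training loop, once per iteration j:
//   double rate = DecayedLearningRate(0.5, 0.05, j, 1e-6);
//   nn->SetLearningRates(rate, rate);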
