chadjohnson

Neural network not learning


Recommended Posts

Hi, I've recently been studying feed-forward neural networks, and I made a program that learns to approximate a given mathematical function. Right now it only takes one input and produces one output (it will do more later on). I set the network up with three layers: the input layer has 1 neuron, the hidden layer has 500 neurons, and the output layer has 1 neuron. I'm trying to get the network to learn to approximate the sigmoid function.

I've had the program running for about a day now (piping to an output text file), giving it random input values and computing the matching target output values by evaluating the function directly. It is usually able to approximate the function after two or three runs and backpropagations, but it never nails it on the first try. So in short, my network is not really "learning," and I don't understand why. However, it's never off by more than 1, and the output is usually about .998 at most (is that "good enough"?). When it moves to a new input value and starts another training session, the first output is always very close to, if not the same as, the last value output in the previous training session.

Can someone give me some insight? I might have an error somewhere, but I don't think so. Here's my code. I would attach it, but there's no option to do so when posting.
// neuralnetwork.cpp

#include <iostream>
#include <cmath>
#include "neuralnetwork.h"

using namespace std;

int main(int argc, char **argv)
{
    // Create a neural network instance
    NeuralNetwork *nn = new NeuralNetwork();

    nn->AddLayer(1);
    nn->AddLayer(500);
    nn->AddLayer(1);

    // Training. Go through the training data array
    while (true)
    {
        // Set the input value for the network
        nn->SetInputValue(RandomDouble(-10.0, 10.0));

        // Set the target value for the final output of the network
        nn->SetTargetValue(1 / (1 + exp(-1 * nn->GetInputValue())));

        cout << "Network input:\t\t" << nn->GetInputValue() << endl;
        cout << "Target output:\t\t" << nn->GetTargetValue() << endl;
        cout << endl;

        // Train the network. This makes it "learn" how to approximate the function
        nn->Train();

        // Show the final output of the network
        cout << endl;
        cout << "Network input:\t\t" << nn->GetInputValue() << endl;
        cout << "Network output:\t\t" << nn->GetValue() << endl;
        cout << "Target output:\t\t" << nn->GetTargetValue() << endl;
        cout << "Training sessions:\t" << nn->GetTrainingCount() << endl;
        cout << "-------------------------------------------------------------------------------" << endl;
    }

    return 0;
}


// neuralnetwork.h

#ifndef NEURALNETWORK
#define NEURALNETWORK

#include <iostream>
#include <vector>
#include <cmath>
#include <ctime>
#include <stdlib.h>

using namespace std;

// Constants
const int MAX_TRAINING = 500000;

// Class prototypes
class NeuralNetwork;
class NeuronLayer;
class Neuron;

// function prototypes
double RandomDouble(double min, double max);

class NeuralNetwork
{
 private:
    vector<NeuronLayer*> m_layers;
    double m_outerLearningRate;
    double m_innerLearningRate;
    double m_input;
    double m_target;
    double m_value;
    int m_trainingCount;

 public:
    NeuralNetwork();
    void AddLayer(int neurons);
    void AddNeuron(int layer);
    int GetLayerCount();
    void SetLearningRates(double inner, double outer);
    void SetInputValue(double input);
    double GetInputValue();
    void SetTargetValue(double target);
    double GetTargetValue();
    double GetValue();
    void SetValue(double value);
    double GetNeuronOutput(int layer, int neuron);
    void Run();
    void BackPropagate();
    void Train();
    double TransferFunction(double value);
    int GetTrainingCount();
    void SetTrainingCount(int count);
};

class NeuronLayer
{
 private:
    vector<Neuron*> m_neurons;

 public:
    NeuronLayer();
    void AddNeuron();
    Neuron *GetNeuron(int neuron);
    int GetNeuronCount();
};

// Holds weight values between neuron layers
class Neuron
{
 private:
    double m_value;
    double m_deltaValue;
    vector<double> m_connections;

 public:
    Neuron();
    double GetValue();
    void SetValue(double value);
    double GetDeltaValue();
    void SetDeltaValue(double value);
    void AddConnection(double weight);
    double GetConnectionWeight(int neuron);
    void SetConnectionWeight(int neuron, double weight);
};

/*****************************************************************************/
// Generic functions
/*****************************************************************************/
// Generates a random number given a minimum and maximum
double RandomDouble(double min, double max)
{
    static int init = 0;

    // Only seed the generator if it has not already been seeded
    if (init == 0)
    {
        srand((unsigned int)time(NULL));
        init = 1;
    }

    return (max - min) * (double)rand() / (double)RAND_MAX + min;
}

/*****************************************************************************/
// NeuralNetwork class functions
/*****************************************************************************/
// Constructor
NeuralNetwork::NeuralNetwork()
{
    // Give the network a default learning rate
    SetLearningRates(0.2, 0.15);

    // Give the network an initial target value
    SetTargetValue(0);

    // Give the network an initial value
    SetValue(0);
}

// Adds a layer to the network by adding another element to the layer vector
void NeuralNetwork::AddLayer(int neurons = 0)
{
    int i = 0;

    m_layers.push_back(new NeuronLayer());

    // Add the number of neurons requested to this layer
    for (i=0; i<neurons; i++)
        AddNeuron(GetLayerCount()-1);
}

// Adds a neuron to a given layer
void NeuralNetwork::AddNeuron(int layer)
{
    int i = 0;

    // Add a neuron to this layer
    m_layers[layer]->AddNeuron();

    // Add connections from all neurons in the previous layer to this
    // neuron if this is not the first layer
    if (layer > 0)
    {
        for (i=0; i<m_layers[layer-1]->GetNeuronCount(); i++)
            m_layers[layer-1]->GetNeuron(i)->AddConnection(RandomDouble(-0.01, 0.01));
    }
}

int NeuralNetwork::GetLayerCount()
{
    return m_layers.size();
}

// Sets the learning rate for the neural network
void NeuralNetwork::SetLearningRates(double inner, double outer)
{
    m_innerLearningRate = inner;
    m_outerLearningRate = outer;
}

// Sets the input value for the network
void NeuralNetwork::SetInputValue(double input)
{
    m_input = input;
    m_layers[0]->GetNeuron(0)->SetValue(input);
}

// Returns the input value to the network
double NeuralNetwork::GetInputValue()
{
    return m_input;
}

// Sets the target (desired) value for the neural network. Used in error
// calculation
void NeuralNetwork::SetTargetValue(double target)
{
    m_target = target;
}

// Returns the target output value for the network
double NeuralNetwork::GetTargetValue()
{
    return m_target;
}

// Returns the calculated output value for the network
double NeuralNetwork::GetValue()
{
    return m_value;
}

// Sets the output value for the network
void NeuralNetwork::SetValue(double value)
{
    m_value = value;
}

// Returns the summation of the products of the input value and the weights for
// a given neuron
double NeuralNetwork::GetNeuronOutput(int layer, int neuron)
{
    return m_layers[layer]->GetNeuron(neuron)->GetValue();
}

// Feeds the input values through the network and calculates the output value
// for the network
void NeuralNetwork::Run()
{
    int i = 0;
    int j = 0;
    int k = 0;
    double weight = 0;
    double input = 0;
    double newValue = 0;

    // Loop through the layers
    for (i=0; i<GetLayerCount(); i++)
    {
        // Loop through the neurons in the current layer
        for (j=0; j<m_layers[i]->GetNeuronCount(); j++)
        {
            newValue = 0;

            if (i > 0)
            {
                // Loop through the neurons from the previous layer (which connect
                // to the neurons in the current layer)
                for (k=0; k<m_layers[i-1]->GetNeuronCount(); k++)
                {
                    // get the connection weight from the current neuron in the
                    // previous layer to the current neuron in the current layer
                    weight = m_layers[i-1]->GetNeuron(k)->GetConnectionWeight(j);

                    // get the value for the current neuron in the previous layer
                    input = m_layers[i-1]->GetNeuron(k)->GetValue();

                    // add the product of the weight and the input to the summation
                    newValue += weight * input;
                }
            }
            else
                newValue = m_layers[i]->GetNeuron(j)->GetValue();

            // Run the new value through the transfer function
            newValue = TransferFunction(newValue);

            // set the value for the current neuron to the sum of the weights
            // and inputs coming into that neuron
            m_layers[i]->GetNeuron(j)->SetValue(newValue);
        }
    }

    // Set the output value for the network to the output value for the neuron
    // in the output layer
    SetValue(m_layers[GetLayerCount()-1]->GetNeuron(0)->GetValue());
}

// Adjusts the weights for the connections to improve the network's accuracy
void NeuralNetwork::BackPropagate()
{
    int i = 0;
    int j = 0;
    int k = 0;
    int l = 0;
    double delta = 0;
    double deltaTemp = 0;
    double previousNeuronOutput = 0;
    double currentNeuronOutput = 0;
    double currentConnectionWeight = 0;
    double changeInConnectionWeight = 0;

    // Loop through the layers starting at the output layer
    for (i=GetLayerCount()-1; i>=1; i--)
    {
        // Loop through the neurons in the current layer
        for (j=0; j<m_layers[i]->GetNeuronCount(); j++)
        {
            // Loop through the neurons from the previous layer (which connect
            // to the neurons in the current layer)
            for (k=0; k<m_layers[i-1]->GetNeuronCount(); k++)
            {
                currentNeuronOutput = m_layers[i]->GetNeuron(j)->GetValue();
                previousNeuronOutput = m_layers[i-1]->GetNeuron(k)->GetValue();

                // Test whether the loop is at the output connection layer. If it's
                // not at the output layer it's at a hidden layer
                if (i == GetLayerCount()-1)
                {
                    delta = currentNeuronOutput * (1 - currentNeuronOutput) * (GetTargetValue() - currentNeuronOutput);

                    // calculate change in weight for output connection layer
                    changeInConnectionWeight = m_outerLearningRate * delta * previousNeuronOutput;
                }
                else
                {
                    deltaTemp = 0;

                    // Sum the delta values of all neurons in the next layer,
                    // weighted by this neuron's connection to each of them
                    for (l=0; l<m_layers[i+1]->GetNeuronCount(); l++)
                        deltaTemp += m_layers[i]->GetNeuron(j)->GetConnectionWeight(l) * m_layers[i+1]->GetNeuron(l)->GetDeltaValue();

                    delta = currentNeuronOutput * (1 - currentNeuronOutput) * deltaTemp;

                    // calculate change in weight for hidden connection layer
                    changeInConnectionWeight = m_innerLearningRate * delta * previousNeuronOutput;
                }

                // Store the delta so the next (shallower) layer can read it
                // through GetDeltaValue(). Without this, hidden-layer deltas
                // are always zero and those weights never change
                m_layers[i]->GetNeuron(j)->SetDeltaValue(delta);

                // Get the weight of the connection from the current neuron in
                // the previous layer to the current neuron in the current layer
                currentConnectionWeight = m_layers[i-1]->GetNeuron(k)->GetConnectionWeight(j);

                // Add the change in weight to the current neuron's weight
                m_layers[i-1]->GetNeuron(k)->SetConnectionWeight(j, currentConnectionWeight + changeInConnectionWeight);
            }
        }
    }
}

// "Trains" the network by backpropgating a given number of times
void NeuralNetwork::Train()
{
    int i = 0;

    // Keep training until the maximum number of training sessions allowed
    // has been reached
    while (i < MAX_TRAINING)
    {
        Run();
        BackPropagate();

        if (i % 10000 == 0)
            cout << GetValue() << " - " << GetTargetValue() << " = " << fabs(GetValue() - GetTargetValue()) << endl;

        i++;
    }

    cout << GetValue() << " - " << GetTargetValue() << " = " << fabs(GetValue() - GetTargetValue()) << endl;

    SetTrainingCount(i);
}

// Transfer (activation) function using the Sigmoid function
double NeuralNetwork::TransferFunction(double value)
{
    return 1 / (1 + exp(-1 * value));
}

int NeuralNetwork::GetTrainingCount()
{
    return m_trainingCount;
}

void NeuralNetwork::SetTrainingCount(int count)
{
    m_trainingCount = count;
}

/*****************************************************************************/
// Neuron class functions
/*****************************************************************************/
// Constructor
Neuron::Neuron()
{
    // Give the neuron an initial value
    m_value = 0;

    m_deltaValue = 0;
}

// Returns the output value for the neuron
double Neuron::GetValue()
{
    return m_value;
}

// Sets the output value for the neuron
void Neuron::SetValue(double value)
{
    m_value = value;
}

// Returns the delta value for the neuron
double Neuron::GetDeltaValue()
{
    return m_deltaValue;
}

// Sets the delta value for the neuron
void Neuron::SetDeltaValue(double value)
{
    m_deltaValue = value;
}

// Adds a new connection to the connection vector
void Neuron::AddConnection(double weight)
{
    m_connections.push_back(weight);
}

// Returns the connection weight to another neuron
double Neuron::GetConnectionWeight(int neuron)
{
    return m_connections[neuron];
}

// Sets the connection weight to another neuron
void Neuron::SetConnectionWeight(int neuron, double weight)
{
    m_connections[neuron] = weight;
}

/*****************************************************************************/
// NeuronLayer class functions
/*****************************************************************************/
// Constructor
NeuronLayer::NeuronLayer()
{
}

// Adds a neuron to the neuron vector
void NeuronLayer::AddNeuron()
{
    m_neurons.push_back(new Neuron());
}

// Returns a pointer to a given neuron for this layer
Neuron *NeuronLayer::GetNeuron(int neuron)
{
    return m_neurons[neuron];
}

int NeuronLayer::GetNeuronCount()
{
    return m_neurons.size();
}

#endif

[Edited by - chadjohnson on August 23, 2005 12:58:38 PM]

Guest Anonymous Poster
Try setting MAX_TRAINING to 1

If that doesn't work, try batch training. That is, generate 100 or so random input/output pairs, then train on all 100 of those pairs at once. You can do this by adding up the changeInConnectionWeight values for each of the 100 input/output pairs and then applying the average.

Sounds like you hit a local extremum instead of the optimal function. I haven't had time to look through all the source code, though.

500 neurons in your hidden layer sounds like a lot to me. Maybe reducing it will give you better results. (Your approximating function becomes lower-dimensional, so it has less chance of creating a local extremum.)

Share on other sites
Maybe I just missed it while skimming your code, but you should have an extra input, separate from your other inputs, that represents the 'bias', and it should always have a value of 1. I had this problem in the past, and adding a bias solved it. I hope this helps. As a side note, 500 hidden nodes is way more than necessary. I think I read somewhere that a good estimate for your hidden node count is roughly the square root of your input count. Obviously somewhat arbitrary, but commonly cited. Good luck.

Scott

Share on other sites