bmanruler

Problem with Neural Network

Recommended Posts

I wrote up a feed-forward neural network with one input, one hidden, and one output layer. The algorithms were based on this website. The problem comes when I train a 2-input, 2-hidden, 1-output network on XOR. The output of the network is never zero, so the problem is in my backpropagation function.

neural_net.hpp
#ifndef BMP_LIBRARY_NEURAL_NET_HPP
#define BMP_LIBRARY_NEURAL_NET_HPP

#include <vector>
#include <boost/random.hpp>
using boost::uniform_01;
using std::vector;

namespace bmplib
{
	// Typedefs for easy changes later on
	typedef double number;
	typedef std::vector< number > layer;
	typedef std::vector< layer > bundle;

	//-----------------------------------------------------------------------------
	// This class is a simple feed forward neural network that consists of 3 layers
	// 1 input layer, 1 hidden layer and 1 output layer
	// Each layer is fully interconnected
	// The number of neurons is set at class creation time and cannot be changed
	//-----------------------------------------------------------------------------
	class NeuralNet
	{
	private:
		// Number of neurons in each layer
		const int inputCount_,
			hiddenCount_,
			outputCount_;

		const number sigAleph_;

		// These are the weights between the previous layer and the mentioned layer
		// It is a two dimensional array so you can look up via:
		//  i = index in previous layer
		//  j = index in mentioned layer
		// So weight[i][j] is the connection between base[i] and hidden[j]
		bundle hiddenWeights_,
			outputWeights_;

		layer inputData_,
			inputTheta_,
			inputOutput_,
			hiddenInput_,
			hiddenTheta_,
			hiddenOutput_,
			outputInput_,
			outputTheta_,
			outputData_,
			correctData_;

		// Use boost random functionality to create good random numbers
		boost::mt19937 randomEngine_; // engine that makes the random numbers
		boost::uniform_01< boost::mt19937, number > randomNumber_; // generates number type [0,1)

	public:
		//-----------------------------------------------------------------------------
		// Name: Constructor
		// Purpose: Constructs the NeuralNet
		// Pre: All int values are > 0
		// Post: Valid NeuralNet is created
		// Guarantee: Strong
		//-----------------------------------------------------------------------------
		NeuralNet( int InputNeurons, int HiddenNeurons, int OutputNeurons );

		//-----------------------------------------------------------------------------
		// Name: Destructor
		// Purpose: Destroys the object
		// Pre: None that you need concern yourself with
		// Post: Object is destroyed
		// Guarantee: No-throw
		//-----------------------------------------------------------------------------
		~NeuralNet();

		//-----------------------------------------------------------------------------
		// Name: setInputData
		// Purpose: Takes a layer and assigns the values to the input.
		// Pre: The number of input neurons and the size of the passed-in layer must be the same.
		// Post: The passed data is assigned to the object.
		// Guarantee: 
		//-----------------------------------------------------------------------------
		void setInputData( const layer & );

		//-----------------------------------------------------------------------------
		// Name: getOutputData
		// Purpose: Returns the value of the output of the network.
		// Pre: Should not be called before network is updated.
		// Post: Returns values in output neuron layer
		// Guarantee: Strong
		//-----------------------------------------------------------------------------
		const layer & getOutputData() const;

		//-----------------------------------------------------------------------------
		// Name: setCorrectData
		// Purpose: Sets the 'correct' results for the given input.
		// Pre: Passed layer is same length as output neuron count for network.
		// Post: Passed data is assigned to the object.
		// Guarantee: Strong
		//-----------------------------------------------------------------------------
		void setCorrectData( const layer & );

		//-----------------------------------------------------------------------------
		// Name: PrintAndRun
		// Purpose: Diagnostic running
		// Pre: input data has been set
		// Post: runs all features of the net and prints results
		// Guarantee: Strong
		//-----------------------------------------------------------------------------
		void PrintAndRun();

		//-----------------------------------------------------------------------------
		// Name: Train
		// Purpose: Trains the network with training data.
		// Pre: None
		// Post: Network should be 'smarter'
		// Guarantee: Strong
		//-----------------------------------------------------------------------------
		void Train( const layer & InputData, const layer & CorrectData );

		//-----------------------------------------------------------------------------
		// Name: Run
		// Purpose: Uses input data and runs the network with no training.
		// Pre: Input data has been set.
		// Post: None
		// Guarantee: Strong
		//-----------------------------------------------------------------------------
		void Run();

		//-----------------------------------------------------------------------------
		// Name: Run
		// Purpose: Runs the network with no training.
		// Pre: Pass in valid input data.
		// Post: Returns output data.
		// Guarantee: Strong
		//-----------------------------------------------------------------------------
		const layer &Run( const layer & InputData );

	private:
		//-----------------------------------------------------------------------------
		// Name: Init
		// Purpose: Private utility function to initialize class.
		// Pre: None
		// Post: Class initialization is finished.
		// Guarantee: Strong
		//-----------------------------------------------------------------------------
		void Init();

		//-----------------------------------------------------------------------------
		// Cleanup function
		//-----------------------------------------------------------------------------
		//-----------------------------------------------------------------------------
		// Name: Cleanup
		// Purpose: Private utility function that performs cleanup work for destructor.
		// Pre: Object is valid.
		// Post: Object is fully ready to be destroyed.
		// Guarantee: Strong
		//-----------------------------------------------------------------------------
		void Cleanup();

		//-----------------------------------------------------------------------------
		// Sigma function
		//-----------------------------------------------------------------------------
		//-----------------------------------------------------------------------------
		// Name: Sigma
		// Purpose: Takes a number and returns the sigma function results
		// Pre: None
		// Post: Returns 1.0 / ( 1.0 + exp( -X ) )
		// Guarantee: Strong
		//-----------------------------------------------------------------------------
		number Sigma( const number &X ) const;

		//-----------------------------------------------------------------------------
		// Updates all the values in the Value lists
		//-----------------------------------------------------------------------------
		//-----------------------------------------------------------------------------
		// Name: FeedForward
		// Purpose: Private utility function that propagates all the values through the network.
		// Pre: Object has been initialized.
		// Post: All values match what the weights and input say they should be.
		// Guarantee: Strong
		//-----------------------------------------------------------------------------
		void FeedForward();

		//-----------------------------------------------------------------------------
		// Name: Backprop
		// Purpose: Does the actual 'learning' of the network; adjusts the weights to match training data.
		// Pre: Object has been initialized.
		// Post: Net now more closely fits training data.
		// Guarantee: Strong
		//---------------------------------------------------------------------------
		void Backprop();

	};

} // end namespace bmplib
#endif // BMP_LIBRARY_NEURAL_NET_HPP

neural_net.cpp
#include "neural_net.hpp"

#include <cmath> // exp() in Sigma
#include <iostream>
using std::cout;
using std::endl;
#include <boost/random.hpp> // random_uniform
using boost::uniform_01;

namespace bmplib
{
	//-----------------------------------------------------------------------------
	// Constructor
	// Pre:
	// Post:
	// Guarantee: Strong
	//-----------------------------------------------------------------------------
	NeuralNet::NeuralNet( int InputNeurons, int HiddenNeurons, int OutputNeurons )
	: inputCount_( InputNeurons ),
	hiddenCount_( HiddenNeurons ),
	outputCount_( OutputNeurons ),
	sigAleph_( 1.0 ), // set the aleph value for Sigma function
	inputData_( inputCount_, 0.0 ),
	inputTheta_( inputCount_, 0.0 ),
	inputOutput_( inputCount_, 0.0 ),
	hiddenInput_( hiddenCount_, 0.0 ),
	hiddenTheta_( hiddenCount_, 0.0 ),
	hiddenOutput_( hiddenCount_, 0.0 ),
	outputInput_( outputCount_, 0.0 ),
	outputTheta_( outputCount_, 0.0 ),
	outputData_( outputCount_, 0.0 ),
	correctData_( outputCount_, 0.0 ),
	hiddenWeights_( inputCount_, layer( hiddenCount_, 0.0 ) ),
	outputWeights_( hiddenCount_, layer( outputCount_, 0.0 ) ),
	randomNumber_( randomEngine_ )
	{
		Init();
	}

	//-----------------------------------------------------------------------------
	// Destructor
	// Pre:
	// Post:
	// Guarantee: No-throw
	//-----------------------------------------------------------------------------
	NeuralNet::~NeuralNet()
	{
		try
		{
			Cleanup();
		}
		catch( ... )
		{
			// swallow all exceptions
		}
	}

	//-----------------------------------------------------------------------------
	// Init()
	// Purpose: Initialization routine
	// Pre:
	// Post:
	// Guarantee: Strong
	//-----------------------------------------------------------------------------
	void NeuralNet::Init()
	{
		// Set theta values
		for( int in = 0; in < inputCount_; in++ )
		{
			inputTheta_[in] = randomNumber_();

			for( int hid = 0; hid < hiddenCount_; hid++ )
			{
				hiddenWeights_[in][hid] = randomNumber_();
			}
		}
		for( int hid = 0; hid < hiddenCount_; hid++ )
		{
			hiddenTheta_[hid] = randomNumber_();

			for( int out = 0; out < outputCount_; out++ )
			{
				outputWeights_[hid][out] = randomNumber_();
			}
		}
		for( int out = 0; out < outputCount_; out++ )
		{
			outputTheta_[out] = randomNumber_();
		}
	}

	//-----------------------------------------------------------------------------
	// Cleanup()
	// Purpose: Cleans up for destructor
	// Pre:
	// Post:
	// Guarantee: Strong
	//-----------------------------------------------------------------------------
	void NeuralNet::Cleanup()
	{
	}

	//-----------------------------------------------------------------------------
	// Returns the value of the Sigma function
	// Pre:
	// Post:
	// Guarantee: Strong
	//-----------------------------------------------------------------------------
	number NeuralNet::Sigma( const number &X ) const
	{
		return static_cast< number >( 1.0 / ( 1.0 + exp( sigAleph_ * -X ) ) );
	}

	//-----------------------------------------------------------------------------
	// setInputData
	// Sets the data in the input layer of the network
	//-----------------------------------------------------------------------------
	void NeuralNet::setInputData( const layer &Data )
	{
		for( int i = 0; i < inputCount_; i++ )
		{
			inputData_[i] = Data[i];
		}
	}

	//-----------------------------------------------------------------------------
	//-----------------------------------------------------------------------------
	const layer & NeuralNet::getOutputData() const
	{
		return outputData_;
	}

	//-----------------------------------------------------------------------------
	//-----------------------------------------------------------------------------
	void NeuralNet::setCorrectData( const layer &Data )
	{
		for( int i = 0; i < outputCount_; i++ )
		{
			correctData_[i] = Data[i];
		}
	}

	void NeuralNet::PrintAndRun()
	{
		// Update network
		FeedForward();
		cout << "Output:  ";
		for( int i = 0; i < outputCount_; i++ )
		{
			cout << outputData_[i] << " ";
		}
		cout << endl << "Correct: ";
		for( int i = 0; i < outputCount_; i++ )
		{
			cout << correctData_[i] << " ";
		}
		cout << endl;
		Backprop();
	}

	void NeuralNet::Train( const layer &InputData, const layer &CorrectData )
	{
		setInputData( InputData );
		setCorrectData( CorrectData );

		FeedForward();
		Backprop();
	}

	void NeuralNet::Run()
	{
		FeedForward();
	}

	const layer &NeuralNet::Run( const layer & InputData )
	{
		setInputData( InputData );
		FeedForward();
		return outputData_;
	}

	//-----------------------------------------------------------------------------
	//-----------------------------------------------------------------------------
	void NeuralNet::FeedForward()
	{
		// calculate output from input layer
		for( int in = 0; in < inputCount_; in++ )
		{
			inputOutput_[in] = Sigma( inputData_[in] );
		}
		// calculate input to hidden layer
		for( int hid = 0; hid < hiddenCount_; hid++ )
		{
			hiddenInput_[hid] = 0.0; // zero it out first

			for( int in = 0; in < inputCount_; in++ )
			{
				hiddenInput_[hid] += inputOutput_[in] * hiddenWeights_[in][hid];
			}
		}
		// calculate output from hidden layer
		for( int hid = 0; hid < hiddenCount_; hid++ )
		{
			hiddenOutput_[hid] = Sigma( hiddenTheta_[hid] + hiddenInput_[hid] );
		}
		// calculate input to output layer
		for( int out = 0; out < outputCount_; out++ )
		{
			outputInput_[out] = 0.0;

			for( int hid = 0; hid < hiddenCount_; hid++ )
			{
				outputInput_[out] += ( hiddenOutput_[hid] * outputWeights_[hid][out] );
			}
		}
		// calculate final output
		for( int out = 0; out < outputCount_; out++ )
		{
			outputData_[out] = Sigma( outputInput_[out] + outputTheta_[out] );
		}
	}

	//-----------------------------------------------------------------------------
	//-----------------------------------------------------------------------------
	void NeuralNet::Backprop()
	{
		const number Lambda1 = 0.20;
		const number Lambda2 = 0.15;

		// How much to change output theta
		layer DeltaThetaOut( outputCount_, 0.0 );
		
		// find how much to change output layer
		for( int out = 0; out < outputCount_; out++ )
		{
			DeltaThetaOut[out] = outputData_[out] * ( Lambda1 - outputData_[out] ) * ( correctData_[out] - outputData_[out] );
			DeltaThetaOut[out] *= Lambda1;
		}
		
		bundle DeltaWeightOut( hiddenCount_, layer( outputCount_, 0.0 ) );

		// find delta value for weights between hidden and output layers
		for( int out = 0; out < outputCount_; out++ )
		{
			for( int hid = 0; hid < hiddenCount_; hid++ )
			{
				DeltaWeightOut[hid][out] = hiddenOutput_[hid] * DeltaThetaOut[out];
			}
		}

		layer DeltaThetaHidden( hiddenCount_, 0.0 );

		// find delta values for hidden layer
		for( int hid = 0; hid < hiddenCount_; hid++ )
		{
			for( int out = 0; out < outputCount_; out++ )
			{
				DeltaThetaHidden[hid] += outputWeights_[hid][out] * ( DeltaThetaOut[out] / Lambda1 );
			}
		}

		// Delta value for hidden layer finally computed
		for( int hid = 0; hid < hiddenCount_; hid++ )
		{
			DeltaThetaHidden[hid] = hiddenOutput_[hid] * ( Lambda2 - hiddenOutput_[hid] ) * DeltaThetaHidden[hid];
			DeltaThetaHidden[hid] *= Lambda2;
		}

		// Find delta weights for input to hidden layer

		bundle DeltaWeightHidden( inputCount_, layer( hiddenCount_, 0.0 ) );

		for( int in = 0; in < inputCount_; in++ )
		{
			for( int hid = 0; hid < hiddenCount_; hid++ )
			{
				DeltaWeightHidden[in][hid] = DeltaThetaHidden[hid] * inputOutput_[in];
			}
		}

		// Find delta theta's for input layer

		layer DeltaThetaInput( inputCount_, 0.0 );

		for( int in = 0; in < inputCount_; in++ )
		{
			for( int hid = 0; hid < hiddenCount_; hid++ )
			{
				DeltaThetaInput[in] += hiddenWeights_[in][hid] * ( DeltaThetaHidden[hid] / Lambda2 );
			}
		}

		for( int in = 0; in < inputCount_; in++ )
		{
			DeltaThetaInput[in] = inputOutput_[in] * ( Lambda2 - inputOutput_[in] ) * DeltaThetaInput[in];
			DeltaThetaInput[in] *= Lambda2;
		}

		// Now all we have to do is apply all the changes :/
		// Start with output theta's
		for( int out = 0; out < outputCount_; out++ )
		{
			outputTheta_[out] += DeltaThetaOut[out];

			// Then do output weights
			for( int hid = 0; hid < hiddenCount_; hid++ )
			{
				outputWeights_[hid][out] += DeltaWeightOut[hid][out];
			}
		}
		// Now hidden layer theta's
		for( int hid = 0; hid < hiddenCount_; hid++ )
		{
			hiddenTheta_[hid] += DeltaThetaHidden[hid];

			// Then hidden weights
			for( int in = 0; in < inputCount_; in++ )
			{
				hiddenWeights_[in][hid] += DeltaWeightHidden[in][hid];
			}
		}

		// And finally adjust input theta's
		for( int in = 0; in < inputCount_; in++ )
		{
			inputTheta_[in] += DeltaThetaInput[in];
		}
	}

} // end namespace bmplib

main.cpp
// main.cpp

#include "neural_net.hpp"
#include "bmp_timer.h"
using namespace::bmplib;

#include <iostream>
using std::cout;
using std::endl;

int main()
{
	// 0,0,0
	// 0,1,1
	// 1,0,1
	// 1,1,0

	bundle XorIn( 4, layer( 2, 0.0 ) );
	XorIn[1][1] = 1.0;
	XorIn[2][0] = 1.0;
	XorIn[3][0] = 1.0; XorIn[3][1] = 1.0;

	bundle XorOut( 4, layer( 1, 0.0 ) );
	XorOut[1][0] = 1.0;
	XorOut[2][0] = 1.0;


	NeuralNet Net(2,2,1);

	for( int i = 0; i < 1000; i++ )
	{
		for( int j = 0; j < 4; j++ )
		{
			Net.setInputData( XorIn[j] );
			Net.setCorrectData( XorOut[j] );
			Net.PrintAndRun();
		}
		cout << endl;
	}

	return 0;
}

Two things:

1. Do you have automated unit/functional tests? If so, which one fails? Have you tried breaking it down to a simpler test?
2. If you don't have tests, go to 1. Trying to implement something for the first time is much easier when you test it!
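
As a minimal sketch of such a test (reusing the XOR data from main.cpp above and the public NeuralNet interface, with an arbitrary epoch count): because the sigmoid can only approach 0 and 1, the check compares each output against 0.5 instead of expecting exact target values. With the code as posted, the assert should fire, which narrows the failure to one reproducible condition.

#include "neural_net.hpp"
#include <cassert>
using namespace bmplib;

int main()
{
	// XOR truth table: inputs and expected outputs
	bundle XorIn( 4, layer( 2, 0.0 ) );
	XorIn[1][1] = 1.0;
	XorIn[2][0] = 1.0;
	XorIn[3][0] = 1.0; XorIn[3][1] = 1.0;

	bundle XorOut( 4, layer( 1, 0.0 ) );
	XorOut[1][0] = 1.0;
	XorOut[2][0] = 1.0;

	NeuralNet Net( 2, 2, 1 );

	// Train for a fixed number of passes over the four patterns
	for( int epoch = 0; epoch < 10000; ++epoch )
		for( int j = 0; j < 4; ++j )
			Net.Train( XorIn[j], XorOut[j] );

	// The sigmoid never reaches exactly 0 or 1, so test which side of 0.5 each output lands on
	for( int j = 0; j < 4; ++j )
	{
		number out = Net.Run( XorIn[j] )[0];
		assert( ( out > 0.5 ) == ( XorOut[j][0] > 0.5 ) );
	}
	return 0;
}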


Your sigma function, which determines the output from each of your layers, can only be zero when its input is negative infinity. The statement "The output of the network is never zero, so the problem is in my backpropagation function." is clearly incorrect. No matter what training algorithm you use, you can't force a function that doesn't return zero to return zero.
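
As a small standalone sketch of that point (the same logistic formula as the posted Sigma, with aleph = 1), the function stays strictly inside (0, 1) for any finite input:

#include <cmath>
#include <cstdio>

// Logistic function, as used by the posted Sigma with aleph = 1
double sigma( double x ) { return 1.0 / ( 1.0 + std::exp( -x ) ); }

int main()
{
	// Even for very negative inputs the result is tiny but strictly positive,
	// and for very positive inputs it is strictly below 1.
	for( double x = -30.0; x <= 30.0; x += 10.0 )
		std::printf( "sigma(%6.1f) = %.12f\n", x, sigma( x ) );
	return 0;
}

The usual workarounds are to compare the output against a threshold (for example 0.5) when reading the network's answer, or to scale the training targets slightly inside the range (for example 0.1 and 0.9) instead of using exact 0 and 1.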

Quote:
Original post by Vorpy
Your sigma function, which determines the output from each of your layers, can only be zero when its input is negative infinity.

Thanks, I will rework this.

Quote:
Original post by Vorpy
The statement "The output of the network never is zero, so the problem is in my backpropagation function." is clearly incorrect. No matter what training algorithm you use, you can't force a function that doesn't return zero to return zero.

Sorry about the misdirection; it was pretty late when I posted. I guess I glossed over all the nice graphs when I was coding.

When I change the Sigma function to something that can return 0, the network still drives the output to the maximum value the Sigma can return.

What are Lambda1 and Lambda2 supposed to be? What is the meaning of the (Lambda1 - outputData_[out]) term, and the similar terms in the update calculations for the other nodes? It looks like they are supposed to be learning rates, but then these terms don't make any sense. It's like a distorted version of the derivative of the sigmoid function, with 1 replaced by a different value for no reason.

This is an excerpt from the guide I used as the basis.
Quote:

Before we explain the training, let’s define the following:
l (Lambda) the Learning Rate: a real number constant, usually 0.2 for output layer neurons and 0.15 for hidden layer neurons.
D (Delta) the change: For example Dx is the change in x. Note that Dx is a single value and not D multiplied by x.
6.2 Output Layer Training
Let z be the output of an output layer neuron as shown in section 4.
Let y be the desired output for the same neuron, it should be scaled to a value between 0 and 1. This is the ideal output which we like to get when applying a given set of input.
Then e (the error) will be:

e = z * (1 - z) * (y - z)
Dq = l * e ... The change in q
Dwi = Dq * xi ... The change in weight at input i of the neuron


So you can see how I might have gotten a lambda and a one mixed up.
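
As a sketch of what the output-layer block of Backprop would look like with the guide's formula applied as written (using the member names from the class above; the 1 in z * (1 - z) is a literal one, and the learning rate scales the error only once). This is for comparison, not a tested drop-in fix:

// e  = z * (1 - z) * (y - z)      error at an output neuron
// Dq = lambda * e                 change in that neuron's theta
// Dw = Dq * x                     change in the weight from input x
const number Lambda1 = 0.20; // learning rate for the output layer

layer DeltaThetaOut( outputCount_, 0.0 );
for( int out = 0; out < outputCount_; out++ )
{
	const number z = outputData_[out];
	const number e = z * ( 1.0 - z ) * ( correctData_[out] - z ); // literal 1.0, not Lambda1
	DeltaThetaOut[out] = Lambda1 * e;                             // learning rate applied once, here
}

bundle DeltaWeightOut( hiddenCount_, layer( outputCount_, 0.0 ) );
for( int out = 0; out < outputCount_; out++ )
	for( int hid = 0; hid < hiddenCount_; hid++ )
		DeltaWeightOut[hid][out] = DeltaThetaOut[out] * hiddenOutput_[hid]; // x here is the hidden neuron's output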
