Community Reputation

100 Neutral

About essexedwards

  1. Artificial Neural Networks

    Victor-Victor: We seem to be discussing these options: 1. Include a bias input. 2. Use a step (Heaviside) activation function. 3. Give each neuron a different 'threshold' for the step.

    I was confused by what you meant by 'threshold'. I thought you just meant using a step function as the activation function (i.e. 2). I now think you meant giving each neuron a different value where the step occurs (i.e. 2+3). I agree that that is completely the same as using a bias and having every threshold step at zero (i.e. 1+2 = 2+3). However, it is definitely not the same as having no threshold activation function at all (which is what I thought you were asking).

    As an argument for why the standard sigmoid might be useful (besides derivatives), consider making a network with 2 inputs (x,y) that outputs 1 when x^2+y^2<1 and 0 otherwise. A network using threshold activation functions will have to approximate this circle as a polygon. A sigmoid network can actually have a curved boundary following part of the circle. Either will work, of course, but I would be surprised if sigmoid didn't get more accuracy with fewer neurons. (I've ignored how to map the sigmoid network's real output to boolean. Let's just say we switch at 0.5.)
  2. Artificial Neural Networks

    This thread is just too painful to watch and not reply. Bias is so your output can be non-zero even when all your inputs are zero. For example, how would you make an ANN that can compute NOT, with 1 input and 1 output?

    Assuming no threshold or bias, each 'neuron' in the network just outputs a linear combination of its inputs. So, the entire network's outputs can be reduced to a linear combination of its inputs (i.e. a big matrix multiply). A network like this could never learn the function x^2, or 1-x. With a bias added, the ANN can represent arbitrary affine functions. So, 1-x can be reproduced by an ANN with bias, but x^2 is still not possible. Notice that a threshold and a bias are not the same, because a threshold would not help with this problem.

    I don't know too much about thresholds, but let me take a stab. A threshold is just another 'activation' or 'transfer' function. Quite often, you'll see the sigmoid function used for this purpose. The sigmoid function adds enough functionality that the ANN can approximate any function, for example x^2. I suspect that using threshold as the activation function makes the network's output a piecewise combination of affine functions. This will also let you approximate any function (e.g. x^2), but only with straight segments.

    To answer your last question, your network requires bias and threshold almost always. It is the special case when you can get away without them. I do strongly encourage you to think about how you would implement NOT, AND, OR, and XOR with an ANN. They are very small networks you can do in your head or on scrap paper, but they should help explain some of these issues.

    -Essex
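
For anyone who wants to check the NOT, AND, OR, and XOR exercise rather than do it on scrap paper, here is a minimal Python sketch. It assumes step neurons that fire when the weighted sum plus bias is at least zero. The weights and biases are hand-picked for illustration (many other choices work), nothing is learned, and the function names are made up for this example.

```python
# Hand-wired step-neuron logic gates. Weights are illustrative, not learned.

def step(z):
    """Heaviside step: output 1 when z >= 0, otherwise 0."""
    return 1 if z >= 0 else 0

def neuron(inputs, weights, bias):
    """Step neuron with a bias term.

    Firing when sum(w*x) + bias >= 0 is the same as firing when
    sum(w*x) >= -bias, so the bias acts as a per-neuron threshold.
    """
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def linear_no_bias(inputs, weights):
    """No bias and no activation: a plain linear combination of the inputs."""
    return sum(w * x for w, x in zip(weights, inputs))

def not_gate(x):
    # The bias lets the output be 1 even when the input is 0.
    return neuron([x], [-1], 0.5)

def and_gate(x, y):
    return neuron([x, y], [1, 1], -1.5)

def or_gate(x, y):
    return neuron([x, y], [1, 1], -0.5)

def xor_gate(x, y):
    # Needs a hidden layer: fire when (x OR y) is on and (x AND y) is off.
    h_or = or_gate(x, y)
    h_and = and_gate(x, y)
    return neuron([h_or, h_and], [1, -1], -0.5)

if __name__ == "__main__":
    for x in (0, 1):
        print("NOT", x, "->", not_gate(x))
    for x in (0, 1):
        for y in (0, 1):
            print(x, y, "AND", and_gate(x, y), "OR", or_gate(x, y), "XOR", xor_gate(x, y))
    # Without a bias, a zero input gives a zero output whatever the weight is,
    # so NOT(0) = 1 is out of reach.
    print([linear_no_bias([0], [w]) for w in (-3, 0.5, 7)])
```

In this sketch NOT, AND, and OR each fit in a single neuron, while XOR needs the extra hidden layer. `linear_no_bias` returns 0 for a zero input regardless of the weight, which is the reason a bias is needed for NOT.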