Principal Components / Factor analysis and Neural Nets

Neural nets use a number of inputs, and overall system complexity depends somewhat on that number: as you increase the number of inputs, you increase the number of connections. What happens if you use some form of multivariate technique to reduce the number of inputs? In particular, I'm thinking of using Factor or Principal Components analysis to reduce a high-dimensional input space to a smaller set of components. Presumably this would let you reduce the complexity of the ANN.

As a corollary: can this be built into the net itself? That is, can the selection of the number of principal components, or the degree of input-parameter factorisation, be part of the net-building process? One would presumably need some sort of goodness-of-fit function, maybe a function of the number of connections (and therefore execution time) and adherence to the desired output.

Thanks for any thoughts!

Jim.
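To make that last point concrete, the rough shape of the fitness function I have in mind is something like this (just a sketch; `net.forward`, `net.count_connections`, and the weighting `alpha` are placeholders, not any particular library):

```python
def fitness(net, inputs, targets, alpha=0.01):
    """Penalise both prediction error and network size (lower is better)."""
    # mean squared error: adherence to the desired output
    errors = [(net.forward(x) - t) ** 2 for x, t in zip(inputs, targets)]
    mse = sum(errors) / len(errors)
    # penalise each connection, as a rough proxy for execution time
    return mse + alpha * net.count_connections()
```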
Reading your post, the whole complexity = F(#inputs) reminded me of a website, AI@Home. From what I understand (very little), they use spiking neural nets, so not just the strength of the input, but its timing, matters in the final output. I'd guess there are probably many similar ways of increasing the complexity of a single input (in fact, just in thinking about this post, I'd bet that in Laplace space, there'd be some really easy operators: integrate, differentiate, delay, etc., that could be used).

(so this post is probably the reverse of what you are asking...sorry)
Most PCA toolboxes provide a scalar with each component, denoting in a sense how much information is carried by that component alone (its share of the explained variance). If you write your own, it is not difficult to add. You can therefore design the whole system from your test data: use the components that, for instance, describe your test data to within 95% accuracy. Then project your data into the new space, reduce the dimensionality according to that criterion, and train your ANN.
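Something like this, say - a numpy sketch of that criterion (the 95% figure and the function name are just for illustration):

```python
import numpy as np

def pca_reduce(X, variance_kept=0.95):
    """Keep the leading principal components that together explain
    `variance_kept` of the total variance, and project X onto them."""
    Xc = X - X.mean(axis=0)                    # centre the data
    cov = np.cov(Xc, rowvar=False)             # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # sort components descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()        # the per-component scalar
    k = int(np.searchsorted(np.cumsum(explained), variance_kept)) + 1
    return Xc @ eigvecs[:, :k]                 # reduced-dimension inputs

# X = ...                  # rows = samples, columns = original variables
# X_small = pca_reduce(X)  # then train the ANN on X_small instead of X
```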
I agree with tomlu. Typically what I've seen with PCA is to generate the components and use those which explain 80% (an arbitrary number) of the variance in your data. These become the inputs to the net.

The issue I have with PCA is that you lose interpretability. A component is a linear combination of the original variables. You can go back and determine which variables contribute most, etc., but you still have a mixture of variables in each component. This prevents you from looking at any particular variable and determining its impact on the model. (That said, neural networks are nearly impossible to interpret anyway, so....)

There are also neural network architectures which extract components for you. If you have, say, 10 original variables, imagine a network with 10 inputs, a single hidden layer of 3 nodes, and 10 outputs. Train the network such that the outputs equal the original inputs. The weights of the hidden layer are then analogous to principal components, BUT you don't have a guarantee that they are true PCs. Another interesting method is to use a larger number of layers: 10 inputs, layer 1 with X nodes, layer 2 with Y nodes, layer 3 with X nodes, and 10 outputs. The activations of layer 2's Y nodes are then a non-linear set of components, as in the sketch below. Interesting stuff.
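A minimal sketch of that second architecture (PyTorch here purely for brevity; the sizes 10-6-3-6-10 are made up):

```python
import torch
import torch.nn as nn

# Bottleneck autoencoder: 10 inputs -> X -> Y -> X -> 10 outputs,
# trained so the outputs reproduce the inputs.
model = nn.Sequential(
    nn.Linear(10, 6), nn.Tanh(),   # layer 1: X = 6 nodes
    nn.Linear(6, 3),  nn.Tanh(),   # layer 2: Y = 3 nodes (the "components")
    nn.Linear(3, 6),  nn.Tanh(),   # layer 3: X = 6 nodes
    nn.Linear(6, 10),              # 10 outputs, trained to equal the inputs
)

X = torch.randn(256, 10)           # stand-in for real data
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for _ in range(2000):
    optimiser.zero_grad()
    loss = loss_fn(model(X), X)    # the target is the input itself
    loss.backward()
    optimiser.step()

encoder = model[:4]                # everything up to and including layer 2
components = encoder(X)            # 3 non-linear "components" per sample
```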

Finally, if you're interested in reducing the dimensionality of your data set, there are lots of methods. The first is to eliminate correlated variables: I usually look for correlations >0.80 and get rid of the less attractive variable of the pair (see the sketch below). There are also lots of great methods for feature selection, i.e., picking which variables to use in your model. If you're interested, a great reference is Jain, A. and D. Zongker (1997). "Feature selection: Evaluation, application, and small sample performance." IEEE Transactions on Pattern Analysis and Machine Intelligence 19(2): 153-158.
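A quick sketch of the correlation filter (here I just keep whichever column comes first; in practice you'd choose the less attractive variable of each pair by hand):

```python
import numpy as np

def drop_correlated(X, names, threshold=0.80):
    """Greedily drop one variable of each pair whose absolute pairwise
    correlation exceeds `threshold` (keeps the earlier column)."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return X[:, keep], [names[j] for j in keep]

# X_small, kept_names = drop_correlated(X, names)
```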

I hope this helps. Let me know if I can clarify.

-Kirk


Thanks all for the comments - exactly what I was looking for.

I'm still getting my head around how to develop ANNs, so this sort of help is much appreciated.

kirkd - I also responded to one of your topics on Matt Buckland's site.

Jim.
One idea is to create an Event class to simulate a neural net. You can define a series of factors to analyse, and each time a piece of information is filtered through an Event, it returns a result that depends on that analysis. You could build a powerful class that allows a web of multiple Events, which is exactly what you want. If you pass a changeable argument to an Event, it can return an enumerated range of possibilities - something like the sketch below.
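Roughly like this - all names here are illustrative, as the design isn't pinned down:

```python
class Event:
    def __init__(self, analyse, children=()):
        self.analyse = analyse       # the analysis this Event applies
        self.children = children     # downstream Events, forming the web

    def filter(self, information):
        result = self.analyse(information)
        if not self.children:
            return [result]          # a leaf yields one possible outcome
        outcomes = []
        for child in self.children:  # fan the result out through the web
            outcomes.extend(child.filter(result))
        return outcomes              # the enumerated range of possibilities

# e.g. a tiny web of three Events:
double = Event(lambda x: x * 2)
negate = Event(lambda x: -x)
root = Event(lambda x: x + 1, children=(double, negate))
print(root.filter(3))                # [8, -4]
```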

