Neural Net - Gesture Recognition

1 comment, last by neoaikon 16 years, 1 month ago
I've implemented my first ANN - a simple multi-layered feedforward network trained with backpropagation - and tried it out on some dummy problems (XOR, etc.). Now I'm trying to move up to having it recognize Wiimote gestures, like forehand or backhand swings. The raw data I have is a set of acceleration samples over time (around 200-400 per gesture). My question is: how should I process the data for input? I'm pretty sure I need to get rid of gravity and reduce the number of inputs somehow... Also, how many hidden units would I need? Is there a standard way of determining this? Cross-validation? Thanks in advance!

It depends on what your data looks like. If the acceleration is just a vector of acceleration values (x, y, z directions?) over time, then subsampling is one option. You could also try to extract the most useful features of the acceleration signal, such as sharp increases or decreases (quick, sudden movements).
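As a concrete sketch of the subsampling idea (and the gravity removal the original post asked about): the helper below is hypothetical - the function name, the target length of 32 frames, and the mean-subtraction trick are my own illustrative choices, not something from this thread.

```python
import numpy as np

def preprocess(samples, target_len=32):
    """Reduce a variable-length run of 3-axis acceleration samples to a
    fixed-length network input. Hypothetical helper: the name and the
    target length are arbitrary choices for illustration.

    samples: array of shape (n, 3) -- raw (x, y, z) acceleration.
    """
    samples = np.asarray(samples, dtype=float)
    # Crude gravity removal: subtract the per-axis mean, which treats
    # gravity as a roughly constant bias over the gesture window.
    samples = samples - samples.mean(axis=0)
    # Subsample: pick target_len evenly spaced frames so every gesture,
    # whether it spans 200 or 400 samples, maps to the same input size.
    idx = np.linspace(0, len(samples) - 1, target_len).astype(int)
    return samples[idx].flatten()  # shape (target_len * 3,)

# 300 raw samples with a fake gravity bias on z -> 96 network inputs
raw = np.random.randn(300, 3) + np.array([0.0, 0.0, 9.81])
x = preprocess(raw)
print(x.shape)  # (96,)
```

A fancier version might detect and subtract the gravity direction explicitly, or add the feature-style inputs mentioned above (peaks, sudden changes) instead of raw frames.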
Determining the number of hidden units is tricky; there is no standard way to do it other than fiddling with the number. With too few hidden neurons, the network will fail to generalize well, and with too many it will start memorizing (storing) patterns instead of generalizing. There is a rule of thumb for this, however, known as the Baum-Haussler rule.
Depending on your data, though, the number of hidden neurons can vary, and you'll have to determine it by trial and error.
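For reference, the Baum-Haussler rule of thumb mentioned above is usually stated as an upper bound: hidden units <= (training examples * tolerated error) / (inputs + outputs). The numbers in the example call are made up; treat the result only as a starting point for the trial-and-error tuning described above.

```python
def baum_haussler_hidden(n_train, error_tolerance, n_inputs, n_outputs):
    """Rough upper bound on hidden units from the Baum-Haussler rule
    of thumb. All arguments here are illustrative, not prescriptive."""
    return int(n_train * error_tolerance / (n_inputs + n_outputs))

# e.g. 1000 training gestures, 10% tolerated error, 9 inputs, 6 outputs
print(baum_haussler_hidden(1000, 0.1, 9, 6))  # 6
```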
Well, since you've already created an ANN, this won't be very hard to explain. Basically, for the Wiimote (I've never messed with one, but I understand it keeps track of its orientation in 3D space; I plan on maybe getting a Wii after tax-refund day, so I might get to mess with this), you'd probably have at least 3 directions (x, y, and z).

These vectors describe the remote's motion relative to the sensor: you can think of x as left and right, y as up and down, and z as forward and backward. You would create an ANN that has 3 inputs going to a hidden layer and then to the outputs; say we have 6 outputs, 2 each for x, y, and z. What you would then have to do is train the network to recognize the changes in the x, y, and z directions.

What you would do then is set up a way to monitor the output of the neural net: say a 6-by-1 grid of cells, which we'll call the output grid. When output 1 lights up, cell 1 changes color; when output 2 lights up, cell 2 changes; and so on. Then you would create an identical grid, which we'll call the training grid. This grid is only read by the ANN, and only you can change the colored/uncolored state of each cell.

You would put the desired output into the training grid and run the ANN while performing the movement you're training for. If the output grid is correct, you don't need to do anything; but if it isn't, you will need to visit each neuron in the ANN and correct the weights using a learning rule. There are many learning rules, some better than others. I personally use the backpropagation algorithm (http://en.wikipedia.org/wiki/Back-propagation) with fantastic results. The Hebbian learning rule is also popular, and easy to implement (http://en.wikipedia.org/wiki/Hebb%27s_rule).
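The compare-grids-and-correct-weights step above is just one backpropagation update. Here is a minimal numpy sketch of it, assuming the 3-input / 6-output layout from earlier; the hidden-layer size of 4, the learning rate, and the toy training pair are all made-up illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy 3-input -> 4-hidden -> 6-output net; the sizes are illustrative.
W1 = rng.normal(0, 0.5, (4, 3)); b1 = np.zeros(4)
W2 = rng.normal(0, 0.5, (6, 4)); b2 = np.zeros(6)

def train_step(x, target, lr=0.5):
    """One backpropagation update: forward pass, compare the output
    grid (y) against the training grid (target), then push the error
    back through the weights."""
    global W1, b1, W2, b2
    h = sigmoid(W1 @ x + b1)            # hidden activations
    y = sigmoid(W2 @ h + b2)            # output grid (6 cells)
    d2 = (y - target) * y * (1 - y)     # output delta (squared error)
    d1 = (W2.T @ d2) * h * (1 - h)      # hidden delta
    W2 -= lr * np.outer(d2, h); b2 -= lr * d2
    W1 -= lr * np.outer(d1, x); b1 -= lr * d1
    return np.sum((y - target) ** 2)    # current squared error

x = np.array([1.0, 0.0, 0.0])           # e.g. "accelerating along +x"
t = np.array([1.0, 0, 0, 0, 0, 0])      # desired grid: only cell 1 lit
errs = [train_step(x, t) for _ in range(200)]
print(errs[0] > errs[-1])  # True: the error shrinks as the net learns
```

In a real trainer you would of course loop over many recorded gestures rather than a single hand-coded pair.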

That's the simplest way to do it, although it will only recognize the left/right, up/down, and back/forth movements. If you wanted to draw an F in the air and have the ANN recognize it, you'd have to monitor the Wiimote output over a period of time and feed that whole sequence to the ANN, rather than the instantaneous movements. Hope this helps.
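One common way to "monitor the output over a period of time" is a sliding window: collect fixed-size, overlapping chunks of the sample stream and flatten each one into a single input vector. The window and step sizes below are placeholders to tune, not values from this thread.

```python
import numpy as np

def windows(stream, size=64, step=16):
    """Yield fixed-size, overlapping windows of (x, y, z) samples so a
    gesture spread over time becomes one flattened input vector per
    window. size/step are illustrative placeholders."""
    stream = np.asarray(stream, dtype=float)
    for start in range(0, len(stream) - size + 1, step):
        yield stream[start:start + size].flatten()

stream = np.zeros((100, 3))  # pretend 100 frames of Wiimote data
print(len(list(windows(stream))))  # 3 windows: starts at 0, 16, 32
```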

