Motor control for physics-based character movement

Started by
33 comments, last by Aken H Bosch 9 years, 6 months ago

Well, the meaning of "torque" is the same as what most people mean by it... torque = MoI * angular acceleration...

Each of the joints of my Dood effectively has a motor in it, and whatever delta-angular-momentum the motor induces in one bone, it induces the opposite delta-angular-momentum in the other bone. The physics engine's constraint solver resolves the linear components of the velocities to make sure the bones stay in their sockets (although this may in turn alter the angular velocities of the bones as well).

I can let each bone have a desired orientation. If I compare its current orientation to the desired orientation, I can compute the angular velocity necessary to get it there by the next timestep. Then if I compare that desired angular velocity to the current angular velocity, I can get the angular acceleration necessary to reach that angular velocity by the next timestep. And then by multiplying that desired angular acceleration by the oriented MoI matrix for that bone (I'm using a 3x3 matrix to store the MoI data, even though it only has 6 unique numbers), I can get the torque necessary to get that bone into its desired orientation by the next timestep.
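A minimal one-axis sketch of that chain of computations (the function name and numbers are mine; the actual implementation works with 3-vectors and the oriented 3x3 MoI matrix):

```python
def torque_to_reach(theta, omega, theta_desired, moi, dt):
    """Scalar (one-axis) version of the scheme described above.

    1. angular velocity needed to reach theta_desired by the next timestep
    2. angular acceleration needed to reach that velocity by the next timestep
    3. torque = MoI * angular acceleration
    """
    omega_desired = (theta_desired - theta) / dt
    alpha_desired = (omega_desired - omega) / dt
    return moi * alpha_desired

# e.g. a stationary bone 0.1 rad from its target, MoI = 2.0, 60 Hz physics:
tau = torque_to_reach(theta=0.0, omega=0.0, theta_desired=0.1, moi=2.0, dt=1.0 / 60.0)
```

Note that if the bone is already moving at the required angular velocity, the computed torque drops to zero, as you'd expect.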

So then I tell the motor in one of the joints this bone has to set its torque (in world coordinates) to that value. If another motor has already affected this bone, then I also need to undo whatever it did... so if I have a chain of bones with motors between, and each bone has a desired orientation, I have to add up the world-space torque vectors as I go through the chain. And when I get to the last bone in the chain, all of the motors have already had their torques set to accommodate the desired orientation of the previous bone... so there is no motor left to accommodate the desired orientation of the last bone in the chain. Thus I call it a 'sink' for the torques of the rest of the chain.

As far as the lower and upper body being separate... I just happened to decide to start from the arms and head, and work down until I got to the pelvis. The upper body stuff works, but at the cost of the pelvis doing completely arbitrary stuff, whatever spastic motion is necessary to keep the bones above it oriented as they desire. As I said before, ultimately this approach cannot be generalized to work for the whole body, because I can only satisfy as many bones' desired orientations as I have joint motors, and there are more bones than joints.

Will look at some of the other links that have been posted when I have more time.

Signature go here.

Good explanation, Aken. That clarifies the situation... except why you can't have more joints/joint motors. The reasons for the restrictions aren't clear.

I understand that modeling the lower body and the ground's effect on it may not be trivial, but why is it not possible?

Please don't PM me with questions. Post them in the forums for everyone's benefit, and I can embarrass myself publicly.

You don't forget how to play when you grow old; you grow old when you forget how to play.

If I understand your first question...

The problem is conservation of angular momentum. Say A and B are the angular momentum of two bones. If a joint motor between them does A += X, in order to conserve angular momentum, it must also do B -= X.

As I explained before, any desired orientation to be reached by the next timestep corresponds to a desired delta-angular-momentum. Ignoring the possibility of going >2pi radians to get there, it's a 1:1 correspondence. So if the bones I mentioned earlier have desired orientations, then I can talk about A_desired and B_desired. But in general I can't satisfy both at once: if I choose X so that A' = A + X = A_desired, then B' = B - X will not in general equal B_desired.

If I add a third bone connected by a second joint (new objects' properties will be "C" and "Y")...

A' = A + X

B' = B - X + Y

C' = C - Y

I can choose Y so that B' = B_desired:

Y = X + (B_desired - B)

but then C' will not in general equal C_desired.

In general, there is no way to choose X and Y such that A' = A_desired, B' = B_desired, and C' = C_desired simultaneously. If I choose to satisfy the desired orientations of everything but the last bone in a chain, that last bone will end up with a delta-angular-momentum = -(X + Y + Z + ...).
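A toy one-dimensional check of the algebra above, with made-up momenta: satisfying the first two bones fixes X and Y, the "sink" bone absorbs the leftover, and the total is conserved throughout:

```python
# Toy 1-D version of the equations above, with made-up momenta.
A, B, C = 1.0, 2.0, 3.0                  # current angular momenta
A_des, B_des, C_des = 1.5, 1.0, 2.0      # desired angular momenta

X = A_des - A                 # first motor:  A' = A + X = A_des
Y = X + (B_des - B)           # second motor: B' = B - X + Y = B_des

A2, B2, C2 = A + X, B - X + Y, C - Y
# A2 and B2 hit their targets; the sink bone gets C2 = C - Y, which in
# general != C_des, while the total A2 + B2 + C2 still equals A + B + C.
```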

Other stuff to note...

There's some approximation going on here, because A', B', etc. are the values that go into the constraint solver... whereas it's the values that come out of the constraint solver that determine what the actual position/orientation of the bones is at the next timestep. I haven't studied just how much of a difference that makes.

Also, it just occurred to me that I've been working in terms of desired bone orientations that must be reached by the next timestep... and perhaps that "by the next timestep" part is an unreasonable limitation. It occurs to me, for example, that a cat's righting reflex is at least a 3-step process: the cat starts out sideways, step 1 is to fold itself up into a torus, step 2 is to rotate the torus' surface (as if turning itself "inside out"), and then finally step 3 is to unfold the torus, leaving the cat right-side-up.

I don't think it's impossible... clearly it's possible; we do it IRL. It just isn't a simple generalization of the approach that worked for the upper body.

YouTube video of the working upper body stuff, if anybody is interested:

(warning: the gunfire sound effect is sudden and loud; consider turning your volume down before watching)


The problem is conservation of angular momentum.

The principle of conservation of angular momentum applies to closed systems. A character standing on the ground (unless you include the ground as part of your system) is not a closed system. It appears your model is for a character in free space, and (from the video) you may well be getting the results your model dictates. If you extend your character's interaction to include forces and torques with the ground (as intimated by MrRowl above), that should help the situation.



The principle of conservation of angular momentum applies to closed systems. A character standing on the ground (unless you include the ground as part of your system) is not a closed system.

True, but the foot/ground interaction will only affect the system in the ways the constraint solver makes it. It's not a free pass to give bones arbitrary angular velocities. I mean, yes I could cheat, but I don't want to.

Failure to conserve angular momentum is a serious problem. About two years ago I had a system that (inadvertently) didn't conserve angular momentum. Each bone had a desired position and orientation. I computed the average linear velocity of the Dood, modified the linear and angular velocities of the bones to be whatever was necessary to get them to their desired pos/ori by the next timestep, and then subtracted out any net change to the linear velocity of the Dood to make it at least conserve linear momentum. Note, this is not part of the constraint solver I'm talking about; this is an extra step which, if done incorrectly, could cause the Dood's linear or angular momentum to change, completely independent of interactions with other objects.
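A one-dimensional sketch of that extra correction step, with made-up masses and velocities: bone velocities are overwritten to hit their targets, then the net change in total linear momentum is subtracted back out as a shared velocity offset:

```python
# 1-D sketch with made-up numbers: three bones.
masses = [4.0, 2.0, 2.0]
vels = [1.0, 1.0, 1.0]
target_vels = [2.0, 0.5, 1.5]   # whatever reaches each bone's desired position

p_before = sum(m * v for m, v in zip(masses, vels))

# step 1: overwrite velocities to hit the targets (this violates momentum)
vels = list(target_vels)
p_after = sum(m * v for m, v in zip(masses, vels))

# step 2: subtract the net momentum change back out as a shared offset
offset = (p_after - p_before) / sum(masses)
vels = [v - offset for v in vels]
```

Done correctly, this restores the total linear momentum; the point of the post is that the angular analogue was missing, so angular momentum was still not conserved.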

Here's a video from back then (although I seem to have avoided showing the serious problems that failure to conserve angular momentum caused):

(warning: the gunfire sound effect is sudden and loud; consider turning your volume down before watching)

On to the problems:

The old system effectively said "I don't care what your angular velocities were; here's what they are now." As a result, characters could not be rotated by being pushed on. This includes interactions with the ground. The enemies the player is going to fight in my FPS[1] are giant bug aliens (but unlike nearly every other universe with "bugs" mine actually have six legs). Two very obvious problems arise when these bugs, with their wide stance, interact with these sloppy physics:

  1. They can stand with only one foot in contact with the ground, while the center of mass is suspended in the air above a precipice.
  2. When the larger bugs (not shown in the video) try to rotate, their feet slam into the ground at arbitrarily high speeds, causing a reaction impulse which sends them careening across the map.

Yes, I realize both of these problems could hypothetically have been remedied by improving how the desired bone orientations were selected. But that's not the path I've chosen.

[1]At one point "do it in a less cheaty manner" was a "side quest" of "make an FPS where you fly around with jetpacks and fight giant space bugs". But I've decided that this is what I want to do now. That said, I am still hoping for a solution that will be generalizable to work for the six-legged bug enemies.


Assuming the AI part will come later after the 'complex action' part exists it will make use of...

Primitives, of course, for getting the body/appendages/whatever to be stable in whatever environment they live in (try to make a biped stand/balance, then walk in a particular direction, on an irregular floor surface...). Then various transitions to different orientations within that terrain (all of course within the limitations of the body's individual appendages: allowed angles, movement inertias, etc.).

Then fluid movements to move the body/appendages to where they are needed (a given target) when adjacent to a blocking environment structure (how to move the whole body to transition a 'hand' from outside a hole to inside, while being mindful of the physical limitations of the structure).

Next, applying force on an external object to make it move in a desired direction (shove it with the proper force and friction, within the impact/force limits of the appendage/body, obviously with countering forces to maintain balance of the body, and later with more aggressive stances and motions to impart greater force, like a pitcher winding up for his pitch).

Trajectories to be decided (ballistic paths calculated to work around obstacles, besides actually getting the object to its desired location).

All of the above with the system being able to judge that a given task is impossible, maybe suggesting alternatives (like getting closer to the target, or a position where obstacles aren't in the way, etc.).

Ratings are Opinion, not Fact

@wodinoneeye:

I'm not sure I understand what you're saying. A list of considerations for once I get the low-level stuff working? Or a plan of action?

I actually had done a bit of AI coding a long time ago, before I completely revamped the character physics... well, A* pathfinding, at least. And I have some general ideas how to do the layers that pass goal states (or whatever) to the low level stuff I'm working on... how to choose where I want the feet to go, that sort of thing...

What I've been trying recently...

Recently (the past couple of months) I've been experimenting with using genetic algorithms to select coefficients for an artificial neural network, converting most variables of the state of the Dood to an array of float inputs, and also giving it a goal state in the form of a desired linear and angular velocity for the left foot, right foot, and pelvis... or expressing it as forces and torques (I've tried things a few different ways at various points). I give these inputs to the ANN, and it outputs an array of float joint torques for each axis of the left and right ankle, knee, and pelvis joints.

The ANN is a custom setup. Here's how that works:

every physics tick (the physics runs at 60 Hz):
    repeat for some number of iterations:
        (outputs, memories') = tanh((inputs, memories) * (coefficient matrix))

Note, these "memories" serve both as addresses to store the results of intermediate computations within a single physics tick, and to record these results for use in a subsequent physics tick.

Currently there are 150 inputs and 18 outputs; the number of inputs is more subject to change than the number of outputs.
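For concreteness, here is one reading of that update rule as a runnable sketch (the function name and the tiny sizes are mine; the real coefficient matrix would be on the order of (150 + 100) x (18 + 100)):

```python
import math
import random

def ann_tick(inputs, memories, W, iterations):
    """One physics tick of the scheme above: repeatedly compute
    (outputs, memories') = tanh((inputs, memories) * W).

    W has len(inputs) + len(memories) rows; its columns are the outputs
    followed by the updated memories, which persist to the next tick."""
    n_out = len(W[0]) - len(memories)
    outputs = [0.0] * n_out
    for _ in range(iterations):
        vec = inputs + memories
        result = [math.tanh(sum(v * W[i][j] for i, v in enumerate(vec)))
                  for j in range(len(W[0]))]
        outputs, memories = result[:n_out], result[n_out:]
    return outputs, memories

# tiny instance: 2 inputs, 3 memories, 1 output, random coefficients
random.seed(0)
W = [[random.uniform(-1.0, 1.0) for _ in range(1 + 3)] for _ in range(2 + 3)]
out, mem = ann_tick([0.5, -0.25], [0.0, 0.0, 0.0], W, iterations=8)
```

Because the memories are both read and written on every iteration, signals can propagate through the network over multiple iterations and ticks, which is what distinguishes this from a plain feed-forward pass.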

I believe this sort of non-feed-forward setup is necessary because of the complicated nature of the foot/ground interaction. I cannot rely on there being exactly one contact point between the foot and the ground, nor can I rely on the object(s) the foot is in contact with actually being an immobile part of the terrain. Thus there is no real way to pack all of the information about the state of the foot/ground interaction into a fixed-size input array.

Also, looking at existing papers on the subject, I see a lot of things use PD controllers, a strategy that is very much unlike a feed-forward neural network.

However, it seems as though nobody ever does ANNs with anything but feed-forward neural networks, so I haven't a clue how to train such an ANN.

There are a couple of factors that I imagine will affect whether or not a "good enough" coefficient matrix exists:

  1. The number of iterations it takes processing the "memory array" per physics tick
  2. The size of the "memory array"

I would be very much surprised if 8 iterations per physics tick and 100 memories were not sufficient for a "good enough" coefficient matrix to exist, but I have no idea how to find it. I've been trying to use genetic algorithms, but haven't had much progress.

I don't have a precise definition of what qualifies as "good enough", but I can tell from looking at it that everything the GA has been able to come up with so far hasn't been it.

And even more recently...

This past week in particular, I've been experimenting with what kind of control I can accomplish when neither foot is in contact with the ground. Even though it's a soldier in power armor, including a jetpack and rocketboots which realistically could be used to torque the whole system, the cat righting reflex proves that some control is possible even when there isn't anything to push off against. And in an FPS, the player needs to be able to change their heading and pitch even when they're airborne (though perhaps with slower turning rates).

First I tried just disabling scoring for the pelvis' position, then I disabled scoring for everything but the pelvis' orientation. Now I've broken the pelvis orientation score (formerly the magnitude squared of the difference between the desired angular velocity of the pelvis and the actual angular velocity the system achieved) into a separate scoring category for each x/y/z component of the error. The scores for the y component (y is my vertical axis) are worst for some reason, both before and after I've given the GA a chance to try to evolve anything.

I reckon it must be possible to have better control while airborne than what I've managed to achieve so far... but maybe I need to let the "airborne yaw/pitch system" (name I just now made up) affect the spinal joints' torques, not just the left/right ankle, knee, and hip torques?

If all else fails, I can cheat and give the Soldier Dood a gyroscope. But I won't be happy about it.


A few comments. I don't know the details of your implementation so these may not apply but I hope it is useful:

If I am reading your implementation correctly, it looks like you're multiplying your ANN inputs across the coefficients to get outputs and running them through your squash function. The reason I don't think this would produce results is that the model seems to be missing hidden layer summing junctions, which are where the real computation in an ANN is accomplished. It would be very difficult to produce ANN-like behavior without a feedforward, 2-layer graph structure because without it your control inputs are basically just scaled products of your inputs.

It is completely feasible to have your input array be allocated to the largest size that it would possibly encounter and just feed your inputs in as 0.f values when they are unused, as long as you are always mapping the same inputs in to the same locations. When the GA converges it will automatically omit any unnecessary inputs.

Make sure to normalize your inputs to the interval [-1, 1] before pushing through the ANN. I usually just divide by the feasible domain.
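That normalization amounts to a linear map from the feasible domain onto [-1, 1], something like:

```python
import math

def normalize(x, lo, hi):
    """Map x from its feasible domain [lo, hi] linearly onto [-1, 1]."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0

# e.g. a joint angle constrained to [-pi/2, pi/2]:
n = normalize(math.pi / 4, -math.pi / 2, math.pi / 2)   # 0.5
```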

ANN feedforward (if you choose to implement it) need only be accomplished once per frame as long as your delta-t values in the model all accurately reflect the operating frequency.

An ANN's success is only limited by the soundness of the fitness function you are giving the GA. Use easier-to-accomplish fitness functions at first and work your way up as it trains. Also, an ANN will converge very well on ONE behavior, so don't expect multiple control functions out of the same ANN. For multiple functions you would construct another ANN in parallel designed to converge to a different fitness function.

A trained feedforward ANN can produce exactly the same results as a PID controller, even though the architecture is different. If you have already worked out the transfer functions for every joint in the system though, then PD control might be even easier than what you're doing now. You may be able to develop inverse kinematics for desired behavior first and then just find controller parameters to minimize the disparity between actual and desired behaviors.
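For reference, a minimal PD controller (proportional-derivative, no integral term) looks like the following; the gains kp and kd here are made-up placeholders that would need tuning per joint:

```python
class PDController:
    """Minimal per-axis PD controller sketch.

    Output is proportional to the error plus the rate of change of the
    error; the only stored state is the previous error."""

    def __init__(self, kp, kd, dt):
        self.kp, self.kd, self.dt = kp, kd, dt
        self.prev_error = 0.0

    def update(self, desired, actual):
        error = desired - actual
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.kd * derivative

# e.g. driving a joint angle toward 0.1 rad at a 60 Hz tick rate:
pd = PDController(kp=10.0, kd=1.0, dt=1.0 / 60.0)
t1 = pd.update(0.1, 0.0)
```

The derivative term here is computed by finite difference on the error, which is the usual discrete-time formulation.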

Ah whoops, I forgot I had a reply to reply to!

Hidden layers

My "memories" setup is just a more generalized form of your standard "hidden layers" construction. Instead of breaking the ANN down into layers, I categorize my neurons as either inputs (whose value is determined before batch-processing begins), memories (whose value may change with every iteration), or outputs (whose value is only used after batch-processing finishes). It also saves the values not just between iterations but also between batches of iterations, hence why I call them "memories". That and it's like a chunk of memory for use as "scratch paper", in whatever manner the ANN sees fit.

So it's capable of everything a traditional ANN with hidden layers is capable of, and then some. And if I wanted to enforce that certain coefficients in the coeff matrix must always be zero, I could make it exactly emulate a traditional "hidden layers" ANN.

"The largest size"

Yes, I suppose I could just have an arbitrarily large array of inputs, and have a reasonable "default value" to indicate that they aren't in use... but there really is no upper limit on how many contact points the feet can have at once. And even if I chose an arbitrarily large "maximum" number... how would I make the response "symmetric"? I.e. if there are multiple ways to represent virtually identical states, it should behave nearly identically regardless. I guess I could sort the contact points by priority somehow?

Also, a real person doesn't pack contact points into an array. Maybe I could attempt to categorize them by what part of the foot they're in contact with, what their normal vector is, etc.? Even categorizing the contact points like that, if I still had a separate array for each... I don't know, it sounds weird. Maybe I'm being too stubborn.

Normalizing my inputs

You bring up a good point, I haven't been (strictly) normalizing my inputs to [-1, 1]... I did realize that some of the goal-state quantities were way too big (because I was multiplying by 60hz to compute them), and so I multiplied them by something like 0.02 to compensate, but... I didn't want to run the inputs through tanh() unnecessarily, because it seems to me that if the ANN has use for the "squashed" inputs, it will do that itself, whereas if it would've been more useful to have the original linear values, it's going to have to do a bunch of extra work to un-squash that data.

Not sure what to make of this:

ANN feedforward (if you choose to implement it) need only be accomplished once per frame as long as your delta-t values in the model all accurately reflect the operating frequency.

"Only limited by the fitness function"

You can't actually mean that, can you? Sorry if this is bordering on nitpicking, but that "only" is so inaccurate I can't help but address it at length.

You said yourself you didn't think something without any hidden-layer neurons would be capable of solving this sort of problem. That's just the extreme case of "too few memories". Or do you mean to suggest letting it dynamically increase the number of memories? Hmm... I could try it, but I'd need convincing.

And the number of iterations matters too. Iterations are signal propagation... If I configured my ANN to emulate a traditional feed-forward hidden-layers ANN with 5 hidden layers, and it only did one iteration per physics tick, every action it takes will be 5 ticks late for the input that prompted it. That's only 83.3 ms, so it's actually better than the average human reaction time, but that assumes 5 hidden layers is sufficient. Even discounting the extreme case, "zero iterations", clearly the number of iterations is going to have an effect on the quality of the solutions.

And then there's the choice of inputs and outputs!

I don't know what it would look like, but I can say with reasonable confidence that there is some curve in (number of memories, number of iterations) space below which no coefficient matrix will be "good enough" (choice of inputs and outputs are implicit parameters). Though, as I said earlier, I would be very much surprised if (100, 8) is on the wrong side of that curve.

Multiple behaviors

The thing that makes this difficult is that I have a multifaceted goal state, all the parts of which need to be achieved simultaneously (I think). The most recent formulation of this goal state is a desired net force and torque vector on each of three bones: the left foot, the right foot, and the pelvis. And by "net" I mean what's measurable after the constraint solver has had its say, so it's equivalent to specifying a linear and angular velocity I want it to end up with, or a position and orientation. And in fact, a desired pos & ori is how I'm currently selecting the desired force and torque vectors... but in theory I can choose a goal state in any of those terms.

"Dynasties"

The thing is, I can't just let them evolve completely separately and then hope to simply lerp them together or something. If I don't force it to attempt multiple goals simultaneously, the strategies it comes up with to achieve one goal will come at the exclusion of the others.

I've been experimenting with a scheme for GA with multiple simultaneous scoring categories, which at one point I considered calling "dynasties". Inspiration for the idea came from a phrase I once encountered in a paper, "GA with villages". I didn't look up what it meant, but from the sound of it, I guessed it's a scheme for compromising between preserving genetic diversity (between "villages") and maintaining enough homogeneity for crossovers to be viable (within "villages").

Anyway here's how my "dynasties" scheme works:

For each scoring category there is a group of N parents, and every generation, each parent chooses a replacement for itself, which may be an exact clone, a single-parent mutant, or a crossover with any of the parents in any of the categories. When there were only 1 or 2 parents per category, the label "dynasties" felt more appropriate, but when there are a lot, it's more like "nepotistic apprenticeships". But whatever, it's not like I'm planning on patenting it.
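In Python, one generation of that scheme might look like the following sketch (the probabilities, genome representation, and helper functions are illustrative placeholders, not the actual implementation):

```python
import random

def next_generation(populations, mutate, crossover, p_clone=0.4, p_mutant=0.4):
    """One generation of the 'dynasties' scheme described above.

    populations: {scoring_category: [genome, ...]} -- N parents per category.
    Each parent picks its own replacement: an exact clone, a single-parent
    mutant, or a crossover with any parent from any category."""
    all_parents = [g for group in populations.values() for g in group]
    next_pops = {}
    for category, parents in populations.items():
        children = []
        for parent in parents:
            r = random.random()
            if r < p_clone:
                children.append(list(parent))                    # exact clone
            elif r < p_clone + p_mutant:
                children.append(mutate(parent))                  # single-parent mutant
            else:                                                # cross-category crossover
                children.append(crossover(parent, random.choice(all_parents)))
        next_pops[category] = children
    return next_pops

# toy usage: 4-float genomes, two scoring categories
random.seed(1)
mutate = lambda g: [x + random.gauss(0.0, 0.1) for x in g]
crossover = lambda a, b: [random.choice(pair) for pair in zip(a, b)]
pops = {"pelvis_ori": [[0.0] * 4, [1.0] * 4], "left_foot": [[0.5] * 4]}
gen2 = next_generation(pops, mutate, crossover)
```

The cross-category crossover is the part that distinguishes this from evolving each scoring category in isolation.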

I can't really tell if it's a step in the right direction. It definitely isn't "good enough" yet though, and I get the sense it's not going to get there just by being left to run for a few days.

Emulating PD with feed-forward ANN

I don't see how that's possible? To get values for the I and D terms, it needs to be able to remember what those components were from the last tick. If it stores "memories" separate from the normal inputs and outputs, doesn't it cease to qualify as "feed-forward"? Or is it a special case where some of the outputs were to be fed back in as inputs? Because that sounds exactly like the original premise that led me to come up with my "memories" system. If you do that, regardless of whether it qualifies as "feed-forward", the idea of having a "training set" of the correct outputs for a given set of inputs, including those memories as both inputs and outputs, becomes much more complicated, if not impossible. At least, I don't know how to do it.

Or did you just mean I would give it the values for P, I, and D as normal inputs, rather than any kind of "emulation"?

Aside: rules check: are my posts here too blog-like? I know some places have rules (or guidelines) against using the forum as your dev blag, and I posted this in the AI board (though it's since been moved to Math & Physics) rather than "Your Announcements"... But I just have so much to say!


My "memories" setup is just a more generalized form of your standard "hidden layers" construction. Instead of breaking the ANN down into layers, I categorize my neurons as either inputs (whose value is determined before batch-processing begins), memories (whose value may change with every iteration), or outputs (whose value is only used after batch-processing finishes).

I'm sorry, I'm having trouble understanding the way you're defining it without seeing a concrete description of the structure. The best guess I have is that you're representing the structure as an adjacency matrix and iteratively modifying it, which is a possible solution, but I have no way of knowing exactly what your algorithm accomplishes without seeing a mathematical description, and I'm having trouble finding any reason that one would need to iterate the algorithm more than once per frame.

(edits)

From what I think I understand about your implementation, I strongly suspect there is no benefit to redundantly passing information through the structure while the input vector stays the same within a timestep (which is what I'm assuming is happening), because analytically you're passing new feature information through a set of weights that are meant to process other features. This can cause a lot of information to be lost, and is why recurrent information is normally just fed in as an auxiliary input on the next timestep (aka recurrence). You may have better results by using another static coefficient matrix in lieu of "memories" and just feeding the recurrent information you deem relevant in as separate inputs when your input vector is updated.
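A sketch of that suggested alternative: a single forward pass per tick, with the previous tick's outputs appended as auxiliary inputs (the sizes and weights here are illustrative):

```python
import math

def recurrent_step(inputs, prev_outputs, W):
    """Single forward pass; last tick's outputs are appended as extra
    inputs, which is ordinary recurrence. W has
    len(inputs) + len(prev_outputs) rows and len(prev_outputs) columns."""
    vec = inputs + prev_outputs
    return [math.tanh(sum(v * W[i][j] for i, v in enumerate(vec)))
            for j in range(len(W[0]))]

# tiny instance: 1 input, 2 outputs fed back in on the next tick
W = [[0.5, -0.5], [0.25, 0.0], [0.0, 0.25]]
out = [0.0, 0.0]
for _ in range(3):                  # three physics ticks
    out = recurrent_step([0.5], out, W)
```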

there really is no upper limit on how many contact points the feet can have at once

If this is true and you don't have a way to bound the data then you're limited by the finite size of your program memory and no algorithm is suitable anyway.

"Only limited by the fitness function"
You can't actually mean that, can you? Sorry if this is bordering on nitpicking, but that "only" is so inaccurate I can't help but address it at length.

Fair enough, let me be more specific: Assuming you are using the appropriate architecture for your objective, the viability of your model's hypothesis is only limited by your ability to select the appropriate fitness function.

You said yourself you didn't think something without any hidden-layer neurons would be capable of solving this sort of problem. That's just the extreme case of "too few memories". Or do you mean to suggest letting it dynamically increase the number of memories? Hmm... I could try it, but I'd need convincing.

And the number of iterations matters too. Iterations are signal propagation... If I configured my ANN to emulate a traditional feed-forward hidden-layers ANN with 5 hidden layers, and it only did one iteration per physics tick, every action it takes will be 5 ticks late for the input that prompted it. That's only 83.3 ms, so it's actually better than the average human reaction time, but that assumes 5 hidden layers is sufficient. Even discounting the extreme case, "zero iterations", clearly the number of iterations is going to have an effect on the quality of the solutions.

I think you may be misunderstanding the concept. The output set is a mapped function of the inputs at any given time regardless of how many layers there are. There is normally no temporal aspect involved. Mathematically any function can be represented with only two layers, so it seems that all you would be doing by repeatedly running information through tanh and multiplying it by the same matrix is corrupting your data in proportion to the number of iterations you run.

The thing that makes this difficult is that I have a multifaceted goal state, all the parts of which need to be achieved simultaneously

Anything that is necessary to occur at a given moment can be represented by a function. The hypothesis function (from whatever optimization/learning/controller algorithm you use) just has to change if you move to another function after. If actions have to occur in sequence to reach the desired goal, then you can add recurrence.

I've been experimenting with a scheme for GA with multiple simultaneous scoring categories, which at one point I considered calling "dynasties"

This is called speciation, and it is already well-developed if you're interested in expanding on it. What you can't do is run GA on two disparate goals without partitioning the population, though. It will fail every time.

Emulating PD with feed-forward ANN

I don't see how that's possible? To get values for the I and D terms, it needs to be able to remember what those components were from the last tick. If it stores "memories" separate from the normal inputs and outputs, doesn't it cease to qualify as "feed-forward"?

I didn't mean that you combine the two. I meant that they are both methods to minimize error when comparing a system's state to its desired state.

(edit)

Actually, intentional emulation is possible because you can feed information from one iteration as input to the next iteration if you choose to do so. This is actually pretty common in physical systems. See "recurrence" above.

This topic is closed to new replies.
