Facial Animation.

I want to get the opinions of those in the industry about the future of facial animation in games, especially on next-gen consoles and PC platforms. Currently, almost all facial animation in games is done using simple clustering or low-resolution morph targets, but with the advent of next-gen consoles and high-end PCs, more complex techniques can be looked into.

My opinion: facial animation in games at the moment is almost uniformly unrealistic. I have never seen anything more complex than a raised eyebrow that actually looks real. The problem is that our brains are especially adept at picking up even the slightest nuances in people's expressions (we have something like 50 muscles in our face), and if those nuances are missing the person looks relatively expressionless.

The lack of realism comes from a combination of the methods used to capture the animation data and the methods used to apply it. Standard optical facial marker technology is not good enough to capture facial animation, particularly mouth animation, and simple clustering is not powerful enough to apply it realistically. Clustering solutions can of course be scaled up to be more powerful, but at some point their complexity (the processing power required to deform the mesh) outweighs their advantage over blend shapes (lower storage requirements).

So: are the next-gen consoles and top-of-the-range PCs able to handle complex blend-shape facial animation on meshes of 1000+ polygons? And are game developers interested enough in improving facial animation to sacrifice some precious fill rate and memory for it? Any opinions appreciated!
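For anyone unfamiliar with the terms, here is a minimal sketch of what morph-target (blend shape) evaluation boils down to: the final mesh is the neutral pose plus a weighted sum of per-vertex deltas. All names and structures below are illustrative, not from any particular engine.

#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// One morph target: per-vertex offsets from the neutral pose.
struct MorphTarget {
    std::vector<Vec3> deltas; // same length as the neutral mesh
};

// Evaluate blend shapes: result = neutral + sum over targets of (weight * delta).
void evaluateBlendShapes(const std::vector<Vec3>& neutral,
                         const std::vector<MorphTarget>& targets,
                         const std::vector<float>& weights,
                         std::vector<Vec3>& out)
{
    out = neutral;
    for (std::size_t t = 0; t < targets.size(); ++t) {
        const float w = weights[t];
        if (w == 0.0f) continue; // skip inactive targets
        for (std::size_t v = 0; v < out.size(); ++v) {
            out[v].x += w * targets[t].deltas[v].x;
            out[v].y += w * targets[t].deltas[v].y;
            out[v].z += w * targets[t].deltas[v].z;
        }
    }
}

The trade-off I mentioned is visible here: evaluation is cheap (one multiply-add per vertex per active target), but every target stores a full copy of the mesh deltas, which is where the storage cost comes from.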
I'd say the most elaborate facial animation system in a game was done by Valve in Half-Life 2/the Source engine. Their system is based on the muscles in the face (I think they use 40 or more).

"I can't believe I'm defending logic to a turing machine." - Kent Woolworth [Other Space]

Yeah, I reckon HL2 has the best facial animation I have seen, but it still falls short of what is possible using the most cutting-edge techniques.
I really want to know whether people in the industry consider the current techniques adequate for the coming age of ultra-realistic games, or whether they need rethinking.
A significant amount of research and development time has been spent attempting to create realistic facial animation for in-game characters. The problem isn't necessarily with the animation techniques at this point.

Animating facial expressions with bones (which are analogous to facial muscles in this case) lets us create expressions essentially the same way they're created in real life. So what I'm saying is that the problem isn't in the techniques themselves; it's a matter of using an adequate number of bones, and controlling those bones in a way that looks natural and realistic to us.
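To illustrate what a bone-per-muscle rig reduces to at the vertex level, here is a minimal linear-blend-skinning sketch. Bone counts, names, and data layout are my own assumptions, and it assumes the inverse bind pose is already baked into each bone transform.

#include <vector>

struct Vec3f { float x, y, z; };

// Rigid transform for one "muscle" bone: rotation in m[r][0..2], translation in m[r][3].
// Assumed to already include the inverse bind pose.
struct Transform {
    float m[3][4];
    Vec3f apply(const Vec3f& p) const {
        return { m[0][0]*p.x + m[0][1]*p.y + m[0][2]*p.z + m[0][3],
                 m[1][0]*p.x + m[1][1]*p.y + m[1][2]*p.z + m[1][3],
                 m[2][0]*p.x + m[2][1]*p.y + m[2][2]*p.z + m[2][3] };
    }
};

struct BoneInfluence {
    int   boneIndex; // which bone affects this vertex
    float weight;    // how strongly; weights for one vertex should sum to 1
};

// Linear blend skinning for a single vertex: average the positions each
// influencing bone would move it to, weighted by influence.
Vec3f skinVertex(const Vec3f& bindPos,
                 const std::vector<BoneInfluence>& influences,
                 const std::vector<Transform>& bones)
{
    Vec3f out = { 0.0f, 0.0f, 0.0f };
    for (const BoneInfluence& inf : influences) {
        const Vec3f p = bones[inf.boneIndex].apply(bindPos);
        out.x += inf.weight * p.x;
        out.y += inf.weight * p.y;
        out.z += inf.weight * p.z;
    }
    return out;
}

The point being: the machinery itself is simple, and adding more "muscle" bones or influences per vertex scales the cost linearly. Getting the weights and the controllers right is the hard part.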

Your brain naturally knows whether an expression is right or wrong, but it has proven extremely difficult to translate those natural instincts into artist-designed expressions and programmatically driven animation controllers. It's simply a matter of time and tweaking.

With that said, you should expect to see characters with more realistic animation (facial and otherwise) in upcoming next-generation titles.
I think the behaviour of clustering systems is not accurate enough to represent the complex surface of the face. I don't think it ever can be, as a lot of the character of a facial expression comes from the wrinkles, folds and dimples that appear when the muscles move. That kind of complexity cannot be created with any clustering system that I know of.
Do most modern techniques in games employ animated texture maps? Using those would be one way around the problem, to a certain extent anyway.
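For what it's worth, here is a rough sketch of the animated-texture idea: blend a neutral normal map toward a wrinkled counterpart as the relevant muscle region activates. In a real engine this would live in a pixel shader; the per-texel C++ below is purely illustrative and the names are hypothetical.

#include <cmath>

struct Normal { float x, y, z; };

// Blend a neutral tangent-space normal toward its wrinkled counterpart as the
// muscle region activates (0 = relaxed, 1 = fully contracted). Lerp-then-
// renormalize is crude, but adequate for the small angles involved.
Normal blendWrinkle(const Normal& neutral, const Normal& wrinkled, float activation)
{
    Normal n = { neutral.x + activation * (wrinkled.x - neutral.x),
                 neutral.y + activation * (wrinkled.y - neutral.y),
                 neutral.z + activation * (wrinkled.z - neutral.z) };
    const float len = std::sqrt(n.x*n.x + n.y*n.y + n.z*n.z);
    return { n.x / len, n.y / len, n.z / len };
}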
Quote:
Your brain naturally knows whether an expression is right or wrong, but it has proven extremely difficult to translate those natural instincts into artist-designed expressions and programmatically driven animation controllers. It's simply a matter of time and tweaking.

Which suggests that the best way to design facial expressions is to capture them directly from someone's face, rather than trying to create them from scratch. I've been involved in creating a facial capture system for the company I work for, which essentially produces a fully animated head model. The only type of animation it uses at the moment is per-vertex, i.e. a per-frame blend sequence. This kind of thing can hold a lot more detail than a cluster-based system, but at a fairly high bandwidth cost:
(1000 vertices * 25 fps * (4 floats per vertex * 2 bytes per float + 3 bytes per normal)) / 1024 =
~270 KB/s
or
~16 MB for a minute of dialogue

The 1000 verts assumes only the face patch rather than the entire head, and 2 bytes per float should be plenty accurate enough for a head model (I think!). I'm sure this data could be compressed as well, although we haven't tried that.
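As a sanity check, here is the same bandwidth arithmetic written out as a tiny program, using an assumed packed-vertex layout (a 4-component quantized position plus a 3-byte normal, matching the figures above; the exact layout is a guess, not our actual format).

#include <cstdint>
#include <cstdio>

// One frame's worth of data for a single vertex, as described above.
struct PackedVertex {
    int16_t pos[4];    // 4 quantized position components, 2 bytes each
    int8_t  normal[3]; // 3 bytes per normal
};                     // 11 bytes of payload (sizeof may pad this to 12)

int main()
{
    const int verts        = 1000; // face patch only
    const int fps          = 25;
    const int bytesPerVert = 4 * sizeof(int16_t) + 3 * sizeof(int8_t); // = 11

    const double kbPerSec = double(verts) * fps * bytesPerVert / 1024.0;
    const double mbPerMin = kbPerSec * 60.0 / 1024.0;
    std::printf("%.0f KB/s, %.1f MB per minute of dialogue\n", kbPerSec, mbPerMin);
    // prints: 269 KB/s, 15.7 MB per minute of dialogue
    return 0;
}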
At the moment this kind of system is used only for cutscene, TV and movie work, but how well would it fit into a modern or near-future game engine?

