Ventriloquism

Started by
7 comments, last by superpig 21 years, 10 months ago
It's already been noted on this board - several times - that at the moment, speech recognition software is slow and inaccurate, and as such is no good for use as any major UI element in games.

But given that voice communication over network games is becoming a fairly widespread technology, it looks a little strange to see all these players running around, talking, while none of their mouths are moving. It becomes difficult to tell who's saying what, for example.

So what if we were to use speech recognition on a basic level - that is, breaking the speech down into phonemes rather than words - and use the phonemes to do mouth animation? It wouldn't necessarily be massively accurate, but it wouldn't matter too much - and given that you're not looking the phonetic word up in a dictionary to get the real word, I reckon there'd be a significant speed increase.

What do people think? If a speech analyser could have the voice from the network being fed into it, then you'd effectively be able to see who's talking.

A further interesting application: if things are set up so that the voice comes 'from the player' rather than just being played everywhere, then you could watch enemy characters from a distance, and try to lip-read to figure out what they're saying.

Superpig
- saving pigs from untimely fates
- sleeps in a ham-mock at www.thebinaryrefinery.cjb.net

Richard "Superpig" Fine - saving pigs from untimely fates - Microsoft DirectX MVP 2006/2007/2008/2009
"Shaders are not meant to do everything. Of course you can try to use it for everything, but it's like playing football using cabbage." - MickeyMouse
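To make the phoneme idea in the opening post concrete, here is a minimal sketch of the step that would come after recognition: mapping a stream of phonemes onto a handful of mouth shapes (visemes). It assumes some recogniser has already produced (phoneme, duration) pairs using ARPAbet-style symbols; the table `VISEME_FOR_PHONEME`, the shape names, and the function `phonemes_to_keyframes` are all hypothetical and not taken from any particular library or game.

```python
# Hypothetical phoneme-to-viseme table. The grouping is illustrative only;
# a real game would tune it to the mouth shapes its character models have.
VISEME_FOR_PHONEME = {
    "P": "closed", "B": "closed", "M": "closed",
    "F": "lip_teeth", "V": "lip_teeth",
    "AA": "open", "AE": "open", "AH": "open",
    "IY": "wide", "EH": "wide",
    "UW": "round", "OW": "round", "W": "round",
}


def phonemes_to_keyframes(phonemes, default="neutral"):
    """Turn (phoneme, duration_seconds) pairs into (start_time, viseme) keyframes.

    Consecutive phonemes that share a mouth shape are merged into a single
    keyframe, and unknown phonemes fall back to the default shape.
    """
    keyframes = []
    t = 0.0
    for phoneme, duration in phonemes:
        viseme = VISEME_FOR_PHONEME.get(phoneme, default)
        # Only emit a keyframe when the mouth shape actually changes.
        if not keyframes or keyframes[-1][1] != viseme:
            keyframes.append((t, viseme))
        t += duration
    return keyframes
```

Because many phonemes collapse onto the same mouth shape, a recogniser that confuses, say, "P" with "B" would still drive the same animation, which is one reason accuracy may matter less here than in full speech-to-text.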

Not an entirely new technology: it's called 'lip sync', and it's been implemented in some games, such as Counter-Strike. Valve did a pretty good job with it; the mouths seem to move pretty consistently with what is being said. Now that more and more games have built-in voice comms, I think we will start to see lip sync become more and more popular.
It's already been done? Ah well... bye bye, patent pending

Seriously though, does anyone know of any libraries that will convert sound to phonemes? I don't know of any free/OS libs that do speech recognition...

Superpig
- saving pigs from untimely fates
- sleeps in a ham-mock at www.thebinaryrefinery.cjb.net

For 3D game characters you probably don't need voice recognition per se. If a character is talking, have the model loop through an animation state in which their lips are moving, and maybe throw in a few hand gestures. Unless you plan on showing speaking characters in a close-up, their lips don't need to be synced. The animation's purpose is mainly to indicate which character on screen is talking.

Hope this helps.
I didn't mean NPCs or scripted characters - I meant multiplayer games, where voice input from the other human players would be used as the input source.

Unless you meant that too, and are just suggesting that going to all the trouble of it would be a little pointless...?

Superpig
- saving pigs from untimely fates
- sleeps in a ham-mock at www.thebinaryrefinery.cjb.net

There is a free program out there that analyses speech and produces a lip-sync file you can then use to animate your characters' mouths. I think it's called Magpie.
"If you go into enough detail, everything becomes circular reasoning." - Captain Insanity
Yes, but that outputs to a file... can it be used in realtime?

Richard "Superpig" Fine - saving pigs from untimely fates - Microsoft DirectX MVP 2006/2007/2008/2009

Yes, I meant that too. In most FPS or 3D adventure games I've played, I'm never close enough to other players that I would be able to tell if they're accurately lip-synced anyway... nor do I particularly care. What I do care about is just having some sort of generic animation that indicates which player is speaking.

Unless, of course, something like what you said about reading lips from far away is vital to the design of the game. However, I think that it's currently a bit too expensive to do in realtime.
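For the cheaper route described here, a generic animation that just shows who is speaking, no recognition is needed at all: the loudness of each incoming voice frame is enough. The following is a rough sketch under stated assumptions, not how any shipped game does it. It assumes audio arrives as frames of float samples in the range -1 to 1; the names `frame_rms` and `TalkIndicator`, the threshold, and the hold time are all made up for illustration.

```python
import math


def frame_rms(samples):
    """Root-mean-square level of one frame of samples (floats in [-1, 1])."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class TalkIndicator:
    """Decides per audio frame whether a player's 'talking' animation should play.

    threshold and hold_frames are assumed tuning values: the hold keeps the
    mouth moving across the short pauses between words.
    """

    def __init__(self, threshold=0.05, hold_frames=5):
        self.threshold = threshold
        self.hold_frames = hold_frames
        self._quiet_frames = hold_frames  # start in the silent state

    def update(self, samples):
        """Feed one frame of audio; returns True while the player counts as talking."""
        if frame_rms(samples) >= self.threshold:
            self._quiet_frames = 0
        else:
            self._quiet_frames += 1
        return self._quiet_frames < self.hold_frames
```

Each remote player would get their own `TalkIndicator`, fed with that player's decoded voice frames, and the returned flag would switch the looping mouth animation on and off.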
Wow, it's like football when no one can see the ball... but everyone can see their own game. Don't you think?
